Local Multi-Head Channel Self-Attention for Facial Expression Recognition
Since the Transformer architecture was introduced in 2017, there have been many attempts to bring the *self-attention* paradigm into the field of computer vision. In this paper, we propose *LHC*: Local multi-Head Channel *self-attention*, a novel *self-attention* module that can be easily integrated into virtually every convolutional neural network and that is designed specifically for computer vision, with a particular focus on facial expression recognition. *LHC* is based on two main ideas: first, we argue that in computer vision the best way to leverage the *self-attention* paradigm is the channel-wise application, rather than the more widely explored spatial attention; second, a local approach has the potential to better overcome the limitations of convolution than global attention, at least in scenarios where images have a constant general structure, as in facial expression recognition. *LHC-Net* achieves a new state of the art on the FER2013 dataset, with significantly lower complexity and lower impact on the “host” architecture in terms of computational cost than the previous state of the art.
Main Authors: | Roberto Pecoraro, Valerio Basile, Viviana Bono |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2022-09-01 |
Series: | Information |
Subjects: | *self-attention*; facial expression recognition; convolutional neural networks; computer vision |
Online Access: | https://www.mdpi.com/2078-2489/13/9/419 |
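The abstract describes *LHC* as multi-head *self-attention* applied channel-wise, packaged as a drop-in module for a host convolutional network. As a rough illustration of what "channel-wise" means here, below is a minimal PyTorch sketch; it is not the authors' published LHC module (in particular, it omits the local, patch-wise component the name refers to), and the class name, projection layout, and hyper-parameters are all assumptions made for illustration.

```python
# Minimal sketch of channel-wise multi-head self-attention on a standard
# (batch, channels, height, width) CNN feature map. This is an illustrative
# approximation, NOT the paper's exact LHC module: the local/patch-wise
# part of LHC is omitted for brevity.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ChannelSelfAttention(nn.Module):
    def __init__(self, channels: int, heads: int = 4):
        super().__init__()
        assert channels % heads == 0, "channels must divide evenly across heads"
        self.heads = heads
        # 1x1 convolutions project the feature map into queries, keys, values.
        self.q = nn.Conv2d(channels, channels, kernel_size=1)
        self.k = nn.Conv2d(channels, channels, kernel_size=1)
        self.v = nn.Conv2d(channels, channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        ch = c // self.heads  # channels per head

        # Flatten the spatial grid so each channel becomes one token of
        # length h*w; attention then runs BETWEEN channels, not pixels.
        def split(t: torch.Tensor) -> torch.Tensor:
            return t.view(b, self.heads, ch, h * w)

        q, k, v = split(self.q(x)), split(self.k(x)), split(self.v(x))

        # (b, heads, ch, ch) score matrix: its size depends on the channel
        # count, not on the spatial resolution of the feature map.
        scores = q @ k.transpose(-2, -1) / (h * w) ** 0.5
        attn = F.softmax(scores, dim=-1)

        out = (attn @ v).reshape(b, c, h, w)
        return x + out  # residual connection, so the host CNN is preserved

# Usage on a hypothetical intermediate feature map of a host CNN:
feat = torch.randn(2, 64, 12, 12)
print(ChannelSelfAttention(64, heads=4)(feat).shape)  # torch.Size([2, 64, 12, 12])
```

Note that the attention matrix here is C×C per head rather than (H·W)×(H·W), which is one plausible reading of the abstract's claim that the module adds little computational cost to the host architecture.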
_version_ | 1797486882572992512 |
---|---|
author | Roberto Pecoraro, Valerio Basile, Viviana Bono |
author_facet | Roberto Pecoraro, Valerio Basile, Viviana Bono |
author_sort | Roberto Pecoraro |
collection | DOAJ |
description | Since the Transformer architecture was introduced in 2017, there have been many attempts to bring the *self-attention* paradigm into the field of computer vision. In this paper, we propose *LHC*: Local multi-Head Channel *self-attention*, a novel *self-attention* module that can be easily integrated into virtually every convolutional neural network and that is designed specifically for computer vision, with a particular focus on facial expression recognition. *LHC* is based on two main ideas: first, we argue that in computer vision the best way to leverage the *self-attention* paradigm is the channel-wise application, rather than the more widely explored spatial attention; second, a local approach has the potential to better overcome the limitations of convolution than global attention, at least in scenarios where images have a constant general structure, as in facial expression recognition. *LHC-Net* achieves a new state of the art on the FER2013 dataset, with significantly lower complexity and lower impact on the “host” architecture in terms of computational cost than the previous state of the art. |
first_indexed | 2024-03-09T23:39:40Z |
format | Article |
id | doaj.art-25867e4563574adebf8a08ae8d65c3c4 |
institution | Directory Open Access Journal |
issn | 2078-2489 |
language | English |
last_indexed | 2024-03-09T23:39:40Z |
publishDate | 2022-09-01 |
publisher | MDPI AG |
record_format | Article |
series | Information |
spelling | doaj.art-25867e4563574adebf8a08ae8d65c3c4; indexed 2023-11-23T16:53:14Z; eng; MDPI AG; Information; ISSN 2078-2489; published 2022-09-01; Vol. 13, No. 9, Art. 419; DOI 10.3390/info13090419; Local Multi-Head Channel Self-Attention for Facial Expression Recognition; Roberto Pecoraro, Valerio Basile, Viviana Bono (all: Department of Computer Science, University of Turin, C.so Svizzera 185, 10147 Turin, Italy); https://www.mdpi.com/2078-2489/13/9/419 |
spellingShingle | Roberto Pecoraro; Valerio Basile; Viviana Bono; Local Multi-Head Channel Self-Attention for Facial Expression Recognition; Information; *self-attention*; facial expression recognition; convolutional neural networks; computer vision |
title | Local Multi-Head Channel Self-Attention for Facial Expression Recognition |
title_full | Local Multi-Head Channel Self-Attention for Facial Expression Recognition |
title_fullStr | Local Multi-Head Channel Self-Attention for Facial Expression Recognition |
title_full_unstemmed | Local Multi-Head Channel Self-Attention for Facial Expression Recognition |
title_short | Local Multi-Head Channel Self-Attention for Facial Expression Recognition |
title_sort | local multi head channel self attention for facial expression recognition |
topic | *self-attention*; facial expression recognition; convolutional neural networks; computer vision |
url | https://www.mdpi.com/2078-2489/13/9/419 |
work_keys_str_mv | AT robertopecoraro localmultiheadchannelselfattentionforfacialexpressionrecognition AT valeriobasile localmultiheadchannelselfattentionforfacialexpressionrecognition AT vivianabono localmultiheadchannelselfattentionforfacialexpressionrecognition |