Facial Expression Recognition Based on Squeeze Vision Transformer

Bibliographic Details
Main Authors: Sangwon Kim, Jaeyeal Nam, Byoung Chul Ko
Format: Article
Language: English
Published: MDPI AG 2022-05-01
Series: Sensors
Subjects: facial expression recognition; vision transformer; squeeze module; visual token; landmark token
Online Access: https://www.mdpi.com/1424-8220/22/10/3729
author Sangwon Kim
Jaeyeal Nam
Byoung Chul Ko
collection DOAJ
description In recent image classification approaches, the vision transformer (ViT) has outperformed convolutional neural networks. A ViT achieves high classification accuracy on natural images because it preserves global image features well. However, a ViT still has many limitations in facial expression recognition (FER), which requires detecting subtle changes in expression, because it can lose the local features of the image. In this paper, we therefore propose Squeeze ViT, a method that lowers computational complexity by reducing the number of feature dimensions while improving FER performance by concurrently combining global and local features. To measure the FER performance of Squeeze ViT, experiments were conducted on lab-controlled FER datasets and an in-the-wild FER dataset. Comparative experiments with previous state-of-the-art approaches show that the proposed method performs well on both types of datasets.
first_indexed 2024-03-10T01:54:11Z
format Article
id doaj.art-9d0358b219a54bbc9a27b968d7de9173
institution Directory Open Access Journal
issn 1424-8220
language English
last_indexed 2024-03-10T01:54:11Z
publishDate 2022-05-01
publisher MDPI AG
record_format Article
series Sensors
spelling doaj.art-9d0358b219a54bbc9a27b968d7de9173
2023-11-23T13:00:06Z
eng
MDPI AG
Sensors
1424-8220
2022-05-01
22
10
3729
10.3390/s22103729
Facial Expression Recognition Based on Squeeze Vision Transformer
Sangwon Kim (Department of Computer Engineering, Keimyung University, Daegu 42601, Korea)
Jaeyeal Nam (Department of Computer Engineering, Keimyung University, Daegu 42601, Korea)
Byoung Chul Ko (Department of Computer Engineering, Keimyung University, Daegu 42601, Korea)
https://www.mdpi.com/1424-8220/22/10/3729
facial expression recognition
vision transformer
squeeze module
visual token
landmark token
title Facial Expression Recognition Based on Squeeze Vision Transformer
topic facial expression recognition
vision transformer
squeeze module
visual token
landmark token
url https://www.mdpi.com/1424-8220/22/10/3729