Facial Expression Recognition Based on Squeeze Vision Transformer
In recent image classification approaches, a vision transformer (ViT) has shown excellent performance beyond that of a convolutional neural network. A ViT achieves high classification performance on natural images because it properly preserves the global image features. Conversely, a ViT still has many li...
Main Authors: | Sangwon Kim, Jaeyeal Nam, Byoung Chul Ko |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2022-05-01 |
Series: | Sensors |
Subjects: | facial expression recognition; vision transformer; squeeze module; visual token; landmark token |
Online Access: | https://www.mdpi.com/1424-8220/22/10/3729 |
_version_ | 1797495756593037312 |
---|---|
author | Sangwon Kim; Jaeyeal Nam; Byoung Chul Ko |
collection | DOAJ |
description | In recent image classification approaches, a vision transformer (ViT) has shown excellent performance beyond that of a convolutional neural network. A ViT achieves high classification performance on natural images because it properly preserves the global image features. However, a ViT still has many limitations in facial expression recognition (FER), which requires the detection of subtle changes in expression, because it can lose the local features of the image. Therefore, in this paper, we propose Squeeze ViT, a method that reduces computational complexity by reducing the number of feature dimensions while improving FER performance by concurrently combining global and local features. To measure the FER performance of Squeeze ViT, experiments were conducted on lab-controlled FER datasets and a wild FER dataset. Through comparative experiments with previous state-of-the-art approaches, we show that the proposed method achieves excellent performance on both types of datasets. |
first_indexed | 2024-03-10T01:54:11Z |
format | Article |
id | doaj.art-9d0358b219a54bbc9a27b968d7de9173 |
institution | Directory Open Access Journal |
issn | 1424-8220 |
language | English |
last_indexed | 2024-03-10T01:54:11Z |
publishDate | 2022-05-01 |
publisher | MDPI AG |
record_format | Article |
series | Sensors |
spelling | doaj.art-9d0358b219a54bbc9a27b968d7de9173; 2023-11-23T13:00:06Z; eng; MDPI AG; Sensors; 1424-8220; 2022-05-01; vol. 22, no. 10, art. 3729; 10.3390/s22103729; Facial Expression Recognition Based on Squeeze Vision Transformer; Sangwon Kim, Jaeyeal Nam, Byoung Chul Ko (all: Department of Computer Engineering, Keimyung University, Daegu 42601, Korea); https://www.mdpi.com/1424-8220/22/10/3729; facial expression recognition; vision transformer; squeeze module; visual token; landmark token |
title | Facial Expression Recognition Based on Squeeze Vision Transformer |
topic | facial expression recognition; vision transformer; squeeze module; visual token; landmark token |
url | https://www.mdpi.com/1424-8220/22/10/3729 |