Fine-Tuning Swin Transformer and Multiple Weights Optimality-Seeking for Facial Expression Recognition


Bibliographic Details
Main Authors: Hongqi Feng, Weikai Huang, Denghui Zhang, Bangze Zhang
Format: Article
Language: English
Published: IEEE 2023-01-01
Series: IEEE Access
Online Access: https://ieeexplore.ieee.org/document/10018959/
Description
Summary: Facial expression recognition plays a key role in human-computer emotional interaction. However, human faces in real environments are affected by various unfavorable factors, which reduce expression recognition accuracy. In this paper, we propose a novel method that combines Fine-tuning Swin Transformer and Multiple Weights Optimality-seeking (FST-MWOS) to enhance expression recognition performance. FST-MWOS consists of two main components: Fine-tuning Swin Transformer (FST) and Multiple Weights Optimality-seeking (MWOS). FST takes Swin Transformer Large as the backbone network and obtains multiple groups of fine-tuned model weights for homologous data domains by varying hyperparameter configurations, data augmentation methods, and so on. In MWOS, a greedy strategy is used to mine locally optimal generalizations within the optimal epoch interval of each group of fine-tuned model weights. Optimality-seeking over the multiple groups of locally optimal weights is then used to obtain the globally optimal solution. Experimental results on the RAF-DB, FERPlus, and AffectNet datasets show that the proposed FST-MWOS method outperforms various state-of-the-art methods.
ISSN: 2169-3536
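
Illustrative sketch (not from the paper): the summary describes a two-stage search, first a greedy pass within each group of fine-tuned weights, then a second pass over the per-group results. The record does not say how candidate weights are combined, so the sketch below assumes element-wise weight averaging in the style of greedy "model soup" methods. The names `average`, `greedy_select`, `two_stage_search`, and the `evaluate` callback are hypothetical, and plain lists of floats stand in for real model weights.

```python
from typing import Callable, List, Sequence

Weights = List[float]  # toy stand-in for a model's parameters (assumption)


def average(weight_sets: Sequence[Weights]) -> Weights:
    """Element-wise mean of several weight vectors of equal length."""
    n = len(weight_sets)
    return [sum(vals) / n for vals in zip(*weight_sets)]


def greedy_select(candidates: Sequence[Weights],
                  evaluate: Callable[[Weights], float]) -> Weights:
    """Greedily average candidates, keeping one only if the score does not drop.

    `evaluate` is a hypothetical validation metric (higher is better).
    """
    # Start from the single best candidate, then try the rest in score order.
    ranked = sorted(candidates, key=evaluate, reverse=True)
    chosen = [ranked[0]]
    best_score = evaluate(ranked[0])
    for cand in ranked[1:]:
        score = evaluate(average(chosen + [cand]))
        if score >= best_score:
            chosen.append(cand)
            best_score = score
    return average(chosen)


def two_stage_search(groups: Sequence[Sequence[Weights]],
                     evaluate: Callable[[Weights], float]) -> Weights:
    """Stage 1: greedy pass inside each group. Stage 2: greedy pass across groups."""
    local_best = [greedy_select(group, evaluate) for group in groups]
    return greedy_select(local_best, evaluate)
```

Here each inner list of `groups` would hold the checkpoints from one fine-tuning run, restricted to whatever epoch interval is considered optimal; how that interval is chosen is not stated in this record and is left to the caller.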