Attention-based hierarchical pyramid feature fusion structure for efficient face recognition

Bibliographic Details
Main Authors: Yi Dai, Kai Sun, Wei Huang, Dawei Zhang, Gaojie Dai
Format: Article
Language: English
Published: Wiley 2023-06-01
Series: IET Image Processing
Online Access: https://doi.org/10.1049/ipr2.12802
Description
Summary: Deep convolutional neural networks (CNNs) have become the main method for face recognition (FR). To deploy deep CNN models on embedded and mobile devices, several lightweight FR models have been proposed. However, multi-scale facial features are seldom considered in these approaches. To overcome this limitation, an attention-based hierarchical pyramid feature fusion (AHPF) structure was proposed in this paper. Specifically, hierarchical multi-scale features were directly extracted from the backbone based on its pyramidal hierarchy, and bidirectional cross-scale connections were used to better combine high-level global features with low-level local features. In addition, instead of simple concatenation or summation, an attention-based feature fusion mechanism was used to highlight the most recognizable facial patches and to account for the unequal contributions of different scales to the output during fusion. Based on the AHPF structure and efficient backbones, lightweight FR models of multiple sizes, collectively called HSFNet, were presented. After an extensive experimental evaluation involving 10 mainstream benchmarks, the proposed models consistently achieved state-of-the-art FR performance compared to other lightweight FR models with the same level of model complexity. With only 0.659M parameters and 94.94M FLOPs, HSFNet-05-M exhibited performance competitive with recent top-ranked FR models containing up to 4M parameters and 500M FLOPs.
ISSN: 1751-9659
1751-9667
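
The record above is text-only; as a rough illustration of the attention-based fusion idea the summary describes (a learned gate that weights a high-level and a low-level pyramid feature instead of concatenating or summing them), a minimal PyTorch sketch follows. The module name AttentionFusion, the squeeze-and-excitation-style gate, the reduction ratio, and all tensor shapes are illustrative assumptions, not the authors' AHPF implementation.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class AttentionFusion(nn.Module):
        # Illustrative sketch, not the paper's AHPF module: fuses a coarse
        # high-level map with a fine low-level map via a learned per-channel
        # gate instead of plain concatenation or summation.
        def __init__(self, channels: int, reduction: int = 4):
            super().__init__()
            self.gate = nn.Sequential(
                nn.AdaptiveAvgPool2d(1),                        # global context
                nn.Conv2d(channels, channels // reduction, 1),
                nn.ReLU(inplace=True),
                nn.Conv2d(channels // reduction, channels, 1),
                nn.Sigmoid(),                                   # weights in (0, 1)
            )

        def forward(self, high: torch.Tensor, low: torch.Tensor) -> torch.Tensor:
            # Upsample the coarser map to the finer map's spatial size.
            high = F.interpolate(high, size=low.shape[-2:], mode="bilinear",
                                 align_corners=False)
            w = self.gate(high + low)          # channel-wise attention weights
            return w * high + (1.0 - w) * low  # emphasize the more useful scale

    # Toy usage with two pyramid levels of a hypothetical backbone.
    fuse = AttentionFusion(channels=64)
    low = torch.randn(1, 64, 28, 28)    # fine, low-level local features
    high = torch.randn(1, 64, 14, 14)   # coarse, high-level global features
    fused = fuse(high, low)             # shape: (1, 64, 28, 28)

Using w and its complement (1 - w) ties the two branch weights together, so the gate only has to learn which scale to emphasize per channel; the published AHPF design and the HSFNet backbones may differ from this sketch.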