Exploiting the Tail Data for Long-Tailed Face Recognition

Long-tailed distributions are common in large-scale face datasets and pose challenges for learning discriminative features in face recognition. Although a few works have conducted preliminary research on this problem, the value of the tail data is still underestimated. This paper addresses the long...


Bibliographic Details
Main Authors: Song Guo, Rujie Liu, Mengjiao Wang, Meng Zhang, Shijie Nie, Septiana Lina, Narishige Abe
Format: Article
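Language: English
Published: IEEE 2022-01-01
Series: IEEE Access
Subjects: Face recognition; convolutional neural network; long-tailed distribution; margin softmax loss; data augmentation
Online Access: https://ieeexplore.ieee.org/document/9887937/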
collection DOAJ
description Long-tailed distributions are common in large-scale face datasets and pose challenges for learning discriminative features in face recognition. Although a few works have conducted preliminary research on this problem, the value of the tail data is still underestimated. This paper addresses the long-tailed problem from the perspective of maximally exploiting the tail data. We propose a Joint Alternating Training (JAT) framework to learn discriminative features from both the long-tailed data and the tail data using an alternating training strategy. JAT consists of two branches: 1) the long-tailed data branch learns universal discriminative information from the whole long-tailed data with instance-balanced sampling; 2) the tail data branch exploits the discriminative information in the tail data with class-balanced sampling. To compensate for the insufficient samples and lack of intra-class variation, we apply data augmentation (DA) to the tail data. We further propose margin-based mixup (MarginMix) for data augmentation, which can deal with the nonlinearity of the margin-based softmax loss and stabilize the training process in mixup. Furthermore, we obtain the best combination of strategies (i.e., JAT+DA+MarginMix) for long-tailed face recognition, which can maximally exploit the discriminative information in the tail data while retaining the universal discrimination learned from the long-tailed data. Extensive experiments on 8 face datasets demonstrate that the proposed methods and combination of strategies can effectively address the long-tailed problem in face recognition.
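The two sampling schemes named in the description can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state how tail classes are selected or how the two branches are scheduled, so the `tail_max_count` threshold, the strict one-to-one alternation, and all function names below are assumptions made only for illustration.

```python
import random
from collections import defaultdict


def instance_balanced_batch(labels, batch_size, rng):
    """Draw sample indices uniformly, so head classes dominate the batch."""
    return [rng.randrange(len(labels)) for _ in range(batch_size)]


def class_balanced_batch(indices_by_class, batch_size, rng):
    """Draw a class uniformly first, then a sample uniformly within that class."""
    classes = sorted(indices_by_class)
    return [rng.choice(indices_by_class[rng.choice(classes)])
            for _ in range(batch_size)]


def alternating_batches(labels, tail_max_count, batch_size, steps, seed=0):
    """Yield (branch_name, sample_indices) pairs, alternating between branches.

    ASSUMPTION: a class counts as "tail" when it has at most `tail_max_count`
    samples, and the branches alternate every step. Neither detail comes
    from the abstract.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    tail = {c: idxs for c, idxs in by_class.items()
            if len(idxs) <= tail_max_count}
    for step in range(steps):
        if step % 2 == 0 or not tail:
            # Long-tailed branch: the whole dataset, instance-balanced.
            yield "long_tailed", instance_balanced_batch(labels, batch_size, rng)
        else:
            # Tail branch: tail classes only, class-balanced.
            yield "tail", class_balanced_batch(tail, batch_size, rng)
```

In a training loop, each yielded batch would be fed to the corresponding branch. Data augmentation and MarginMix would apply only to the "tail" batches and are not sketched here because the abstract does not specify them.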
first_indexed 2024-12-10T13:42:18Z
format Article
id doaj.art-4d6ecbf621b344348230953d92f16260
institution Directory Open Access Journal
issn 2169-3536
language English
last_indexed 2024-12-10T13:42:18Z
publishDate 2022-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj.art-4d6ecbf621b344348230953d92f16260; 2022-12-22T01:46:37Z; eng; IEEE; IEEE Access; ISSN 2169-3536; 2022-01-01; vol. 10; pp. 97945-97953; DOI 10.1109/ACCESS.2022.3206040; article no. 9887937
Exploiting the Tail Data for Long-Tailed Face Recognition
Song Guo (https://orcid.org/0000-0002-4505-4059), Fujitsu Research and Development Center Company Ltd., Beijing, China
Rujie Liu, Fujitsu Research and Development Center Company Ltd., Beijing, China
Mengjiao Wang, Fujitsu Research and Development Center Company Ltd., Beijing, China
Meng Zhang (https://orcid.org/0000-0003-1465-9714), Fujitsu Research and Development Center Company Ltd., Beijing, China
Shijie Nie, Fujitsu Research and Development Center Company Ltd., Beijing, China
Septiana Lina, Fujitsu Laboratories Ltd., Kawasaki, Japan
Narishige Abe, Fujitsu Laboratories Ltd., Kawasaki, Japan
title Exploiting the Tail Data for Long-Tailed Face Recognition
topic Face recognition
convolutional neural network
long-tailed distribution
margin softmax loss
data augmentation
url https://ieeexplore.ieee.org/document/9887937/