Gender and Age Estimation Methods Based on Speech Using Deep Neural Networks

The speech signal carries a wide range of information about the speaker, such as gender, age, accent, or health state. In this paper, we explored different approaches to automatic speaker gender classification and age estimation from speech signals. We applied various Deep Neu...

Bibliographic Details
Main Authors: Damian Kwasny, Daria Hemmerling
Format: Article
Language: English
Published: MDPI AG 2021-07-01
Series: Sensors
Subjects: speech processing, neural networks, gender classification, age estimation, x-vector
Online Access: https://www.mdpi.com/1424-8220/21/14/4785
author Damian Kwasny
Daria Hemmerling
collection DOAJ
description The speech signal carries a wide range of information about the speaker, such as gender, age, accent, or health state. In this paper, we explored different approaches to automatic speaker gender classification and age estimation from speech signals. We applied various Deep Neural Network-based embedder architectures, such as x-vector and d-vector, to the age estimation and gender classification tasks. Furthermore, we applied a transfer learning-based training scheme in which the embedder network is pre-trained for a speaker recognition task on the VoxCeleb1 dataset and then fine-tuned for the joint age estimation and gender classification task. The best-performing system achieves new state-of-the-art results for age estimation on the popular TIMIT dataset, with a mean absolute error (MAE) of 5.12 years for male and 5.29 years for female speakers and a root-mean-square error (RMSE) of 7.24 and 8.12 years for male and female speakers, respectively, and an overall gender recognition accuracy of 99.60%.
first_indexed 2024-03-10T09:24:40Z
format Article
id doaj.art-5b1ca9f8d6004e619e3bbfa7d5d0adab
institution Directory Open Access Journal
issn 1424-8220
language English
last_indexed 2024-03-10T09:24:40Z
publishDate 2021-07-01
publisher MDPI AG
record_format Article
series Sensors
spelling doaj.art-5b1ca9f8d6004e619e3bbfa7d5d0adab; 2023-11-22T04:56:03Z; eng; MDPI AG; Sensors; ISSN 1424-8220; 2021-07-01; vol. 21, no. 14, art. 4785; DOI 10.3390/s21144785; Gender and Age Estimation Methods Based on Speech Using Deep Neural Networks; Damian Kwasny, Daria Hemmerling (both: Department of Measurement and Electronics, AGH University of Science and Technology, 30-059 Krakow, Poland); https://www.mdpi.com/1424-8220/21/14/4785; speech processing, neural networks, gender classification, age estimation, x-vector
title Gender and Age Estimation Methods Based on Speech Using Deep Neural Networks
topic speech processing
neural networks
gender classification
age estimation
x-vector
url https://www.mdpi.com/1424-8220/21/14/4785