The effects of face mask on speech production and its implication for forensic speaker identification-A cross-linguistic study

This study aims to understand the effects of face masks on speech production in Mandarin Chinese and English, and on the automatic classification of mask/no-mask speech and of individual speakers. A cross-linguistic study of mask speech in Mandarin Chinese and English was conducted. Conti...

Bibliographic Details
Main Authors:	Puyang Geng, Qimeng Lu, Hong Guo, Jinhua Zeng
Format:	Article
Language:	English
Published:	Public Library of Science (PLoS) 2023-01-01
Series:	PLoS ONE
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10062611/?tool=EBI

_version_	1797852120374837248
author	Puyang Geng Qimeng Lu Hong Guo Jinhua Zeng
author_facet	Puyang Geng Qimeng Lu Hong Guo Jinhua Zeng
author_sort	Puyang Geng
collection	DOAJ
description	This study aims to understand the effects of face masks on speech production in Mandarin Chinese and English, and on the automatic classification of mask/no-mask speech and of individual speakers. A cross-linguistic study of mask speech in Mandarin Chinese and English was conducted. Continuous speech based on phonetically balanced texts in Chinese and English versions was recorded from thirty native speakers of Mandarin Chinese (15 males and 15 females), with and without a surgical mask. Acoustic analyses showed that, for Mandarin Chinese, mask speech exhibited higher F0, intensity, and HNR and lower jitter and shimmer than no-mask speech, whereas for English, mask speech exhibited higher HNR and lower jitter and shimmer. Classification analyses based on four supervised learning algorithms (Linear Discriminant Analysis, Naïve Bayes Classifier, Random Forest, and Support Vector Machine) yielded poor performance (lower than 50%) in classifying speech with and without a face mask, and highly variable accuracies (ranging from 40% to 89.2%) in identifying individual speakers. These findings imply that speakers tend to make acoustic adjustments to improve their speech intelligibility when wearing a surgical mask. However, a cross-linguistic difference in the strategies used to compensate for intelligibility was observed: Mandarin speech was produced with higher F0, intensity, and HNR, while English speech was produced with higher HNR. In addition, the highly variable speaker identification accuracies suggest that a surgical mask may affect the overall performance of automatic speaker recognition. In general, therefore, wearing a surgical mask appears to affect both acoustic-phonetic and automatic speaker recognition approaches to some extent, which calls for particular caution in real-case forensic speaker identification.
first_indexed	2024-04-09T19:27:48Z
format	Article
id	doaj.art-75c4deb89fd44faa8ef29f1acb5f7245
institution	Directory Open Access Journal
issn	1932-6203
language	English
last_indexed	2024-04-09T19:27:48Z
publishDate	2023-01-01
publisher	Public Library of Science (PLoS)
record_format	Article
series	PLoS ONE
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10062611/?tool=EBI

Similar Items