The effects of face mask on speech production and its implication for forensic speaker identification-A cross-linguistic study

Bibliographic Details
Main Authors: Puyang Geng, Qimeng Lu, Hong Guo, Jinhua Zeng
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2023-01-01
Series: PLoS ONE
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10062611/?tool=EBI
collection DOAJ
description This study examines the effects of wearing a face mask on speech production in Mandarin Chinese and English, and on the automatic classification of mask versus no-mask speech and of individual speakers. Continuous speech based on phonetically balanced texts in Chinese and English was recorded from thirty native speakers of Mandarin Chinese (15 males and 15 females), with and without a surgical mask. Acoustic analyses showed that, for Mandarin Chinese, mask speech had higher F0, intensity, and HNR and lower jitter and shimmer than no-mask speech, whereas for English, mask speech had higher HNR and lower jitter and shimmer. Classification analyses with four supervised learning algorithms (Linear Discriminant Analysis, Naïve Bayes Classifier, Random Forest, and Support Vector Machine) yielded poor performance (below 50%) in distinguishing speech with and without a face mask, and highly variable accuracies (40% to 89.2%) in identifying individual speakers. These findings imply that speakers tend to make acoustic adjustments to improve speech intelligibility when wearing a surgical mask. However, the compensation strategies differed across languages: Mandarin speech was produced with higher F0, intensity, and HNR, while English speech was produced with higher HNR. In addition, the highly variable speaker identification accuracies suggest that a surgical mask may affect the overall performance of automatic speaker recognition. In general, wearing a surgical mask appears to affect both acoustic-phonetic and automatic approaches to speaker recognition to some extent, which calls for particular caution in real-case forensic speaker identification.
first_indexed 2024-04-09T19:27:48Z
format Article
id doaj.art-75c4deb89fd44faa8ef29f1acb5f7245
institution Directory Open Access Journal
issn 1932-6203
language English
last_indexed 2024-04-09T19:27:48Z
publishDate 2023-01-01
publisher Public Library of Science (PLoS)
record_format Article
series PLoS ONE
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10062611/?tool=EBI