The effects of face mask on speech production and its implication for forensic speaker identification-A cross-linguistic study
This study aims to understand the effects of face mask on speech production between Mandarin Chinese and English, and on the automatic classification of mask/no mask speech and individual speakers. A cross-linguistic study on mask speech between Mandarin Chinese and English was then conducted. Conti...
Main Authors: | Puyang Geng, Qimeng Lu, Hong Guo, Jinhua Zeng |
---|---|
Format: | Article |
Language: | English |
Published: | Public Library of Science (PLoS), 2023-01-01 |
Series: | PLoS ONE |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10062611/?tool=EBI |
_version_ | 1797852120374837248 |
---|---|
author | Puyang Geng Qimeng Lu Hong Guo Jinhua Zeng |
author_facet | Puyang Geng Qimeng Lu Hong Guo Jinhua Zeng |
author_sort | Puyang Geng |
collection | DOAJ |
description | This study examines the effects of wearing a face mask on speech production in Mandarin Chinese and English, and on the automatic classification of mask versus no-mask speech and of individual speakers. Continuous speech from phonetically balanced texts in Chinese and English was recorded from thirty native speakers of Mandarin Chinese (15 males and 15 females), with and without a surgical mask. Acoustic analyses showed that, for Mandarin Chinese, mask speech exhibited higher F0, intensity, and HNR, and lower jitter and shimmer, than no-mask speech, whereas for English, mask speech showed higher HNR and lower jitter and shimmer. Classification analyses based on four supervised learning algorithms (Linear Discriminant Analysis, Naïve Bayes Classifier, Random Forest, and Support Vector Machine) yielded poor performance (below 50%) in classifying speech with and without a face mask, and highly variable accuracies (ranging from 40% to 89.2%) in identifying individual speakers. These findings imply that speakers tend to make acoustic adjustments to improve their speech intelligibility when wearing a surgical mask. However, the compensation strategies differed across languages: Mandarin speech was produced with higher F0, intensity, and HNR, while English speech was produced with higher HNR. In addition, the highly variable speaker identification accuracies suggest that a surgical mask may affect the overall performance of automatic speaker recognition. Overall, wearing a surgical mask appears to affect both acoustic-phonetic and automatic speaker recognition approaches to some extent, which calls for particular caution in real-case forensic speaker identification. |
first_indexed | 2024-04-09T19:27:48Z |
format | Article |
id | doaj.art-75c4deb89fd44faa8ef29f1acb5f7245 |
institution | Directory Open Access Journal |
issn | 1932-6203 |
language | English |
last_indexed | 2024-04-09T19:27:48Z |
publishDate | 2023-01-01 |
publisher | Public Library of Science (PLoS) |
record_format | Article |
series | PLoS ONE |
title | The effects of face mask on speech production and its implication for forensic speaker identification-A cross-linguistic study |
title_sort | effects of face mask on speech production and its implication for forensic speaker identification a cross linguistic study |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10062611/?tool=EBI |
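
The description above compares jitter and shimmer between mask and no-mask speech. As an illustration only, the following minimal sketch computes both using the common "local" definition (mean absolute difference between consecutive cycles, divided by the mean). The record does not say which definition or software the authors used, so the formula choice, the function name `local_perturbation`, and the sample values are all assumptions, not the study's method or data.

```python
from statistics import mean


def local_perturbation(values):
    """Mean absolute difference between consecutive values, divided by the mean value.

    Applied to glottal cycle durations this gives local jitter; applied to
    per-cycle peak amplitudes it gives local shimmer (both as fractions).
    """
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return mean(diffs) / mean(values)


# Hypothetical cycle durations (seconds) and peak amplitudes; not from the study.
periods = [0.0100, 0.0102, 0.0099, 0.0101]
amplitudes = [0.80, 0.82, 0.79, 0.81]

jitter = local_perturbation(periods)
shimmer = local_perturbation(amplitudes)
print(f"jitter (local): {jitter:.2%}")
print(f"shimmer (local): {shimmer:.2%}")
```

Lower values indicate more regular cycle-to-cycle vibration, which is the direction the description reports for mask speech in both languages.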