Automated Subjective Assessment of Speech Intelligibility in Various Listening Modes

Bibliographic Details
Main Authors: Arkadii Mykolaiovych Prodeus, Kseniia Victorivna Bukhta, Pavlo Vadymovych Morozko, Oleksii Volodymyroych Serhiienko, Ihor Valeriiovych Kotvytskyi, Oleksandr Oleksandrovych Dvornyk
Format: Article
Language: English
Published: Igor Sikorsky Kyiv Polytechnic Institute, 2018-06-01
Series: Mìkrosistemi, Elektronìka ta Akustika
Subjects: automation of articulation tests; speech intelligibility; noise interference; reverberation interference; listening mode
Online Access: http://elc.kpi.ua/article/view/130367
author Arkadii Mykolaiovych Prodeus
Kseniia Victorivna Bukhta
Pavlo Vadymovych Morozko
Oleksii Volodymyroych Serhiienko
Ihor Valeriiovych Kotvytskyi
Oleksandr Oleksandrovych Dvornyk
collection DOAJ
description This paper presents the results of an automated subjective assessment of Ukrainian speech intelligibility. Speech monosyllables of the consonant-vowel-consonant (CVC) type were listened to in two modes: through headphones and through acoustic monitors. The assessment was carried out with specially developed software that automates articulation tests. Listening was performed for four situations: clean speech; speech distorted by noise; speech distorted by reverberation; and speech distorted by the combined effect of noise and reverberation. In the first case, listeners heard the monosyllables of 3 articulation tables, each containing 50 monosyllables. In the second case, speech distorted by additive noise was presented at signal-to-noise ratios (SNR) varied over the range -15…+10 dB; models of white, pink and brown noise, whose masking properties are well studied, were used. In the third case, reverberant speech with reverberation times in the range 0.3…2.7 s was modeled by convolving clean speech signals with room impulse responses (RIRs) of various rooms, and in the fourth case the joint action of pink noise and reverberation was considered. It turned out that the masking ability of white noise exceeds that of brown noise for SNRs below minus 5 dB, which is not entirely consistent with preliminary predictive estimates. In addition, listening to noise-distorted speech through acoustic monitors could lead to a significant increase in speech intelligibility compared with listening through headphones. Possible causes of this abnormal increase were analyzed: early reflections, the presence of two loudspeakers, binaural listening, psychophysical features of the listeners, and peculiarities of the software and of the organization of the articulation tests. After the software and some features of the articulation tests were corrected, the speech intelligibility estimates almost coincided for listening through headphones and through acoustic monitors, provided the distance between the listener and the acoustic monitors did not exceed 0.6-0.8 m. At the same time, these corrections did not change the behavior of the dependence of speech intelligibility on SNR for small (below minus 5 dB) SNR values. The general conclusion is that listening to speech signals distorted by noise and reverberation, performed with the proposed automated system of articulation tests, demonstrates the operability and high quality of the developed system. Ref. 13, fig. 7.
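The degradation procedure described in the abstract (additive noise mixed in at a prescribed SNR, and reverberant speech obtained by convolving clean speech with a room impulse response) can be illustrated with a minimal Python sketch. It is not the authors' software; the file names, the pink-noise sample, the RIR and the chosen SNR value are assumptions made purely for illustration.

# Minimal sketch of the stimulus degradation described in the abstract:
# additive noise at a target SNR, and reverberation via convolution with an RIR.
# Assumes mono WAV files; all file names and parameter values are illustrative.
import numpy as np
import soundfile as sf
from scipy.signal import fftconvolve

def add_noise_at_snr(speech, noise, snr_db):
    """Scale the noise so the mixture has the requested SNR (in dB), then add it."""
    noise = np.resize(noise, speech.shape)                 # repeat/truncate to match length
    p_speech = np.mean(speech ** 2)
    p_noise = np.mean(noise ** 2)
    gain = np.sqrt(p_speech / (p_noise * 10.0 ** (snr_db / 10.0)))
    return speech + gain * noise

def reverberate(speech, rir):
    """Model reverberant speech as the convolution of clean speech with an RIR."""
    wet = fftconvolve(speech, rir)[:len(speech)]
    return wet / max(np.max(np.abs(wet)), 1e-12)           # normalize to avoid clipping

speech, fs = sf.read("cvc_monosyllable.wav")               # hypothetical clean CVC recording
noise, _ = sf.read("pink_noise.wav")                       # hypothetical pink-noise sample
rir, _ = sf.read("rir_t60_0p9s.wav")                       # hypothetical RIR (T60 within 0.3...2.7 s)

noisy = add_noise_at_snr(speech, noise, snr_db=-5)         # SNR within the -15...+10 dB range
reverberant = reverberate(speech, rir)
noisy_reverberant = add_noise_at_snr(reverberant, noise, snr_db=-5)   # combined noise + reverberation

In articulation testing of this kind, the intelligibility estimate is typically the fraction of monosyllables reproduced correctly, here out of the 50 items in each table.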
first_indexed 2024-12-14T05:03:03Z
format Article
id doaj.art-2a334c3b731844cd9ca0f84a0eaf65ce
institution Directory Open Access Journal
issn 2523-4447
2523-4455
language English
last_indexed 2024-12-14T05:03:03Z
publishDate 2018-06-01
publisher Igor Sikorsky Kyiv Polytechnic Institute
record_format Article
series Mìkrosistemi, Elektronìka ta Akustika
spelling doaj.art-2a334c3b731844cd9ca0f84a0eaf65ce 2022-12-21T23:16:10Z
DOI: 10.20535/2523-4455.2018.23.3.130367
Author affiliation (all six authors): National Technical University of Ukraine “Igor Sikorsky Kyiv Polytechnic Institute”
title Automated Subjective Assessment of Speech Intelligibility in Various Listening Modes
topic automation of articulation tests
speech intelligibility
noise interference
reverberation interference
listening mode
url http://elc.kpi.ua/article/view/130367