Summary: | Abstract Background: Pharyngeal fricative is a typical compensatory articulation error in cleft palate speech. It adversely affects daily communication for people who have it. Automatic detection of pharyngeal fricatives in cleft palate speech can give clinicians and speech-language pathologists information to aid diagnosis. Results: This paper proposes two features for detecting pharyngeal fricative speech: CSIFs (correlation of signals in independent frequency bands) and OSPP (octave spectrum prominent peak). The CSIFs feature captures the distribution characteristics of frequency components in pharyngeal fricative speech, which result from the changed place of articulation and movement of the articulators. The OSPP feature reflects the degree of concentration of the prominent peak, which is closely related to the place of articulation in pharyngeal fricatives. Both features are examined in relation to the altered production process of pharyngeal fricatives. To evaluate how well these two features detect pharyngeal fricatives, we collected a speech database covering all the types of initial consonants in which pharyngeal fricatives occur. The classifier used to discriminate pharyngeal fricative speech from normal speech is based on ensemble learning. Conclusion: The detection accuracy obtained with the CSIFs and OSPP features ranges from 83.5 to 84.5% and from 85 to 87%, respectively. When the two features are combined, the detection accuracy for pharyngeal fricative speech ranges from 88 to 89%, with an AUC (area under the receiver operating characteristic curve) of 93%.
|