Research on Speech Under Stress Based on Glottal Source Using a Physical Speech Production Model

Bibliographic Details
Main Authors: Xiao Yao, Ning Xu, Xiaofeng Liu, Aimin Jiang, Xuewu Zhang
Format: Article
Language: English
Published: IEEE 2018-01-01
Series: IEEE Access
Online Access: https://ieeexplore.ieee.org/document/8421212/
Description
Summary: Speech recognition accuracy is severely reduced by the variability caused by stress. Because speech under stress arises from physiological changes in the vocal folds, whose vibration behavior is reflected in the glottal flow, this paper presents a method for studying speech under stress based on a physical speech production model and the characteristics of glottal flow. The physical model represents speech production by modeling glottal aerodynamics in the vocal system. The relationship between physical parameters and glottal flow parameters is explored using the physical model, linking the glottal source to the physical model. By studying the glottal and physical parameters of neutral speech and of speech under stress, features characterizing the vocal folds, vortex-flow interaction, and the shape of the glottal flow are compared between the two. The relations between the proposed parameters and the mechanism of stressed-speech production are discussed. Experiments show that physical parameters representing the stiffness and viscosity of the vocal folds, subglottal pressure, and the laryngeal ventricle strongly influence the glottal flow. The relations among physical parameters, glottal parameters, and stress production are revealed, providing theoretical and experimental bases for stress detection and classification in speech recognition systems.
ISSN: 2169-3536