Whispered Speech Conversion Based on the Inversion of Mel Frequency Cepstral Coefficient Features
A conversion method based on the inversion of Mel frequency cepstral coefficient (MFCC) features was proposed to convert whispered speech into normal speech. First, the MFCC features of whispered speech and normal speech were extracted and a matching relation between the MFCC feature parameters of w...
Main Authors: | Qiang Zhu, Zhong Wang, Yunfeng Dou, Jian Zhou |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2022-02-01 |
Series: | Algorithms |
Subjects: | whispered speech conversion, MFCC feature inversion, Gaussian mixture model, cepstral distortion |
Online Access: | https://www.mdpi.com/1999-4893/15/2/68 |
_version_ | 1797483510129229824 |
---|---|
author | Qiang Zhu Zhong Wang Yunfeng Dou Jian Zhou |
author_facet | Qiang Zhu Zhong Wang Yunfeng Dou Jian Zhou |
author_sort | Qiang Zhu |
collection | DOAJ |
description | A conversion method based on the inversion of Mel frequency cepstral coefficient (MFCC) features was proposed to convert whispered speech into normal speech. First, the MFCC features of whispered speech and normal speech were extracted and a matching relation between the MFCC feature parameters of whispered speech and normal speech was developed through the Gaussian mixture model (GMM). Then, the MFCC feature parameters of normal speech corresponding to whispered speech were obtained based on the GMM and, finally, whispered speech was converted into normal speech through the inversion of MFCC features. The experimental results showed that the cepstral distortion (CD) of the normal speech converted by the proposed method was 21% less than that of the normal speech converted by the linear predictive coefficient (LPC) features, the mean opinion score (MOS) was 3.56, and a satisfactory outcome in both intelligibility and sound quality was achieved. |
first_indexed | 2024-03-09T22:48:04Z |
format | Article |
id | doaj.art-0b75b1d0d98342de857df9bfd72bf7fa |
institution | Directory of Open Access Journals |
issn | 1999-4893 |
language | English |
last_indexed | 2024-03-09T22:48:04Z |
publishDate | 2022-02-01 |
publisher | MDPI AG |
record_format | Article |
series | Algorithms |
spelling | doaj.art-0b75b1d0d98342de857df9bfd72bf7fa; 2023-11-23T18:24:31Z; eng; MDPI AG; Algorithms; 1999-4893; 2022-02-01; 15; 2; 68; 10.3390/a15020068; Whispered Speech Conversion Based on the Inversion of Mel Frequency Cepstral Coefficient Features; Qiang Zhu (0), Zhong Wang (1), Yunfeng Dou (2), Jian Zhou (3); (0) School of Computer Science and Technology, Hefei Normal University, Hefei 230601, China; (1) School of Computer Science and Technology, Hefei Normal University, Hefei 230601, China; (2) School of Computer Science and Technology, Anhui University, Hefei 230601, China; (3) School of Computer Science and Technology, Anhui University, Hefei 230601, China; A conversion method based on the inversion of Mel frequency cepstral coefficient (MFCC) features was proposed to convert whispered speech into normal speech. First, the MFCC features of whispered speech and normal speech were extracted and a matching relation between the MFCC feature parameters of whispered speech and normal speech was developed through the Gaussian mixture model (GMM). Then, the MFCC feature parameters of normal speech corresponding to whispered speech were obtained based on the GMM and, finally, whispered speech was converted into normal speech through the inversion of MFCC features. The experimental results showed that the cepstral distortion (CD) of the normal speech converted by the proposed method was 21% less than that of the normal speech converted by the linear predictive coefficient (LPC) features, the mean opinion score (MOS) was 3.56, and a satisfactory outcome in both intelligibility and sound quality was achieved.; https://www.mdpi.com/1999-4893/15/2/68; whispered speech conversion; MFCC feature inversion; Gaussian mixture model; cepstral distortion |
spellingShingle | Qiang Zhu Zhong Wang Yunfeng Dou Jian Zhou Whispered Speech Conversion Based on the Inversion of Mel Frequency Cepstral Coefficient Features Algorithms whispered speech conversion MFCC feature inversion Gaussian mixture model cepstral distortion |
title | Whispered Speech Conversion Based on the Inversion of Mel Frequency Cepstral Coefficient Features |
topic | whispered speech conversion MFCC feature inversion Gaussian mixture model cepstral distortion |
url | https://www.mdpi.com/1999-4893/15/2/68 |
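
The cepstral distortion (CD) figure quoted in the description can be illustrated with a short sketch. This record does not state the article's exact CD definition, so the code assumes the commonly used per-frame form CD = (10 / ln 10) · sqrt(2 · Σ_d (c_d − c'_d)²) in dB, averaged over frames, with the zeroth coefficient excluded by default. The names `cepstral_distortion`, `relative_reduction`, and `skip_c0` are illustrative and not taken from the article, and the two sequences are assumed to be time-aligned already (for example by dynamic time warping).

```python
import math


def cepstral_distortion(ref_frames, conv_frames, skip_c0=True):
    """Mean cepstral distortion in dB between two aligned cepstral sequences.

    ref_frames, conv_frames: lists of frames, each frame a list of cepstral
    coefficients (e.g. MFCCs). Assumed form, not confirmed by the record:
    CD = (10 / ln 10) * sqrt(2 * sum_d (c_d - c'_d)^2), averaged over frames.
    """
    if len(ref_frames) != len(conv_frames):
        raise ValueError("sequences must be time-aligned to the same length")
    if not ref_frames:
        raise ValueError("at least one frame is required")

    start = 1 if skip_c0 else 0          # c0 mostly carries frame energy
    scale = 10.0 / math.log(10.0)        # converts natural-log units to dB
    total = 0.0
    for ref, conv in zip(ref_frames, conv_frames):
        if len(ref) != len(conv):
            raise ValueError("frames must have the same number of coefficients")
        squared_error = sum((r - c) ** 2 for r, c in zip(ref[start:], conv[start:]))
        total += scale * math.sqrt(2.0 * squared_error)
    return total / len(ref_frames)


def relative_reduction(baseline_cd, proposed_cd):
    """Fractional reduction of proposed_cd relative to baseline_cd."""
    return (baseline_cd - proposed_cd) / baseline_cd
```

With made-up values, a baseline CD of 5.0 dB and a proposed-method CD of 3.95 dB give `relative_reduction(5.0, 3.95)` of about 0.21, which is how a "21% less" comparison such as the one in the description would be computed; the actual CD values are not given in this record.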