Influence of Different Speech Representations and HMM Training Strategies on ASR Performance

This work studies the influence of various speech signal representations and speaking styles on the performance of automatic speech recognition (ASR). The efficiency of two approaches to hidden Markov model (HMM) training is compared. Common MFCC and PLP features were exposed to two sources of dist...


Bibliographic Details
Main Authors: H. Bořil, P. Fousek
Format: Article
Language: English
Published: CTU Central Library 2006-01-01
Series: Acta Polytechnica
Subjects: PLP, MFCC, Lombard effect, CLSD’05
Online Access: https://ojs.cvut.cz/ojs/index.php/ap/article/view/896
author H. Bořil
P. Fousek
collection DOAJ
description This work studies the influence of various speech signal representations and speaking styles on the performance of automatic speech recognition (ASR). The efficiency of two approaches to hidden Markov model (HMM) training is compared. Common MFCC and PLP features were exposed to two sources of disturbance applied to the original wide-band speech: (i) stress (Lombard effect) and (ii) transfer channel distortion (simulated telephone line). Subsequently, the efficiencies of the two training strategies were evaluated. Finally, a study of the optimal number of training iterations is introduced.
first_indexed 2024-12-18T04:27:32Z
format Article
id doaj.art-acdf94fa9c654e238565e3cfc0ee0a4e
institution Directory Open Access Journal
issn 1210-2709
1805-2363
language English
last_indexed 2024-12-18T04:27:32Z
publishDate 2006-01-01
publisher CTU Central Library
record_format Article
series Acta Polytechnica
topic PLP
MFCC
Lombard effect
CLSD’05
url https://ojs.cvut.cz/ojs/index.php/ap/article/view/896