Multi-resolution speech analysis for automatic speech recognition using deep neural networks: Experiments on TIMIT.

Speech Analysis for Automatic Speech Recognition (ASR) systems typically starts with a Short-Time Fourier Transform (STFT) that implies selecting a fixed point in the time-frequency resolution trade-off. This approach, combined with a Mel-frequency scaled filterbank and a Discrete Cosine Transform g...

Bibliographic Details
Main Authors: Doroteo T Toledano, María Pilar Fernández-Gallego, Alicia Lozano-Diez
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2018-01-01
Series: PLoS ONE
Online Access: http://europepmc.org/articles/PMC6179252?pdf=render