Multi-resolution speech analysis for automatic speech recognition using deep neural networks: Experiments on TIMIT.
Speech Analysis for Automatic Speech Recognition (ASR) systems typically starts with a Short-Time Fourier Transform (STFT) that implies selecting a fixed point in the time-frequency resolution trade-off. This approach, combined with a Mel-frequency scaled filterbank and a Discrete Cosine Transform g...
Main Authors: | , , |
---|---|
Format: | Article |
Language: | English |
Published: |
Public Library of Science (PLoS)
2018-01-01
|
Series: | PLoS ONE |
Online Access: | http://europepmc.org/articles/PMC6179252?pdf=render |