An analysis of the influence of deep neural network (DNN) topology in bottleneck feature based language recognition.

Language recognition systems based on bottleneck features have recently become the state of the art in this research field, as shown by their success in the last Language Recognition Evaluation (LRE 2015) organized by NIST (U.S. National Institute of Standards and Technology). This type of system is based...


Bibliographic Details
Main Authors: Alicia Lozano-Diez, Ruben Zazo, Doroteo T Toledano, Joaquin Gonzalez-Rodriguez
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2017-01-01
Series: PLoS ONE
Online Access: http://europepmc.org/articles/PMC5552160?pdf=render
author Alicia Lozano-Diez
Ruben Zazo
Doroteo T Toledano
Joaquin Gonzalez-Rodriguez
collection DOAJ
description Language recognition systems based on bottleneck features have recently become the state of the art in this research field, as shown by their success in the last Language Recognition Evaluation (LRE 2015) organized by NIST (U.S. National Institute of Standards and Technology). This type of system is based on a deep neural network (DNN) trained to discriminate between phonetic units, i.e. trained for the task of automatic speech recognition (ASR). This DNN aims to compress information in one of its layers, known as the bottleneck (BN) layer, which is used to obtain a new frame-level representation of the audio signal. This representation has proven useful for the task of language identification (LID). Thus, bottleneck features are used as input to the language recognition system, instead of a classical parameterization of the signal based on cepstral feature vectors such as MFCCs (Mel Frequency Cepstral Coefficients). Despite the success of this approach in language recognition, there is a lack of studies that systematically analyze how the topology of the DNN influences the performance of bottleneck feature-based language recognition systems. In this work, we try to fill this gap by analyzing language recognition results with different topologies for the DNN used to extract the bottleneck features, comparing them with one another and against a reference system based on a more classical cepstral representation of the input signal with a total variability model. In this way, we obtain useful knowledge about how the DNN configuration influences the performance of bottleneck feature-based language recognition systems.
first_indexed 2024-12-14T00:30:14Z
format Article
id doaj.art-dc2eea914ea3440786006471faab3e48
institution Directory Open Access Journal
issn 1932-6203
language English
last_indexed 2024-12-14T00:30:14Z
publishDate 2017-01-01
publisher Public Library of Science (PLoS)
record_format Article
series PLoS ONE
spelling doaj.art-dc2eea914ea3440786006471faab3e48; 2022-12-21T23:24:54Z; eng; Public Library of Science (PLoS); PLoS ONE; ISSN 1932-6203; 2017-01-01; vol. 12, no. 8, e0182580; doi:10.1371/journal.pone.0182580
title An analysis of the influence of deep neural network (DNN) topology in bottleneck feature based language recognition.
url http://europepmc.org/articles/PMC5552160?pdf=render