Speech, Speaker and Speaker's Gender Identification in Automatically Processed Broadcast Stream

This paper presents a set of techniques for classifying audio segments in a system for automatic transcription of broadcast programs. The task consists in deciding a) whether the segment is to be labeled as speech or non-speech, and in the former case, b) whether the talking person is one...

Bibliographic Details
Main Authors: J. Nouza, J. Silovsky
Format: Article
Language: English
Published: Spolecnost pro radioelektronicke inzenyrstvi 2006-09-01
Series: Radioengineering
Subjects: Speaker recognition; Gaussian mixture models; broadcast speech transcription
Online Access: http://www.radioeng.cz/fulltexts/2006/06_03_42_48.pdf
_version_ 1819144910531461120
author J. Nouza
J. Silovsky
author_facet J. Nouza
J. Silovsky
author_sort J. Nouza
collection DOAJ
description This paper presents a set of techniques for classifying audio segments in a system for automatic transcription of broadcast programs. The task consists in deciding a) whether the segment is to be labeled as speech or non-speech, and in the former case, b) whether the talking person is one of the speakers in the database, and if not, c) which gender the speaker belongs to. The result of the classification is used to extend the information provided by the transcription system and also to enhance the performance of the speech recognition module. Like most state-of-the-art speaker recognition systems, the proposed one is based on Gaussian Mixture Models (GMM). As the number of database speakers can be large, we introduce a technique that significantly speeds up the identification process. Furthermore, we compare several approaches to the estimation of GMM parameters. Finally, we present the results achieved in the classification of 230 minutes of real broadcast data.
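The decision cascade in the description (speech vs. non-speech, then known vs. unknown speaker, then gender) can be sketched as follows. This is a minimal illustration in Python with NumPy, not the authors' implementation: the function names, the toy one-component models, the open-set acceptance rule (best speaker score compared with the best gender-model score plus a margin) and the margin value are all assumptions made for the example. The record does not describe the paper's speed-up technique or its parameter estimation methods, so neither is shown.

```python
import numpy as np


def gmm_avg_loglik(frames, gmm):
    """Average per-frame log-likelihood of frames (T, D) under a
    diagonal-covariance GMM given as (weights (M,), means (M, D), variances (M, D))."""
    weights, means, variances = gmm
    diff = frames[:, None, :] - means[None, :, :]                      # (T, M, D)
    log_norm = -0.5 * np.log(2.0 * np.pi * variances).sum(axis=1)      # (M,)
    log_comp = log_norm - 0.5 * (diff ** 2 / variances).sum(axis=2)    # (T, M)
    log_weighted = np.log(weights) + log_comp
    # Numerically stable log-sum-exp over the mixture components.
    peak = log_weighted.max(axis=1, keepdims=True)
    frame_ll = peak[:, 0] + np.log(np.exp(log_weighted - peak).sum(axis=1))
    return float(frame_ll.mean())


def classify_segment(frames, speech, nonspeech, speakers, genders, accept_margin):
    """Return ("non-speech", None), ("speaker", name) or ("gender", label).

    ASSUMPTION: a database speaker is accepted only if the best speaker score
    beats the best gender-model score by at least accept_margin. This rule is
    hypothetical and is not taken from the paper.
    """
    # Step a) speech vs. non-speech.
    if gmm_avg_loglik(frames, nonspeech) >= gmm_avg_loglik(frames, speech):
        return ("non-speech", None)
    # Step b) best-matching database speaker.
    spk_scores = {name: gmm_avg_loglik(frames, g) for name, g in speakers.items()}
    best_spk = max(spk_scores, key=spk_scores.get)
    # Step c) gender models double as the reference for the open-set test.
    gen_scores = {label: gmm_avg_loglik(frames, g) for label, g in genders.items()}
    best_gen = max(gen_scores, key=gen_scores.get)
    if spk_scores[best_spk] - gen_scores[best_gen] >= accept_margin:
        return ("speaker", best_spk)
    return ("gender", best_gen)


def toy_gmm(mean, var):
    """Hypothetical helper: a one-component GMM with equal variance per dimension."""
    mean = np.asarray(mean, dtype=float)
    return (np.array([1.0]), mean[None, :], np.full((1, mean.size), float(var)))


# Toy 2-D models with made-up values, for illustration only.
SPEECH = toy_gmm([0.0, 0.0], 25.0)
NONSPEECH = toy_gmm([20.0, 20.0], 25.0)
SPEAKERS = {"alice": toy_gmm([2.0, 2.0], 1.0), "bob": toy_gmm([-2.0, -2.0], 1.0)}
GENDERS = {"female": toy_gmm([1.0, 1.0], 4.0), "male": toy_gmm([-1.0, -1.0], 4.0)}


def demo(point):
    """Classify a segment of 50 identical frames located at the given 2-D point."""
    frames = np.full((50, 2), float(point))
    return classify_segment(frames, SPEECH, NONSPEECH, SPEAKERS, GENDERS, accept_margin=1.0)
```

In a real system the frames would be acoustic feature vectors (for example cepstral coefficients) and each model would have many mixture components trained on labeled data; the toy models here only make the control flow easy to follow.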
first_indexed 2024-12-22T12:49:38Z
format Article
id doaj.art-dcc57f0babea4faab5c7e697f2f627ce
institution Directory Open Access Journal
issn 1210-2512
language English
last_indexed 2024-12-22T12:49:38Z
publishDate 2006-09-01
publisher Spolecnost pro radioelektronicke inzenyrstvi
record_format Article
series Radioengineering
spelling doaj.art-dcc57f0babea4faab5c7e697f2f627ce | 2022-12-21T18:25:14Z | eng | Spolecnost pro radioelektronicke inzenyrstvi | Radioengineering | 1210-2512 | 2006-09-01 | 15 | 3 | 42 | 48 | Speech, Speaker and Speaker's Gender Identification in Automatically Processed Broadcast Stream | J. Nouza | J. Silovsky | www.radioeng.cz/fulltexts/2006/06_03_42_48.pdf | Speaker recognition | Gaussian mixture models | broadcast speech transcription
title Speech, Speaker and Speaker's Gender Identification in Automatically Processed Broadcast Stream
title_sort speech speaker and speaker s gender identification in automatically processed broadcast stream
topic Speaker recognition
Gaussian mixture models
broadcast speech transcription
url http://www.radioeng.cz/fulltexts/2006/06_03_42_48.pdf