Speech, Speaker and Speaker's Gender Identification in Automatically Processed Broadcast Stream

This paper presents a set of techniques for classifying audio segments in a system for automatic transcription of broadcast programs. The task consists of deciding a) whether a segment is to be labeled as speech or non-speech, and in the former case, b) whether the talking person is one...

Bibliographic Details
Main Authors:	J. Nouza, J. Silovsky
Format:	Article
Language:	English
Published:	Spolecnost pro radioelektronicke inzenyrstvi 2006-09-01
Series:	Radioengineering
Subjects:	Speaker recognition; Gaussian mixture models; broadcast speech transcription
Online Access:	http://www.radioeng.cz/fulltexts/2006/06_03_42_48.pdf

_version_	1819144910531461120
author	J. Nouza, J. Silovsky
author_facet	J. Nouza, J. Silovsky
author_sort	J. Nouza
collection	DOAJ
description	This paper presents a set of techniques for classifying audio segments in a system for automatic transcription of broadcast programs. The task consists of deciding a) whether a segment is to be labeled as speech or non-speech, and in the former case, b) whether the talking person is one of the speakers in the database, and if not, c) which gender the speaker belongs to. The result of the classification is used to extend the information provided by the transcription system and also to enhance the performance of the speech recognition module. Like most state-of-the-art speaker recognition systems, the proposed one is based on Gaussian Mixture Models (GMM). As the number of database speakers can be large, we introduce a technique that significantly speeds up the identification process. Furthermore, we compare several approaches to the estimation of GMM parameters. Finally, we present the results achieved in the classification of 230 minutes of real broadcast data.
first_indexed	2024-12-22T12:49:38Z
format	Article
id	doaj.art-dcc57f0babea4faab5c7e697f2f627ce
institution	Directory Open Access Journal
issn	1210-2512
language	English
last_indexed	2024-12-22T12:49:38Z
publishDate	2006-09-01
publisher	Spolecnost pro radioelektronicke inzenyrstvi
record_format	Article
series	Radioengineering
spelling	doaj.art-dcc57f0babea4faab5c7e697f2f627ce | 2022-12-21T18:25:14Z | eng | Spolecnost pro radioelektronicke inzenyrstvi | Radioengineering | 1210-2512 | 2006-09-01 | 15 | 3 | 42 | 48 | Speech, Speaker and Speaker's Gender Identification in Automatically Processed Broadcast Stream | J. Nouza | J. Silovsky | www.radioeng.cz/fulltexts/2006/06_03_42_48.pdf | Speaker recognition | Gaussian mixture models | broadcast speech transcription
title	Speech, Speaker and Speaker's Gender Identification in Automatically Processed Broadcast Stream
topic	Speaker recognition; Gaussian mixture models; broadcast speech transcription
url	http://www.radioeng.cz/fulltexts/2006/06_03_42_48.pdf

Similar Items