Speech, Speaker and Speaker's Gender Identification in Automatically Processed Broadcast Stream
This paper presents a set of techniques for classifying audio segments in a system for automatic transcription of broadcast programs. The task consists of deciding a) whether the segment is to be labeled as speech or non-speech, and in the former case, b) whether the talking person is one...
Main Authors: | J. Nouza, J. Silovsky |
---|---|
Format: | Article |
Language: | English |
Published: | Spolecnost pro radioelektronicke inzenyrstvi, 2006-09-01 |
Series: | Radioengineering |
Subjects: | Speaker recognition; Gaussian mixture models; broadcast speech transcription |
Online Access: | http://www.radioeng.cz/fulltexts/2006/06_03_42_48.pdf |
_version_ | 1819144910531461120 |
---|---|
author | J. Nouza; J. Silovsky |
collection | DOAJ |
description | This paper presents a set of techniques for classifying audio segments in a system for automatic transcription of broadcast programs. The task consists of deciding a) whether a segment is to be labeled as speech or non-speech, and in the former case, b) whether the talking person is one of the speakers in the database, and if not, c) which gender the speaker belongs to. The result of the classification is used to extend the information provided by the transcription system and also to enhance the performance of the speech recognition module. Like most state-of-the-art speaker recognition systems, the proposed one is based on Gaussian Mixture Models (GMM). As the number of database speakers can be large, we introduce a technique that significantly speeds up the identification process. Furthermore, we compare several approaches to the estimation of GMM parameters. Finally, we present the results achieved in the classification of 230 minutes of real broadcast data. |
first_indexed | 2024-12-22T12:49:38Z |
format | Article |
id | doaj.art-dcc57f0babea4faab5c7e697f2f627ce |
institution | Directory Open Access Journal |
issn | 1210-2512 |
language | English |
last_indexed | 2024-12-22T12:49:38Z |
publishDate | 2006-09-01 |
publisher | Spolecnost pro radioelektronicke inzenyrstvi |
record_format | Article |
series | Radioengineering |
title | Speech, Speaker and Speaker's Gender Identification in Automatically Processed Broadcast Stream |
topic | Speaker recognition; Gaussian mixture models; broadcast speech transcription |
url | http://www.radioeng.cz/fulltexts/2006/06_03_42_48.pdf |