Pitch Correlogram Clustering for Fast Speaker Identification
Gaussian mixture models (GMMs) are commonly used in text-independent speaker identification systems. However, for large speaker databases, their high computational run-time limits their use in online or real-time speaker identification situations. Two-stage identification systems, in which the datab...
Main Authors: | Nitin Jhanwar, Ajay K. Raina |
---|---|
Format: | Article |
Language: | English |
Published: | SpringerOpen 2004-12-01 |
Series: | EURASIP Journal on Advances in Signal Processing |
Subjects: | speaker identification, clustering, pitch, correlogram |
Online Access: | http://dx.doi.org/10.1155/S1110865704408026 |
_version_ | 1818218269491855360 |
---|---|
author | Nitin Jhanwar Ajay K. Raina |
collection | DOAJ |
description | Gaussian mixture models (GMMs) are commonly used in text-independent speaker identification systems. However, for large speaker databases, their high computational run-time limits their use in online or real-time speaker identification situations. Two-stage identification systems, in which the database is partitioned into clusters based on some proximity criteria and only a single-cluster GMM is run in every test, have been suggested in the literature to speed up the identification process. However, most clustering algorithms used have shown limited success, apparently because the clustering and GMM feature spaces used are derived from similar speech characteristics. This paper presents a new clustering approach based on the concept of a pitch correlogram that captures frame-to-frame pitch variations of a speaker rather than short-time spectral characteristics like cepstral coefficients, spectral slopes, and so forth. The effectiveness of this two-stage identification process is demonstrated on the IVIE corpus of 110 speakers. The overall system achieves a run-time advantage of 500% as well as a 10% reduction of error in overall speaker identification. |
first_indexed | 2024-12-12T07:21:05Z |
format | Article |
id | doaj.art-139b72c7ad064ff7b12e2a28120c62ff |
institution | Directory Open Access Journal |
issn | 1687-6172 1687-6180 |
language | English |
last_indexed | 2024-12-12T07:21:05Z |
publishDate | 2004-12-01 |
publisher | SpringerOpen |
record_format | Article |
series | EURASIP Journal on Advances in Signal Processing |
spelling | doaj.art-139b72c7ad064ff7b12e2a28120c62ff; 2022-12-22T00:33:21Z; eng; SpringerOpen; EURASIP Journal on Advances in Signal Processing; 1687-6172; 1687-6180; 2004-12-01; 2004; 17; 2640; 2649; 10.1155/S1687617204408026; Pitch Correlogram Clustering for Fast Speaker Identification; Nitin Jhanwar; Ajay K. Raina; http://dx.doi.org/10.1155/S1110865704408026; speaker identification; clustering; pitch; correlogram |
title | Pitch Correlogram Clustering for Fast Speaker Identification |
topic | speaker identification clustering pitch correlogram. |
url | http://dx.doi.org/10.1155/S1110865704408026 |
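
The first stage of the two-stage scheme summarized in the description can be illustrated with a small Python sketch. It is not the authors' implementation: it assumes that a pitch correlogram can be approximated by a normalized co-occurrence table of quantized pitch values in consecutive voiced frames, and that a test utterance is assigned to the cluster whose centroid is nearest in L1 distance. The function names, the 50 to 400 Hz range, the 8 bins, and the distance measure are all hypothetical choices made for illustration.

```python
from collections import Counter


def quantize_pitch(f0_hz, lo=50.0, hi=400.0, n_bins=8):
    """Map per-frame pitch values (Hz) to integer bins.

    Assumption: unvoiced frames are marked with f0 <= 0 and become None.
    """
    width = (hi - lo) / n_bins
    bins = []
    for f0 in f0_hz:
        if f0 <= 0:
            bins.append(None)
        else:
            clipped = min(max(f0, lo), hi - 1e-9)  # keep the index inside [0, n_bins)
            bins.append(int((clipped - lo) / width))
    return bins


def pitch_correlogram(bins, n_bins=8):
    """Normalized co-occurrence of pitch bins in consecutive voiced frames."""
    counts = Counter(
        (a, b) for a, b in zip(bins, bins[1:]) if a is not None and b is not None
    )
    total = sum(counts.values())
    return [
        [counts[(i, j)] / total if total else 0.0 for j in range(n_bins)]
        for i in range(n_bins)
    ]


def l1_distance(m1, m2):
    """Sum of absolute differences between two equally sized tables."""
    return sum(abs(x - y) for r1, r2 in zip(m1, m2) for x, y in zip(r1, r2))


def nearest_cluster(correlogram, centroids):
    """Stage one: index of the centroid closest to the test correlogram."""
    return min(
        range(len(centroids)),
        key=lambda k: l1_distance(correlogram, centroids[k]),
    )


# Toy data: two hypothetical cluster centroids and one test utterance.
low = pitch_correlogram(quantize_pitch([100, 105, 0, 110, 102]))
high = pitch_correlogram(quantize_pitch([300, 310, 305, 0, 295]))
test = pitch_correlogram(quantize_pitch([98, 104, 101]))
cluster = nearest_cluster(test, [low, high])
```

In a full system, the second stage would score the test utterance only against the GMMs of the speakers in the selected cluster, which is where the run-time saving described above would come from. That stage is omitted here.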