Pitch Correlogram Clustering for Fast Speaker Identification

Gaussian mixture models (GMMs) are commonly used in text-independent speaker identification systems. However, for large speaker databases, their high computational run-time limits their use in online or real-time speaker identification situations. Two-stage identification systems, in which the datab...

Bibliographic Details
Main Authors: Nitin Jhanwar, Ajay K. Raina
Format: Article
Language: English
Published: SpringerOpen 2004-12-01
Series: EURASIP Journal on Advances in Signal Processing
Subjects: speaker identification; clustering; pitch; correlogram
Online Access: http://dx.doi.org/10.1155/S1110865704408026
author Nitin Jhanwar
Ajay K. Raina
collection DOAJ
description Gaussian mixture models (GMMs) are commonly used in text-independent speaker identification systems. For large speaker databases, however, their high computational cost at run time limits their use in online or real-time speaker identification. Two-stage identification systems, in which the database is partitioned into clusters based on some proximity criterion and only a single cluster's GMM is run in each test, have been suggested in the literature to speed up the identification process. Most of the clustering algorithms used have shown limited success, apparently because the clustering and GMM feature spaces are derived from similar speech characteristics. This paper presents a new clustering approach based on the concept of a pitch correlogram, which captures a speaker's frame-to-frame pitch variations rather than short-time spectral characteristics such as cepstral coefficients and spectral slopes. The effectiveness of this two-stage identification process is demonstrated on the IVIE corpus of 110 speakers. The overall system achieves a run-time advantage of 500% as well as a 10% reduction in error in overall speaker identification.
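The two-stage idea in the description can be illustrated with a minimal Python sketch. This record does not give the paper's definition of the pitch correlogram, so the sketch assumes one: a normalized 2-D histogram of consecutive voiced pitch pairs. The bin count, the 50-400 Hz range, the Euclidean distance to cluster centroids, and the names `pitch_correlogram`, `nearest_cluster`, `identify`, and `score_fn` are all hypothetical and are not taken from the paper; `score_fn` stands in for a per-speaker GMM log-likelihood.

```python
import numpy as np

def pitch_correlogram(pitch_hz, n_bins=8, f_min=50.0, f_max=400.0):
    """Assumed definition: normalized 2-D histogram of (pitch[t], pitch[t+1])."""
    pitch = np.asarray(pitch_hz, dtype=float)
    prev, nxt = pitch[:-1], pitch[1:]
    # Keep only pairs in which both frames are voiced (pitch > 0).
    voiced = (prev > 0) & (nxt > 0)
    edges = np.linspace(f_min, f_max, n_bins + 1)
    hist, _, _ = np.histogram2d(prev[voiced], nxt[voiced], bins=[edges, edges])
    total = hist.sum()
    return hist / total if total > 0 else hist

def nearest_cluster(correlogram, centroids):
    """Stage 1: pick the cluster whose centroid correlogram is closest."""
    dists = [np.linalg.norm(correlogram - c) for c in centroids]
    return int(np.argmin(dists))

def identify(test_pitch, test_features, centroids, cluster_members, score_fn):
    """Stage 2: score only the speakers that belong to the selected cluster."""
    cluster = nearest_cluster(pitch_correlogram(test_pitch), centroids)
    candidates = cluster_members[cluster]
    # score_fn(speaker, features) is a hypothetical stand-in for a GMM score.
    return max(candidates, key=lambda spk: score_fn(spk, test_features))
```

Under these assumptions, the saving comes from stage 2: the expensive per-speaker scoring runs only over the members of one cluster instead of the whole database.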
first_indexed 2024-12-12T07:21:05Z
format Article
id doaj.art-139b72c7ad064ff7b12e2a28120c62ff
institution Directory Open Access Journal
issn 1687-6172
1687-6180
language English
last_indexed 2024-12-12T07:21:05Z
publishDate 2004-12-01
publisher SpringerOpen
record_format Article
series EURASIP Journal on Advances in Signal Processing
spelling doaj.art-139b72c7ad064ff7b12e2a28120c62ff | 2022-12-22T00:33:21Z | eng | SpringerOpen | EURASIP Journal on Advances in Signal Processing | 1687-6172 | 1687-6180 | 2004-12-01 | 2004 | 17 | 2640 | 2649 | 10.1155/S1687617204408026 | Pitch Correlogram Clustering for Fast Speaker Identification | Nitin Jhanwar | Ajay K. Raina | http://dx.doi.org/10.1155/S1110865704408026 | speaker identification | clustering | pitch | correlogram
title Pitch Correlogram Clustering for Fast Speaker Identification
topic speaker identification
clustering
pitch
correlogram
url http://dx.doi.org/10.1155/S1110865704408026