A Mel-weighted Spectrogram Feature Extraction for Improved Speaker Recognition System
A speaker recognition system is intended to recognize a person’s identity. This can be done by extracting features from the voice signal using a feature extraction technique. One of the most popular feature extraction techniques is Mel-Frequency Cepstral Co...
Main Authors: | Astuti, Yenni; Hidayat, Risanuri; Bejo, Agus |
---|---|
Format: | Article |
Language: | English |
Published: | Hindawi 2022 |
Subjects: | Electrical and Electronic Engineering; Engineering |
Online Access: | https://repository.ugm.ac.id/278617/1/Astuti_TK.pdf |
_version_ | 1826050289947377664 |
---|---|
author | Astuti, Yenni; Hidayat, Risanuri; Bejo, Agus |
author_facet | Astuti, Yenni; Hidayat, Risanuri; Bejo, Agus |
author_sort | Astuti, Yenni |
collection | UGM |
description | A speaker recognition system is intended to recognize a person’s identity. This can be done by extracting features from the voice signal with a feature extraction technique. One of the most popular such techniques is the Mel-Frequency Cepstral Coefficient (MFCC), and improving on it remains a challenge. This work proposes a technique based on the MFCC: the original MFCC is modified in its filter bank stage and by adding a spectrogram. The aim is to obtain better recognition accuracy than the original MFCC by applying a technique called the Mel-weighted spectrogram. The outputs of the proposed technique are spectrogram images that contain the features of the voices. The spectrograms are then classified in a dissimilarity space based on Euclidean distance to identify the speaker. The dataset consists of 315 recorded voice signals from 3 speakers, each pronouncing five words 21 times over seven different days. Performance is evaluated by comparing the accuracy of the proposed technique, the original MFCC, and two other MFCC-based techniques. The proposed technique outperforms the three other techniques, with an accuracy of up to 88.57%. These results suggest that the Mel-weighted spectrogram can be recommended for obtaining a higher recognition rate in speaker recognition systems. |
first_indexed | 2024-03-14T00:01:33Z |
format | Article |
id | oai:generic.eprints.org:278617 |
institution | Universitas Gadjah Mada |
language | English |
last_indexed | 2024-03-14T00:01:33Z |
publishDate | 2022 |
publisher | Hindawi |
record_format | dspace |
spelling | oai:generic.eprints.org:278617 2023-11-02T01:57:58Z https://repository.ugm.ac.id/278617/ Hindawi 2022 Article PeerReviewed application/pdf en https://repository.ugm.ac.id/278617/1/Astuti_TK.pdf Astuti, Yenni and Hidayat, Risanuri and Bejo, Agus (2022) A Mel-weighted Spectrogram Feature Extraction for Improved Speaker Recognition System. International Journal of Intelligent Engineering and Systems, 15 (6). pp. 74-82. ISSN 2185-3118 https://www.hindawi.com/journals/ijis/ 10.22266/ijies2022.1231.08 |
title | A Mel-weighted Spectrogram Feature Extraction for Improved Speaker Recognition System |
topic | Electrical and Electronic Engineering; Engineering |
url | https://repository.ugm.ac.id/278617/1/Astuti_TK.pdf |
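
The description field in this record outlines a two-stage pipeline: a spectrogram weighted by a Mel filter bank, followed by classification in a dissimilarity space built from Euclidean distances. The sketch below illustrates that general idea only; it is not the authors' implementation. Everything specific in it is an assumption made for illustration: the common `2595 * log10(1 + f / 700)` Mel formula, triangular filters, the frame length, hop size and filter count, the use of a nearest-neighbour rule in the dissimilarity space, and the synthetic tones that stand in for recorded voices.

```python
import numpy as np

def hz_to_mel(f):
    # Common Mel-scale formula (assumed; the record does not state one).
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filter_bank(n_filters, n_fft, sample_rate):
    """Triangular filters evenly spaced on the Mel scale.
    Returns an array of shape (n_filters, n_fft // 2 + 1)."""
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_filters + 2)
    hz_points = mel_to_hz(mel_points)
    freqs = np.linspace(0.0, sample_rate / 2.0, n_fft // 2 + 1)
    bank = np.zeros((n_filters, freqs.size))
    for i in range(n_filters):
        left, center, right = hz_points[i], hz_points[i + 1], hz_points[i + 2]
        rising = (freqs - left) / (center - left)
        falling = (right - freqs) / (right - center)
        bank[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return bank

def mel_weighted_spectrogram(signal, sample_rate, n_fft=256, hop=128, n_filters=20):
    """Log power spectrogram weighted by the Mel filter bank.
    Returns an image-like array of shape (n_filters, n_frames)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack([signal[i * hop:i * hop + n_fft] * window for i in range(n_frames)])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    weighted = power @ mel_filter_bank(n_filters, n_fft, sample_rate).T
    return np.log(weighted + 1e-10).T

def to_dissimilarity_space(features, prototypes):
    """Represent each feature by its Euclidean distances to the prototypes."""
    return np.array([[np.linalg.norm(f - p) for p in prototypes] for f in features])

def classify(query, train_features, train_labels, prototypes):
    """Nearest neighbour in the dissimilarity space (an assumed decision rule)."""
    train_d = to_dissimilarity_space(train_features, prototypes)
    query_d = to_dissimilarity_space([query], prototypes)[0]
    nearest = int(np.argmin(np.linalg.norm(train_d - query_d, axis=1)))
    return train_labels[nearest]

def synthetic_tone(freq_hz, rng, sample_rate=8000, duration_s=0.25):
    # Hypothetical stand-in for a voice recording: a noisy sine tone.
    t = np.arange(int(sample_rate * duration_s)) / sample_rate
    return np.sin(2 * np.pi * freq_hz * t) + 0.01 * rng.standard_normal(t.size)

rng = np.random.default_rng(0)
train_features = [mel_weighted_spectrogram(synthetic_tone(f, rng), 8000)
                  for f in (200, 220, 1500, 1550)]
train_labels = ["a", "a", "b", "b"]
prototypes = [train_features[0], train_features[2]]  # one prototype per speaker

query = mel_weighted_spectrogram(synthetic_tone(210, rng), 8000)
predicted = classify(query, train_features, train_labels, prototypes)
print(predicted)
```

In this sketch all signals have the same length, so their spectrogram images have the same shape and can be compared element by element. Real recordings of differing length would need padding, cropping, or resizing before the Euclidean distance is taken; the record does not say how the paper handles this.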