Speech feature extraction using linear Chirplet transform and its applications*

ABSTRACT: Most speech processing models begin with feature extraction and then pass the feature vector to the primary processing model. The solution's performance mainly depends on the quality of the feature representation and the model architecture. Much research focuses on designing robust deep...

Bibliographic Details
Main Authors:	Hao Duc Do, Duc Thanh Chau, Son Thai Tran
Format:	Article
Language:	English
Published:	Taylor & Francis Group 2023-07-01
Series:	Journal of Information and Telecommunication
Subjects:	speech representation; time-frequency domain; linear chirplet transform; instantaneous frequency; speech processing
Online Access:	https://www.tandfonline.com/doi/10.1080/24751839.2023.2207267

collection	DOAJ
description	ABSTRACT: Most speech processing models begin with feature extraction and then pass the feature vector to the primary processing model. The solution's performance depends mainly on the quality of the feature representation and the model architecture. In the deep neural network era, much research focuses on designing robust deep network architectures while overlooking the important role of feature representation. This work explores a new approach to designing a speech signal representation in the time-frequency domain via the Linear Chirplet Transform (LCT). The proposed method provides a feature vector that is sensitive to frequency changes within human speech and rests on a solid mathematical foundation, which makes it a promising direction for many applications. The experimental results show that the LCT-based feature improves on MFCC and Fourier Transform features. In speaker gender recognition, dialect recognition, and speech recognition alike, LCT performed significantly better than MFCC and other features. This result also implies that the LCT-based feature is independent of language, so it can be used in various applications.
id	doaj.art-993f6de1fc2046089139142a26c22c57
institution	Directory of Open Access Journals
issn	2475-1839 2475-1847
volume	7
issue	3
pages	376-391
doi	10.1080/24751839.2023.2207267
affiliation	Vietnam National University, Ho Chi Minh City, Vietnam (all three authors)
