Multi-Label Extreme Learning Machine (MLELMs) for Bangla Regional Speech Recognition

Extensive research has been conducted in the past to determine age, gender, and words spoken in Bangla speech, but no work has been conducted to identify the regional language spoken by the speaker in Bangla speech. Hence, in this study, we create a dataset containing 30 h of Bangla speech of seven...

Bibliographic Details
Main Authors:	Prommy Sultana Hossain, Amitabha Chakrabarty, Kyuheon Kim, Md. Jalil Piran
Format:	Article
Language:	English
Published:	MDPI AG 2022-05-01
Series:	Applied Sciences
Subjects:	Bangla regional speech classification; Stacked Convolution Autoencoder (SCAE); Multi-Label Extreme Learning machine (MLELMs); Mel Frequency Energy Coefficients (MFECs)
Online Access:	https://www.mdpi.com/2076-3417/12/11/5463

author	Prommy Sultana Hossain; Amitabha Chakrabarty; Kyuheon Kim; Md. Jalil Piran
author_facet	Prommy Sultana Hossain; Amitabha Chakrabarty; Kyuheon Kim; Md. Jalil Piran
author_sort	Prommy Sultana Hossain
collection	DOAJ
description	Extensive research has been conducted to determine age, gender, and spoken words in Bangla speech, but no work has identified the regional language spoken by the speaker. Hence, in this study, we create a dataset containing 30 h of Bangla speech in seven regional Bangla dialects, with the goal of detecting synthesized Bangla speech and categorizing it. To categorize the regional language spoken by the speaker and determine the authenticity of the speech, the proposed model combines a Stacked Convolutional Autoencoder (SCAE) with a sequence of Multi-Label Extreme Learning Machines (MLELMs). The SCAE creates a detailed feature map by identifying the salient spatial and temporal qualities of the MFEC input data. The feature map is then sent to the MLELM networks, which generate soft labels and then hard labels. Because aging produces physiological changes in the brain that alter the processing of aural information, the model takes age class into account when generating dialect class labels, which increases classification accuracy from 85% without age class consideration to 95% with it. The classification accuracy for synthesized Bangla speech labels is 95%. The proposed methodology also works well with English-language audio sets.
first_indexed	2024-03-10T01:31:06Z
format	Article
id	doaj.art-e261f25a60514cd9996bdef4d6b9cc18
institution	Directory of Open Access Journals
issn	2076-3417
language	English
last_indexed	2024-03-10T01:31:06Z
publishDate	2022-05-01
publisher	MDPI AG
record_format	Article
series	Applied Sciences
spelling	doaj.art-e261f25a60514cd9996bdef4d6b9cc18; 2023-11-23T13:42:08Z; eng; MDPI AG; Applied Sciences; 2076-3417; 2022-05-01; vol. 12, no. 11, art. 5463; 10.3390/app12115463; Multi-Label Extreme Learning Machine (MLELMs) for Bangla Regional Speech Recognition; Prommy Sultana Hossain (Department of Computer Science and Engineering, Brac University, Dhaka 1212, Bangladesh); Amitabha Chakrabarty (Department of Computer Science and Engineering, Brac University, Dhaka 1212, Bangladesh); Kyuheon Kim (Media Laboratory, Kyung Hee University, Yong-in 17104, Korea); Md. Jalil Piran (Department of Computer Science and Engineering, Sejong University, Seoul 05006, Korea)
title	Multi-Label Extreme Learning Machine (MLELMs) for Bangla Regional Speech Recognition
topic	Bangla regional speech classification; Stacked Convolution Autoencoder (SCAE); Multi-Label Extreme Learning machine (MLELMs); Mel Frequency Energy Coefficients (MFECs)
url	https://www.mdpi.com/2076-3417/12/11/5463
