Harris Hawks Sparse Auto-Encoder Networks for Automatic Speech Recognition System

Automatic speech recognition (ASR) is an effective technique that can convert human speech into text format or computer actions. ASR systems are widely used in smart appliances, smart homes, and biometric systems. Signal processing and machine learning techniques are incorporated to recognize speech...

Bibliographic Details
Main Authors:	Mohammed Hasan Ali, Mustafa Musa Jaber, Sura Khalil Abd, Amjad Rehman, Mazhar Javed Awan, Daiva Vitkutė-Adžgauskienė, Robertas Damaševičius, Saeed Ali Bahaj
Format:	Article
Language:	English
Published:	MDPI AG 2022-01-01
Series:	Applied Sciences
Subjects:	automatic speech recognition; Mel-frequency cepstral coefficients; sparse auto-encoder neural network; hidden Markov model; natural language processing; speech recognition
Online Access:	https://www.mdpi.com/2076-3417/12/3/1091

_version_	1827661806353514496
author	Mohammed Hasan Ali; Mustafa Musa Jaber; Sura Khalil Abd; Amjad Rehman; Mazhar Javed Awan; Daiva Vitkutė-Adžgauskienė; Robertas Damaševičius; Saeed Ali Bahaj
author_facet	Mohammed Hasan Ali; Mustafa Musa Jaber; Sura Khalil Abd; Amjad Rehman; Mazhar Javed Awan; Daiva Vitkutė-Adžgauskienė; Robertas Damaševičius; Saeed Ali Bahaj
author_sort	Mohammed Hasan Ali
collection	DOAJ
description	Automatic speech recognition (ASR) is an effective technique that converts human speech into text or computer actions. ASR systems are widely used in smart appliances, smart homes, and biometric systems. Signal processing and machine learning techniques are combined to recognize speech. However, traditional systems perform poorly in noisy environments. In addition, accents and local differences negatively affect the ASR system’s performance when analyzing speech signals. To overcome these issues, a precise speech recognition system was developed to improve performance. This paper uses speech information from the jim-schwoebel voice datasets, processed by Mel-frequency cepstral coefficients (MFCCs). The MFCC algorithm extracts the valuable features that are used to recognize speech. Here, a sparse auto-encoder (SAE) neural network is used to classify the model, and the hidden Markov model (HMM) is used to decide on the speech recognition. The network performance is optimized by applying the Harris Hawks optimization (HHO) algorithm to fine-tune the network parameters. The fine-tuned network can effectively recognize speech in a noisy environment.
first_indexed	2024-03-10T00:16:20Z
format	Article
id	doaj.art-d4f2719cafe64a56b8afad2a07e52ede
institution	Directory Open Access Journal
issn	2076-3417
language	English
last_indexed	2024-03-10T00:16:20Z
publishDate	2022-01-01
publisher	MDPI AG
record_format	Article
series	Applied Sciences
spelling	doaj.art-d4f2719cafe64a56b8afad2a07e52ede; 2023-11-23T15:51:42Z; eng; MDPI AG; Applied Sciences; 2076-3417; 2022-01-01; 12; 3; 1091; 10.3390/app12031091; Harris Hawks Sparse Auto-Encoder Networks for Automatic Speech Recognition System; Mohammed Hasan Ali (0); Mustafa Musa Jaber (1); Sura Khalil Abd (2); Amjad Rehman (3); Mazhar Javed Awan (4); Daiva Vitkutė-Adžgauskienė (5); Robertas Damaševičius (6); Saeed Ali Bahaj (7); Computer Techniques Engineering Department, Faculty of Information Technology, Imam Ja’afar Al-sadiq University, Baghdad 10021, Iraq; Department of Computer Science, Dijlah University College, Baghdad 00964, Iraq; Department of Computer Science, Dijlah University College, Baghdad 00964, Iraq; Artificial Intelligence and Data Analytics Laboratory, College of Computer and Information Sciences (CCIS), Prince Sultan University, Riyadh 11586, Saudi Arabia; Department of Software Engineering, University of Management and Technology, Lahore 54770, Pakistan; Department of Applied Informatics, Vytautas Magnus University, 44404 Kaunas, Lithuania; Department of Applied Informatics, Vytautas Magnus University, 44404 Kaunas, Lithuania; MIS Department, College of Business Administration, Prince Sattam bin Abdulaziz University, Alkharj 11942, Saudi Arabia; Automatic speech recognition (ASR) is an effective technique that can convert human speech into text format or computer actions. ASR systems are widely used in smart appliances, smart homes, and biometric systems. Signal processing and machine learning techniques are incorporated to recognize speech. However, traditional systems have low performance due to a noisy environment. In addition to this, accents and local differences negatively affect the ASR system’s performance while analyzing speech signals. A precise speech recognition system was developed to improve the system performance to overcome these issues. This paper uses speech information from jim-schwoebel voice datasets processed by Mel-frequency cepstral coefficients (MFCCs). The MFCC algorithm extracts the valuable features that are used to recognize speech. Here, a sparse auto-encoder (SAE) neural network is used to classify the model, and the hidden Markov model (HMM) is used to decide on the speech recognition. The network performance is optimized by applying the Harris Hawks optimization (HHO) algorithm to fine-tune the network parameter. The fine-tuned network can effectively recognize speech in a noisy environment.; https://www.mdpi.com/2076-3417/12/3/1091; automatic speech recognition; Mel-frequency cepstral coefficients; sparse auto-encoder neural network; hidden Markov model; natural language processing; speech recognition
spellingShingle	Mohammed Hasan Ali Mustafa Musa Jaber Sura Khalil Abd Amjad Rehman Mazhar Javed Awan Daiva Vitkutė-Adžgauskienė Robertas Damaševičius Saeed Ali Bahaj Harris Hawks Sparse Auto-Encoder Networks for Automatic Speech Recognition System Applied Sciences automatic speech recognition Mel-frequency cepstral coefficients sparse auto-encoder neural network hidden Markov model natural language processing speech recognition
title	Harris Hawks Sparse Auto-Encoder Networks for Automatic Speech Recognition System
title_full	Harris Hawks Sparse Auto-Encoder Networks for Automatic Speech Recognition System
title_fullStr	Harris Hawks Sparse Auto-Encoder Networks for Automatic Speech Recognition System
title_full_unstemmed	Harris Hawks Sparse Auto-Encoder Networks for Automatic Speech Recognition System
title_short	Harris Hawks Sparse Auto-Encoder Networks for Automatic Speech Recognition System
title_sort	harris hawks sparse auto encoder networks for automatic speech recognition system
topic	automatic speech recognition; Mel-frequency cepstral coefficients; sparse auto-encoder neural network; hidden Markov model; natural language processing; speech recognition
url	https://www.mdpi.com/2076-3417/12/3/1091
work_keys_str_mv	AT mohammedhasanali harrishawkssparseautoencodernetworksforautomaticspeechrecognitionsystem AT mustafamusajaber harrishawkssparseautoencodernetworksforautomaticspeechrecognitionsystem AT surakhalilabd harrishawkssparseautoencodernetworksforautomaticspeechrecognitionsystem AT amjadrehman harrishawkssparseautoencodernetworksforautomaticspeechrecognitionsystem AT mazharjavedawan harrishawkssparseautoencodernetworksforautomaticspeechrecognitionsystem AT daivavitkuteadzgauskiene harrishawkssparseautoencodernetworksforautomaticspeechrecognitionsystem AT robertasdamasevicius harrishawkssparseautoencodernetworksforautomaticspeechrecognitionsystem AT saeedalibahaj harrishawkssparseautoencodernetworksforautomaticspeechrecognitionsystem

Similar Items