Harris Hawks Sparse Auto-Encoder Networks for Automatic Speech Recognition System

Automatic speech recognition (ASR) is an effective technique that converts human speech into text or computer actions. ASR systems are widely used in smart appliances, smart homes, and biometric systems. Signal processing and machine learning techniques are combined to recognize speech...


Bibliographic Details
Main Authors: Mohammed Hasan Ali, Mustafa Musa Jaber, Sura Khalil Abd, Amjad Rehman, Mazhar Javed Awan, Daiva Vitkutė-Adžgauskienė, Robertas Damaševičius, Saeed Ali Bahaj
Format: Article
Language: English
Published: MDPI AG 2022-01-01
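Series: Applied Sciences
Subjects: automatic speech recognition; Mel-frequency cepstral coefficients; sparse auto-encoder neural network; hidden Markov model; natural language processing; speech recognition
Online Access: https://www.mdpi.com/2076-3417/12/3/1091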
author Mohammed Hasan Ali
Mustafa Musa Jaber
Sura Khalil Abd
Amjad Rehman
Mazhar Javed Awan
Daiva Vitkutė-Adžgauskienė
Robertas Damaševičius
Saeed Ali Bahaj
collection DOAJ
description Automatic speech recognition (ASR) is an effective technique that converts human speech into text or computer actions. ASR systems are widely used in smart appliances, smart homes, and biometric systems. Signal processing and machine learning techniques are combined to recognize speech. However, traditional systems perform poorly in noisy environments. In addition, accents and regional differences degrade the ASR system’s performance when analyzing speech signals. To overcome these issues, a precise speech recognition system was developed. This paper uses speech data from the jim-schwoebel voice datasets, processed into Mel-frequency cepstral coefficients (MFCCs). The MFCC algorithm extracts the features used to recognize speech. A sparse auto-encoder (SAE) neural network is used for classification, and a hidden Markov model (HMM) is used to make the speech recognition decision. Network performance is optimized by applying the Harris Hawks optimization (HHO) algorithm to fine-tune the network parameters. The fine-tuned network can effectively recognize speech in a noisy environment.
first_indexed 2024-03-10T00:16:20Z
format Article
id doaj.art-d4f2719cafe64a56b8afad2a07e52ede
institution Directory Open Access Journal
issn 2076-3417
language English
last_indexed 2024-03-10T00:16:20Z
publishDate 2022-01-01
publisher MDPI AG
record_format Article
series Applied Sciences
spelling doaj.art-d4f2719cafe64a56b8afad2a07e52ede
2023-11-23T15:51:42Z
eng
MDPI AG
Applied Sciences
2076-3417
2022-01-01
volume 12, issue 3, article 1091
10.3390/app12031091
Harris Hawks Sparse Auto-Encoder Networks for Automatic Speech Recognition System
Mohammed Hasan Ali: Computer Techniques Engineering Department, Faculty of Information Technology, Imam Ja’afar Al-sadiq University, Baghdad 10021, Iraq
Mustafa Musa Jaber: Department of Computer Science, Dijlah University College, Baghdad 00964, Iraq
Sura Khalil Abd: Department of Computer Science, Dijlah University College, Baghdad 00964, Iraq
Amjad Rehman: Artificial Intelligence and Data Analytics Laboratory, College of Computer and Information Sciences (CCIS), Prince Sultan University, Riyadh 11586, Saudi Arabia
Mazhar Javed Awan: Department of Software Engineering, University of Management and Technology, Lahore 54770, Pakistan
Daiva Vitkutė-Adžgauskienė: Department of Applied Informatics, Vytautas Magnus University, 44404 Kaunas, Lithuania
Robertas Damaševičius: Department of Applied Informatics, Vytautas Magnus University, 44404 Kaunas, Lithuania
Saeed Ali Bahaj: MIS Department, College of Business Administration, Prince Sattam bin Abdulaziz University, Alkharj 11942, Saudi Arabia
title Harris Hawks Sparse Auto-Encoder Networks for Automatic Speech Recognition System
topic automatic speech recognition
Mel-frequency cepstral coefficients
sparse auto-encoder neural network
hidden Markov model
natural language processing
speech recognition
url https://www.mdpi.com/2076-3417/12/3/1091