Emergency sign language recognition from variant of convolutional neural network (CNN) and long short term memory (LSTM) models

Sign language is the primary communication tool used by the deaf community and people with speaking difficulties, especially during emergencies. Numerous deep learning models have been proposed to solve the sign language recognition problem. Recently, bidirectional LSTM (BLSTM) has been proposed and...

Bibliographic Details
Main Authors: Muhammad Amir As'ari, Nur Anis Jasmin Sufri, Guat Si Qi
Format: Article
Language:English
Published: Universitas Ahmad Dahlan 2024-02-01
Series:IJAIN (International Journal of Advances in Intelligent Informatics)
Subjects:sign language; bidirectional long short term memory; convolutional neural networks
Online Access:http://ijain.org/index.php/IJAIN/article/view/1170
_version_ 1797269097537339392
author Muhammad Amir As'ari
Nur Anis Jasmin Sufri
Guat Si Qi
author_facet Muhammad Amir As'ari
Nur Anis Jasmin Sufri
Guat Si Qi
author_sort Muhammad Amir As'ari
collection DOAJ
description Sign language is the primary communication tool used by the deaf community and people with speaking difficulties, especially during emergencies. Numerous deep learning models have been proposed to solve the sign language recognition problem. Recently, bidirectional LSTM (BLSTM) has been proposed as a replacement for long short-term memory (LSTM), as it may improve the learning of long-term dependencies and increase model accuracy. However, few comparisons exist of LSTM and BLSTM performance within the long-term recurrent convolutional network (LRCN) architecture for sign language interpretation. Therefore, this study focused on an in-depth analysis of the LRCN model, including 1) training the CNN from scratch and 2) modeling with the pre-trained CNNs VGG-19 and ResNet50. In addition, the ConvLSTM model, a special variant of LSTM designed for video input, was modeled and compared with the LRCN for emergency sign language recognition. Within the LRCN variants, the performance of a small CNN network was compared with pre-trained VGG-19 and ResNet50V2. A dataset of emergency Indian Sign Language with eight classes was used to train the models. The best-performing model is VGG-19 + LSTM, with a testing accuracy of 96.39%. The small LRCN networks, 5 CNN subunits + LSTM and 4 CNN subunits + BLSTM, each reached 95.18% testing accuracy, which is on par with the best model, VGG-19 + LSTM. Incorporating bidirectional LSTM (BLSTM) into deep learning models can improve their ability to capture long-term dependencies. This can enhance accuracy in reading sign language, leading to more effective communication during emergencies.
first_indexed 2024-04-25T01:42:57Z
format Article
id doaj.art-2cadde39cc834ac6992a14a456facbfb
institution Directory of Open Access Journals
issn 2442-6571
2548-3161
language English
last_indexed 2024-04-25T01:42:57Z
publishDate 2024-02-01
publisher Universitas Ahmad Dahlan
record_format Article
series IJAIN (International Journal of Advances in Intelligent Informatics)
spelling doaj.art-2cadde39cc834ac6992a14a456facbfb
2024-03-08T03:14:05Z
eng
Universitas Ahmad Dahlan
IJAIN (International Journal of Advances in Intelligent Informatics)
2442-6571
2548-3161
2024-02-01
10
1
64
78
10.26555/ijain.v10i1.1170
281
Emergency sign language recognition from variant of convolutional neural network (CNN) and long short term memory (LSTM) models
Muhammad Amir As'ari
Nur Anis Jasmin Sufri
Guat Si Qi
Dept of Biomedical Engineering and Health Sciences, Faculty of Electrical Engineering, Universiti Teknologi Malaysia, Johor Bahru, 81310, Malaysia
Department of Biomedical Engineering and Health Sciences, Faculty of Electrical Engineering, Universiti Teknologi Malaysia
Department of Biomedical Engineering and Health Sciences, Faculty of Electrical Engineering, Universiti Teknologi Malaysia
http://ijain.org/index.php/IJAIN/article/view/1170
sign language
bidirectional long short term memory
convolutional neural networks
spellingShingle Muhammad Amir As'ari
Nur Anis Jasmin Sufri
Guat Si Qi
Emergency sign language recognition from variant of convolutional neural network (CNN) and long short term memory (LSTM) models
IJAIN (International Journal of Advances in Intelligent Informatics)
sign language
bidirectional long short term memory
convolutional neural networks
title Emergency sign language recognition from variant of convolutional neural network (CNN) and long short term memory (LSTM) models
title_full Emergency sign language recognition from variant of convolutional neural network (CNN) and long short term memory (LSTM) models
title_fullStr Emergency sign language recognition from variant of convolutional neural network (CNN) and long short term memory (LSTM) models
title_full_unstemmed Emergency sign language recognition from variant of convolutional neural network (CNN) and long short term memory (LSTM) models
title_short Emergency sign language recognition from variant of convolutional neural network (CNN) and long short term memory (LSTM) models
title_sort emergency sign language recognition from variant of convolutional neural network cnn and long short term memory lstm models
topic sign language
bidirectional long short term memory
convolutional neural networks
url http://ijain.org/index.php/IJAIN/article/view/1170
work_keys_str_mv AT muhammadamirasari emergencysignlanguagerecognitionfromvariantofconvolutionalneuralnetworkcnnandlongshorttermmemorylstmmodels
AT nuranisjasminsufri emergencysignlanguagerecognitionfromvariantofconvolutionalneuralnetworkcnnandlongshorttermmemorylstmmodels
AT guatsiqi emergencysignlanguagerecognitionfromvariantofconvolutionalneuralnetworkcnnandlongshorttermmemorylstmmodels