Emergency sign language recognition from variant of convolutional neural network (CNN) and long short term memory (LSTM) models
Sign language is the primary communication tool used by the deaf community and people with speaking difficulties, especially during emergencies. Numerous deep learning models have been proposed to solve the sign language recognition problem. Recently, Bidirectional LSTM (BLSTM) has been proposed and...
Main Authors: | Muhammad Amir As'ari, Nur Anis Jasmin Sufri, Guat Si Qi |
---|---|
Format: | Article |
Language: | English |
Published: | Universitas Ahmad Dahlan, 2024-02-01 |
Series: | IJAIN (International Journal of Advances in Intelligent Informatics) |
Subjects: | sign language; bidirectional long short term memory; convolutional neural networks |
Online Access: | http://ijain.org/index.php/IJAIN/article/view/1170 |
_version_ | 1797269097537339392 |
---|---|
author | Muhammad Amir As'ari; Nur Anis Jasmin Sufri; Guat Si Qi
author_facet | Muhammad Amir As'ari; Nur Anis Jasmin Sufri; Guat Si Qi
author_sort | Muhammad Amir As'ari |
collection | DOAJ |
description | Sign language is the primary communication tool used by the deaf community and people with speaking difficulties, especially during emergencies. Numerous deep learning models have been proposed to solve the sign language recognition problem. Recently, Bidirectional LSTM (BLSTM) has been proposed as a replacement for Long Short-Term Memory (LSTM), as it may improve learning of long-term dependencies and increase model accuracy. However, there has been little comparison of the performance of LSTM and BLSTM within the LRCN model architecture for sign language interpretation. Therefore, this study focused on an in-depth analysis of the LRCN model, including 1) training the CNN from scratch and 2) modeling with the pre-trained CNNs VGG-19 and ResNet50. In addition, the ConvLSTM model, a special variant of LSTM designed for video input, was also modeled and compared with the LRCN for emergency sign language recognition. Within the LRCN variants, the performance of a small CNN network was compared with pre-trained VGG-19 and ResNet50V2. A dataset of emergency Indian Sign Language with eight classes was used to train the models. The best-performing model was VGG-19 + LSTM, with a testing accuracy of 96.39%. The small LRCN networks, namely 5 CNN subunits + LSTM and 4 CNN subunits + BLSTM, achieved 95.18% testing accuracy, which is on par with the best-proposed model, VGG-19 + LSTM. Incorporating bidirectional LSTM (BLSTM) into deep learning models can improve the ability to capture long-term dependencies, enhancing accuracy in recognizing sign language and leading to more effective communication during emergencies. |
first_indexed | 2024-04-25T01:42:57Z |
format | Article |
id | doaj.art-2cadde39cc834ac6992a14a456facbfb |
institution | Directory Open Access Journal |
issn | 2442-6571; 2548-3161
language | English |
last_indexed | 2024-04-25T01:42:57Z |
publishDate | 2024-02-01 |
publisher | Universitas Ahmad Dahlan |
record_format | Article |
series | IJAIN (International Journal of Advances in Intelligent Informatics) |
spelling | doaj.art-2cadde39cc834ac6992a14a456facbfb (indexed 2024-03-08T03:14:05Z); eng; Universitas Ahmad Dahlan; IJAIN (International Journal of Advances in Intelligent Informatics); ISSN 2442-6571, 2548-3161; 2024-02-01; vol. 10, no. 1, pp. 64-78; doi:10.26555/ijain.v10i1.1170; Muhammad Amir As'ari (Department of Biomedical Engineering and Health Sciences, Faculty of Electrical Engineering, Universiti Teknologi Malaysia, Johor Bahru, 81310, Malaysia); Nur Anis Jasmin Sufri (Department of Biomedical Engineering and Health Sciences, Faculty of Electrical Engineering, Universiti Teknologi Malaysia); Guat Si Qi (Department of Biomedical Engineering and Health Sciences, Faculty of Electrical Engineering, Universiti Teknologi Malaysia) |
spellingShingle | Muhammad Amir As'ari; Nur Anis Jasmin Sufri; Guat Si Qi; Emergency sign language recognition from variant of convolutional neural network (CNN) and long short term memory (LSTM) models; IJAIN (International Journal of Advances in Intelligent Informatics); sign language; bidirectional long short term memory; convolutional neural networks |
title | Emergency sign language recognition from variant of convolutional neural network (CNN) and long short term memory (LSTM) models |
title_full | Emergency sign language recognition from variant of convolutional neural network (CNN) and long short term memory (LSTM) models |
title_fullStr | Emergency sign language recognition from variant of convolutional neural network (CNN) and long short term memory (LSTM) models |
title_full_unstemmed | Emergency sign language recognition from variant of convolutional neural network (CNN) and long short term memory (LSTM) models |
title_short | Emergency sign language recognition from variant of convolutional neural network (CNN) and long short term memory (LSTM) models |
title_sort | emergency sign language recognition from variant of convolutional neural network cnn and long short term memory lstm models |
topic | sign language; bidirectional long short term memory; convolutional neural networks |
url | http://ijain.org/index.php/IJAIN/article/view/1170 |
work_keys_str_mv | AT muhammadamirasari emergencysignlanguagerecognitionfromvariantofconvolutionalneuralnetworkcnnandlongshorttermmemorylstmmodels AT nuranisjasminsufri emergencysignlanguagerecognitionfromvariantofconvolutionalneuralnetworkcnnandlongshorttermmemorylstmmodels AT guatsiqi emergencysignlanguagerecognitionfromvariantofconvolutionalneuralnetworkcnnandlongshorttermmemorylstmmodels |
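The description above outlines two architecture families: an LRCN, in which a per-frame CNN feeds an LSTM or BLSTM, and a ConvLSTM. The following is a minimal sketch, not the authors' published code, of the best-performing variant (VGG-19 + LSTM, with an optional bidirectional wrapper for the BLSTM version). It assumes TensorFlow/Keras, an ImageNet-pretrained VGG-19, 20-frame clips of 224x224 RGB images, and a 64-unit recurrent layer; the frame count, frame size, and recurrent width are illustrative assumptions, and only the eight output classes come from the abstract.

```python
# Minimal LRCN sketch (assumed setup, not the authors' published code):
# a frozen ImageNet VGG-19 extracts per-frame features, and an LSTM or
# bidirectional LSTM (BLSTM) models the temporal order of the clip.
import tensorflow as tf
from tensorflow.keras import layers, models

NUM_FRAMES, HEIGHT, WIDTH, CHANNELS = 20, 224, 224, 3  # assumed clip shape
NUM_CLASSES = 8  # eight emergency sign classes, as in the abstract

def build_lrcn(bidirectional: bool = False) -> tf.keras.Model:
    """Pre-trained VGG-19 applied frame by frame, then (B)LSTM over time."""
    backbone = tf.keras.applications.VGG19(
        include_top=False, weights="imagenet", pooling="avg",
        input_shape=(HEIGHT, WIDTH, CHANNELS))
    backbone.trainable = False  # use VGG-19 purely as a feature extractor

    clip = layers.Input(shape=(NUM_FRAMES, HEIGHT, WIDTH, CHANNELS))
    # TimeDistributed runs the CNN independently on every frame of the clip,
    # producing one 512-dimensional feature vector per frame.
    features = layers.TimeDistributed(backbone)(clip)
    rnn = layers.LSTM(64)  # assumed width
    temporal = layers.Bidirectional(rnn)(features) if bidirectional else rnn(features)
    outputs = layers.Dense(NUM_CLASSES, activation="softmax")(temporal)
    return models.Model(clip, outputs, name="lrcn_vgg19")

# VGG-19 + LSTM (set bidirectional=True for the VGG-19 + BLSTM variant).
model = build_lrcn(bidirectional=False)
model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])
```

The ConvLSTM variant mentioned in the abstract would instead stack tf.keras.layers.ConvLSTM2D layers directly on the frame sequence, performing convolution and temporal recurrence jointly rather than separating spatial and temporal modeling as the LRCN does.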