Sign and Human Action Detection Using Deep Learning

Human beings usually rely on communication to express their feelings and ideas and to solve disputes among themselves. A major component required for effective communication is language. Language can occur in different forms, including written symbols, gestures, and vocalizations. It is usually essen...


Bibliographic Details
Main Authors: Shivanarayna Dhulipala, Festus Fatai Adedoyin, Alessandro Bruno
Format: Article
Language: English
Published: MDPI AG 2022-07-01
Series: Journal of Imaging
Subjects: CNN, LSTM, confusion matrix, British sign language, precision, recall
Online Access: https://www.mdpi.com/2313-433X/8/7/192
_version_ 1827596879107457024
author Shivanarayna Dhulipala
Festus Fatai Adedoyin
Alessandro Bruno
collection DOAJ
description Human beings usually rely on communication to express their feelings and ideas and to solve disputes among themselves. A major component required for effective communication is language. Language can occur in different forms, including written symbols, gestures, and vocalizations. It is usually essential for all of the communicating parties to be fully conversant with a common language. However, to date this has not been the case between speech-impaired people who use sign language and people who use spoken languages. A number of studies have pointed out significant gaps between these two groups, which can limit the ease of communication. Therefore, this study aims to develop an efficient deep learning model that can be used to predict British Sign Language, in an attempt to narrow the communication gap between speech-impaired and non-speech-impaired people in the community. Two models were developed in this research, a CNN and an LSTM, and their performance was evaluated using a multi-class confusion matrix. The CNN model achieved the highest performance, attaining training and testing accuracies of 98.8% and 97.4%, respectively. In addition, the model achieved average weighted precision and recall of 97% and 96%, respectively. The LSTM model, on the other hand, performed poorly, with maximum training and testing accuracies of 49.4% and 48.7%, respectively. Our research concluded that the CNN model was the best for recognizing and determining British Sign Language.
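The evaluation described in the abstract (accuracy plus support-weighted precision and recall read off a multi-class confusion matrix) can be illustrated with a minimal sketch. The function name `weighted_metrics` and the 3-class toy matrix are hypothetical and chosen only for illustration; they are not the authors' code or data, and the record does not say which library the authors used.

```python
def weighted_metrics(cm):
    """Return (accuracy, weighted precision, weighted recall).

    cm[i][j] is the number of samples whose true class is i and whose
    predicted class is j. Per-class scores are averaged with weights
    equal to each class's support (its number of true samples).
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    support = [sum(cm[i]) for i in range(n)]                         # row sums
    predicted = [sum(cm[i][j] for i in range(n)) for j in range(n)]  # column sums

    # Per-class precision and recall; a class with no predictions or
    # no true samples scores 0.0 instead of dividing by zero.
    precision = [cm[k][k] / predicted[k] if predicted[k] else 0.0 for k in range(n)]
    recall = [cm[k][k] / support[k] if support[k] else 0.0 for k in range(n)]

    accuracy = sum(cm[k][k] for k in range(n)) / total
    w_precision = sum(p * s for p, s in zip(precision, support)) / total
    w_recall = sum(r * s for r, s in zip(recall, support)) / total
    return accuracy, w_precision, w_recall


# Hypothetical 3-class confusion matrix (NOT taken from the paper).
toy_cm = [
    [8, 2, 0],
    [1, 9, 0],
    [0, 1, 9],
]

accuracy, w_precision, w_recall = weighted_metrics(toy_cm)
print(f"accuracy={accuracy:.3f} precision={w_precision:.3f} recall={w_recall:.3f}")
```

Weighting by support means that classes with more test samples contribute more to the averaged scores, which matters when the sign classes are not evenly represented.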
first_indexed 2024-03-09T03:18:41Z
format Article
id doaj.art-330cb37dc198480d8cad128f28dedbc7
institution Directory Open Access Journal
issn 2313-433X
language English
last_indexed 2024-03-09T03:18:41Z
publishDate 2022-07-01
publisher MDPI AG
record_format Article
series Journal of Imaging
spelling doaj.art-330cb37dc198480d8cad128f28dedbc7
2023-12-03T15:14:26Z
eng
MDPI AG
Journal of Imaging
2313-433X
2022-07-01
Volume 8, Issue 7, Article 192
10.3390/jimaging8070192
Sign and Human Action Detection Using Deep Learning
Shivanarayna Dhulipala (0)
Festus Fatai Adedoyin (1)
Alessandro Bruno (2)
(0) Department of Computing and Informatics, Bournemouth University, Talbot Campus Poole, Poole BH12 5BB, UK
(1) Department of Computing and Informatics, Bournemouth University, Talbot Campus Poole, Poole BH12 5BB, UK
(2) Department of Biomedical Sciences, Humanitas University, Via Rita Levi Montalcini 4, Pieve Emanuele, 20072 Milan, Italy
https://www.mdpi.com/2313-433X/8/7/192
CNN
LSTM
confusion matrix
british sign language
precision
recall
title Sign and Human Action Detection Using Deep Learning
topic CNN
LSTM
confusion matrix
british sign language
precision
recall
url https://www.mdpi.com/2313-433X/8/7/192