An Efficient Two-Stream Network for Isolated Sign Language Recognition Using Accumulative Video Motion

Sign language is the primary communication medium for persons with hearing impairments. This language depends mainly on hand articulations accompanied by nonmanual gestures. Recently, there has been growing interest in sign language recognition. In this paper, we propose a trainable deep learning network for isolated sign language recognition that can effectively capture spatiotemporal information using a small number of sign frames. We propose a hierarchical sign learning module that comprises three networks: a dynamic motion network (DMN), an accumulative motion network (AMN), and a sign recognition network (SRN). Additionally, we propose a technique to extract key postures for handling the variations in sign samples performed by different signers. The DMN stream uses these key postures to learn the spatiotemporal information pertaining to the signs. We also propose a novel technique to represent the static and dynamic information of sign gestures in a single frame. This approach preserves the spatial and temporal information of the sign by fusing the sign's key postures in the forward and backward directions to generate an accumulative video motion frame. This frame is used as input to the AMN stream, and the extracted features are fused with the DMN features and fed into the SRN for the learning and classification of signs. The proposed approach is efficient for isolated sign language recognition, especially for recognizing static signs. We evaluated this approach on the KArSL-190 and KArSL-502 Arabic sign language datasets; the results on KArSL-190 outperformed other techniques by 15% in the signer-independent mode. Additionally, the proposed approach outperformed state-of-the-art techniques on the Argentinian sign language dataset LSA64. The code is available at https://github.com/Hamzah-Luqman/SLR_AMN.
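A minimal sketch of the accumulative video motion idea described in the abstract, assuming simple weighted frame averaging: the abstract does not specify the fusion operator, so the function name, weighting scheme, and frame format below are illustrative assumptions rather than the paper's exact method (the actual implementation is in the repository linked above).

    # Illustrative sketch only: fuses a sign's key postures in the forward and
    # backward directions into one frame, as the abstract describes. The weighted
    # averaging is an assumption, not the paper's exact fusion operator.
    import numpy as np

    def accumulative_motion_frame(key_postures):
        # key_postures: sequence of H x W x 3 uint8 frames (extracted key postures)
        frames = np.asarray(key_postures, dtype=np.float32)
        n = frames.shape[0]
        # Weight later frames more heavily so each pass accumulates motion over time.
        weights = np.linspace(1.0, 2.0, n)[:, None, None, None]
        forward = (frames * weights).sum(axis=0) / weights.sum()
        backward = (frames[::-1] * weights).sum(axis=0) / weights.sum()
        fused = 0.5 * (forward + backward)  # single frame fed to the AMN stream
        return np.clip(fused, 0, 255).astype(np.uint8)

Averaging both directions keeps early and late postures visible in the single output frame, which is the property the abstract attributes to the accumulative video motion representation.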


Bibliographic Details
Main Author: Hamzah Luqman (Information and Computer Science Department, College of Computing and Mathematics, King Fahd University of Petroleum & Minerals, Dhahran, Saudi Arabia; ORCID: https://orcid.org/0000-0001-7944-5093)
Format: Article
Language: English
Published: IEEE, 2022-01-01
Series: IEEE Access, vol. 10, pp. 93785-93798
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2022.3204110
Subjects: Sign language recognition; Arabic sign language; Argentinian sign language; KArSL; LSA64; gesture recognition
Online Access: https://ieeexplore.ieee.org/document/9875269/