An Underwater Human–Robot Interaction Using a Visual–Textual Model for Autonomous Underwater Vehicles
The marine environment presents a unique set of challenges for human–robot interaction. Gestures are a common way for divers to communicate with autonomous underwater vehicles (AUVs). However, underwater gesture recognition is a challenging visual task for AUVs due to light refraction and wavelength-dependent color attenuation. Current gesture recognition methods either classify the whole image directly or locate the hand first and then classify the hand features. Among these purely visual approaches, textual information is largely ignored. This paper proposes a visual–textual model for underwater hand gesture recognition (VT-UHGR). The VT-UHGR model encodes the underwater diver's image as visual features and the category text as textual features, and generates visual–textual features through multimodal interaction. We guide AUVs to use image–text matching for learning and inference. The proposed method achieves better performance than most existing purely visual methods on the CADDY dataset, demonstrating the effectiveness of using textual patterns for underwater gesture recognition.

Main Authors: Yongji Zhang, Yu Jiang, Hong Qi, Minghao Zhao, Yuehang Wang, Kai Wang, Fenglin Wei
Affiliation: College of Computer Science and Technology, Jilin University, Changchun 130012, China (all authors)
Format: Article
Language: English
Published: MDPI AG, 2022-12-01
Series: Sensors, Vol. 23, Iss. 1, Article 197
ISSN: 1424-8220
DOI: 10.3390/s23010197
Collection: DOAJ (Directory of Open Access Journals)
Subjects: autonomous underwater vehicle; underwater human–robot interaction; gesture recognition; visual–textual association
Online Access: https://www.mdpi.com/1424-8220/23/1/197
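
The abstract describes a visual–textual matching scheme: diver images and gesture-category texts are embedded separately, fused through their similarity, and the best-matching category text gives the predicted gesture. The sketch below illustrates that general idea only; the encoders, embedding size, tokenization, and gesture labels are all placeholder assumptions and not the authors' VT-UHGR implementation, which this record does not specify.

```python
# Minimal sketch of image–text matching for gesture recognition.
# Module sizes, label names, and both encoders are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

GESTURE_LABELS = ["start", "stop", "go up", "go down"]  # hypothetical CADDY-like classes

class VisualTextualMatcher(nn.Module):
    def __init__(self, embed_dim=256, vocab_size=1000):
        super().__init__()
        # Visual encoder: a small CNN standing in for whatever backbone the paper uses.
        self.visual = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, embed_dim),
        )
        # Textual encoder: mean-pooled token embeddings as a simple stand-in.
        self.token_emb = nn.Embedding(vocab_size, embed_dim)
        self.text_proj = nn.Linear(embed_dim, embed_dim)
        self.logit_scale = nn.Parameter(torch.tensor(2.0))

    def encode_image(self, images):
        return F.normalize(self.visual(images), dim=-1)          # (B, D)

    def encode_text(self, token_ids):
        emb = self.token_emb(token_ids).mean(dim=1)               # (C, D)
        return F.normalize(self.text_proj(emb), dim=-1)

    def forward(self, images, token_ids):
        # Similarity between each image and each class-text embedding;
        # the highest-scoring category text is the predicted gesture.
        img = self.encode_image(images)
        txt = self.encode_text(token_ids)
        return self.logit_scale.exp() * img @ txt.t()             # (B, C)

# Usage: score a batch of diver images against all gesture-category texts.
model = VisualTextualMatcher()
images = torch.randn(4, 3, 224, 224)                              # dummy underwater frames
token_ids = torch.randint(0, 1000, (len(GESTURE_LABELS), 8))      # dummy tokenized labels
logits = model(images, token_ids)
pred = logits.argmax(dim=1)
print([GESTURE_LABELS[i] for i in pred.tolist()])
```

Training such a matcher typically minimizes cross-entropy over the image–text similarity logits, so inference reduces to picking the gesture text most similar to the diver image, which is the image–text matching behavior the abstract attributes to VT-UHGR.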