DP-ViT: A Dual-Path Vision Transformer for Real-Time Sonar Target Detection

Sonar imagery is the main way for underwater vehicles to obtain environmental information. Target detection in sonar images distinguishes multi-class targets in real time and locates them accurately, providing perception information for the decision-making system of underwater vehicles. H...

Bibliographic Details
Main Authors: Yushan Sun, Haotian Zheng, Guocheng Zhang, Jingfei Ren, Hao Xu, Chao Xu
Format: Article
Language: English
Published: MDPI AG 2022-11-01
Series: Remote Sensing
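Subjects: sonar target detection; vision transformer; transformer; convolutional neural network; AUV environment awareness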
Online Access: https://www.mdpi.com/2072-4292/14/22/5807
author Yushan Sun
Haotian Zheng
Guocheng Zhang
Jingfei Ren
Hao Xu
Chao Xu
collection DOAJ
description Sonar imagery is the main way for underwater vehicles to obtain environmental information. Target detection in sonar images distinguishes multi-class targets in real time and locates them accurately, providing perception information for the decision-making system of underwater vehicles. However, sonar image target detection faces many challenges, such as the variety of sonar types, complex and severe noise interference in images, and the scarcity of datasets. This paper proposes a sonar image target detection method based on the Dual-Path Vision Transformer network (DP-ViT) to accurately detect targets in forward-look sonar and side-scan sonar images. DP-ViT enlarges the receptive field by making the patch embedding multi-scale, enhances the feature-extraction learning ability of the model with a Dual Path Transformer Block, introduces Conv-Attention to reduce the number of trainable parameters, and finally uses Generalized Focal Loss to address the imbalance between positive and negative samples. The experimental results show that this method outperforms other mainstream methods on both the forward-look sonar dataset and the side-scan sonar dataset, and that it maintains good performance when noise is added.
first_indexed 2024-03-09T18:02:21Z
format Article
id doaj.art-c3ec73609f7f4a72ba340496f13aa5c0
institution Directory Open Access Journal
issn 2072-4292
language English
last_indexed 2024-03-09T18:02:21Z
publishDate 2022-11-01
publisher MDPI AG
record_format Article
series Remote Sensing
spelling doaj.art-c3ec73609f7f4a72ba340496f13aa5c0
2023-11-24T09:50:49Z
eng
MDPI AG
Remote Sensing
2072-4292
2022-11-01
Volume 14, Issue 22, Article 5807
DOI: 10.3390/rs14225807
DP-ViT: A Dual-Path Vision Transformer for Real-Time Sonar Target Detection
Yushan Sun (0)
Haotian Zheng (1)
Guocheng Zhang (2)
Jingfei Ren (3)
Hao Xu (4)
Chao Xu (5)
(0) Science and Technology on Underwater Vehicle Laboratory, Harbin Engineering University, Harbin 150001, China
(1) Science and Technology on Underwater Vehicle Laboratory, Harbin Engineering University, Harbin 150001, China
(2) Science and Technology on Underwater Vehicle Laboratory, Harbin Engineering University, Harbin 150001, China
(3) College of Intelligent Systems Science and Engineering, Harbin Engineering University, Harbin 150001, China
(4) Marine Design and Research Institute of China, Shanghai 200011, China
(5) College of Underwater Acoustic Engineering, Harbin Engineering University, Harbin 150001, China
https://www.mdpi.com/2072-4292/14/22/5807
sonar target detection
vision transformer
transformer
convolutional neural network
AUV environment awareness
title DP-ViT: A Dual-Path Vision Transformer for Real-Time Sonar Target Detection
topic sonar target detection
vision transformer
transformer
convolutional neural network
AUV environment awareness
url https://www.mdpi.com/2072-4292/14/22/5807