Fine-grained ship image classification and detection based on a vision transformer and multi-grain feature vector FPN model
ABSTRACT: In naval and civilian domains, meticulous ship classification and detection are paramount. Nevertheless, predominant research has gravitated toward leveraging Convolutional Neural Network (CNN)-centered methodologies, often overlooking the diverse granularity inherent in ship samples. In our...
Main Authors: | Fengxiang Wang, Deying Yu, Liang Huang, Yalun Zhang, Yongbing Chen, Zhiguo Wang |
---|---|
Format: | Article |
Language: | English |
Published: | Taylor & Francis Group, 2024-04-01 |
Series: | Geo-spatial Information Science |
Subjects: | Deep learning; image classification; ship detection; remote-sensing images; transformer |
Online Access: | https://www.tandfonline.com/doi/10.1080/10095020.2024.2331552 |
_version_ | 1797204316225798144 |
---|---|
author | Fengxiang Wang; Deying Yu; Liang Huang; Yalun Zhang; Yongbing Chen; Zhiguo Wang |
collection | DOAJ |
description | ABSTRACT: In naval and civilian domains, meticulous ship classification and detection are paramount. Nevertheless, predominant research has gravitated toward leveraging Convolutional Neural Network (CNN)-centered methodologies, often overlooking the diverse granularity inherent in ship samples. In our pursuit to holistically extract features from ship images across varying granularities, we present a transformative architecture: the Vision Transformer and Multi-Grain Feature Vector Feature Pyramid Network (ViT-MGFV-FPN). This model synergistically melds the merits of MGFV-FPN with an augmented Vision Transformer (ViT) for a comprehensive image feature extraction. To cater to the extraction of broader image features whilst sidestepping the innate quadratic complexity of traditional ViT, we unveil an enhanced version christened the Global Swin Transformer. Concurrently, the MGFV-FPN is orchestrated to harness the prowess of CNNs in distilling intricate ship attributes. Rigorous empirical evaluations underscore our model’s superiority in juxtaposition with extant CNN and transformer-based paradigms for nuanced ship categorization. |
first_indexed | 2024-04-24T08:33:17Z |
format | Article |
id | doaj.art-4db8620323154f98a4389a54e739e757 |
institution | Directory Open Access Journal |
issn | 1009-5020 1993-5153 |
language | English |
last_indexed | 2024-04-24T08:33:17Z |
publishDate | 2024-04-01 |
publisher | Taylor & Francis Group |
record_format | Article |
series | Geo-spatial Information Science |
spelling | doaj.art-4db8620323154f98a4389a54e739e757; 2024-04-16T18:37:48Z; eng; Taylor & Francis Group; Geo-spatial Information Science; 1009-5020; 1993-5153; 2024-04-01; 122; 10.1080/10095020.2024.2331552; Fine-grained ship image classification and detection based on a vision transformer and multi-grain feature vector FPN model; Fengxiang Wang [0]; Deying Yu [1]; Liang Huang [2]; Yalun Zhang [3]; Yongbing Chen [4]; Zhiguo Wang [5]; [0] State Key Laboratory of High Performance Computing, National University of Defense Technology, Changsha, China; [1] School of Electrical Engineering, Naval University of Engineering, Wuhan, China; [2] College of Electronic Engineering, Naval University of Engineering, Wuhan, China; [3] Combat Command Department, People’s Liberation Army Naval Command College, Nanjing, China; [4] School of Electrical Engineering, Naval University of Engineering, Wuhan, China; [5] Department of Operational Research and Planning, Naval University of Engineering, Wuhan, China; ABSTRACT: In naval and civilian domains, meticulous ship classification and detection are paramount. Nevertheless, predominant research has gravitated toward leveraging Convolutional Neural Network (CNN)-centered methodologies, often overlooking the diverse granularity inherent in ship samples. In our pursuit to holistically extract features from ship images across varying granularities, we present a transformative architecture: the Vision Transformer and Multi-Grain Feature Vector Feature Pyramid Network (ViT-MGFV-FPN). This model synergistically melds the merits of MGFV-FPN with an augmented Vision Transformer (ViT) for a comprehensive image feature extraction. To cater to the extraction of broader image features whilst sidestepping the innate quadratic complexity of traditional ViT, we unveil an enhanced version christened the Global Swin Transformer. Concurrently, the MGFV-FPN is orchestrated to harness the prowess of CNNs in distilling intricate ship attributes. Rigorous empirical evaluations underscore our model’s superiority in juxtaposition with extant CNN and transformer-based paradigms for nuanced ship categorization.; https://www.tandfonline.com/doi/10.1080/10095020.2024.2331552; Deep learning; image classification; ship detection; remote-sensing images; transformer |
title | Fine-grained ship image classification and detection based on a vision transformer and multi-grain feature vector FPN model |
topic | Deep learning; image classification; ship detection; remote-sensing images; transformer |
url | https://www.tandfonline.com/doi/10.1080/10095020.2024.2331552 |