Fine-grained ship image classification and detection based on a vision transformer and multi-grain feature vector FPN model

ABSTRACT: In naval and civilian domains, meticulous ship classification and detection are paramount. Nevertheless, predominant research has gravitated toward leveraging Convolutional Neural Network (CNN)-centered methodologies, often overlooking the diverse granularity inherent in ship samples. In our...

Bibliographic Details
Main Authors: Fengxiang Wang, Deying Yu, Liang Huang, Yalun Zhang, Yongbing Chen, Zhiguo Wang
Format: Article
Language: English
Published: Taylor & Francis Group 2024-04-01
Series: Geo-spatial Information Science
Subjects: Deep learning; image classification; ship detection; remote-sensing images; transformer
Online Access: https://www.tandfonline.com/doi/10.1080/10095020.2024.2331552
collection DOAJ
description ABSTRACT: In naval and civilian domains, meticulous ship classification and detection are paramount. Nevertheless, predominant research has gravitated toward leveraging Convolutional Neural Network (CNN)-centered methodologies, often overlooking the diverse granularity inherent in ship samples. In our pursuit to holistically extract features from ship images across varying granularities, we present a transformative architecture: the Vision Transformer and Multi-Grain Feature Vector Feature Pyramid Network (ViT-MGFV-FPN). This model synergistically melds the merits of MGFV-FPN with an augmented Vision Transformer (ViT) for comprehensive image feature extraction. To cater to the extraction of broader image features whilst sidestepping the innate quadratic complexity of traditional ViT, we unveil an enhanced version christened the Global Swin Transformer. Concurrently, the MGFV-FPN is orchestrated to harness the prowess of CNNs in distilling intricate ship attributes. Rigorous empirical evaluations underscore our model’s superiority in juxtaposition with extant CNN and transformer-based paradigms for nuanced ship categorization.
first_indexed 2024-04-24T08:33:17Z
id doaj.art-4db8620323154f98a4389a54e739e757
institution Directory of Open Access Journals
issn 1009-5020
1993-5153
last_indexed 2024-04-24T08:33:17Z
spelling doaj.art-4db8620323154f98a4389a54e739e757 2024-04-16T18:37:48Z eng Taylor & Francis Group Geo-spatial Information Science 1009-5020 1993-5153 2024-04-01 122 10.1080/10095020.2024.2331552
Fengxiang Wang, State Key Laboratory of High Performance Computing, National University of Defense Technology, Changsha, China
Deying Yu, School of Electrical Engineering, Naval University of Engineering, Wuhan, China
Liang Huang, College of Electronic Engineering, Naval University of Engineering, Wuhan, China
Yalun Zhang, Combat Command Department, People’s Liberation Army Naval Command College, Nanjing, China
Yongbing Chen, School of Electrical Engineering, Naval University of Engineering, Wuhan, China
Zhiguo Wang, Department of Operational Research and Planning, Naval University of Engineering, Wuhan, China
topic Deep learning
image classification
ship detection
remote-sensing images
transformer