A Spectral–Spatial Transformer Fusion Method for Hyperspectral Video Tracking
Hyperspectral videos (HSVs) record richer detail cues than conventional videos thanks to their abundant spectral information. Although traditional correlation filter (CF) methods that exploit spectral information locally achieve promising results, their performance is limited because they ignore global information. This paper proposes a joint spectral–spatial method for hyperspectral video tracking, the spectral–spatial transformer-based feature fusion tracker (SSTFT), which exploits spectral–spatial features while modeling global interactions. Specifically, the feature extraction module employs two parallel branches to extract multi-level coarse-grained and fine-grained spectral–spatial features, which are fused with adaptive weights. The extracted features are then combined in a context fusion module built on a transformer with hyperspectral self-attention (HSA) and hyperspectral cross-attention (HCA), designed to capture self-context and cross-context feature interactions, respectively. In addition, an adaptive dynamic template updating strategy updates the template bounding box according to the prediction score. Extensive experiments on benchmark hyperspectral video tracking datasets demonstrate that SSTFT outperforms state-of-the-art methods in both precision and speed.
Main Authors: Ye Wang, Yuheng Liu, Mingyang Ma, Shaohui Mei
Affiliation: School of Electronics and Information, Northwestern Polytechnical University, Xi’an 710129, China (all four authors)
Format: Article
Language: English
Published: MDPI AG, 2023-03-01
Series: Remote Sensing, Vol. 15, No. 7, Article 1735
DOI: 10.3390/rs15071735
ISSN: 2072-4292
Subjects: transformer fusion; spectral–spatial joint; hyperspectral object tracking
Collection: Directory of Open Access Journals (record doaj.art-149b68b3674947a2a9c27a16e7597a25)
Online Access: https://www.mdpi.com/2072-4292/15/7/1735
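Note: The abstract above outlines an architecture: two-branch spectral–spatial feature extraction with adaptive-weight fusion, a transformer-based context fusion module with hyperspectral self-attention (HSA) and hyperspectral cross-attention (HCA), and score-based template updating. The record contains no implementation details, so the following is only a minimal illustrative sketch of the cross-attention fusion idea, not the authors' code; the choice of PyTorch, the class name `CrossAttentionFusion`, the token shapes, and all hyperparameters are assumptions.

```python
# Illustrative sketch (not the paper's implementation): cross-attention fusion
# between template and search-region feature tokens, loosely following the
# self-/cross-context interactions described in the abstract. All names,
# shapes, and hyperparameters are assumed for illustration.
import torch
import torch.nn as nn


class CrossAttentionFusion(nn.Module):
    """Fuses search-region tokens with template tokens via attention."""

    def __init__(self, dim: int = 256, num_heads: int = 8):
        super().__init__()
        # Self-attention within the search tokens ("self-context" interaction).
        self.self_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        # Cross-attention from search tokens to template tokens
        # ("cross-context" interaction).
        self.cross_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm1 = nn.LayerNorm(dim)
        self.norm2 = nn.LayerNorm(dim)

    def forward(self, search: torch.Tensor, template: torch.Tensor) -> torch.Tensor:
        # search:   (B, N_s, dim) tokens from the search region
        # template: (B, N_t, dim) tokens from the template
        s, _ = self.self_attn(search, search, search)
        search = self.norm1(search + s)  # residual connection + layer norm
        c, _ = self.cross_attn(search, template, template)
        return self.norm2(search + c)    # search tokens enriched with template context


if __name__ == "__main__":
    fusion = CrossAttentionFusion()
    search = torch.randn(2, 400, 256)      # e.g., 20x20 grid of search tokens
    template = torch.randn(2, 64, 256)     # e.g., 8x8 grid of template tokens
    print(fusion(search, template).shape)  # torch.Size([2, 400, 256])
```

In this arrangement every search-region token can aggregate information from the entire template, which is the kind of global interaction the abstract contrasts with purely local correlation-filter tracking.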