Transformer Meets Remote Sensing Video Detection and Tracking: A Comprehensive Survey
Transformers have shown excellent performance in the remote sensing field thanks to their long-range modeling capabilities. Remote sensing video (RSV) moving object detection and tracking play indispensable roles in military activities as well as urban monitoring. However, transformers in these fields are still at t...
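Main Authors: | Licheng Jiao, Xin Zhang, Xu Liu, Fang Liu, Shuyuan Yang, Wenping Ma, Lingling Li, Puhua Chen, Zhixi Feng, Yuwei Guo, Xu Tang, Biao Hou, Xiangrong Zhang, Jing Bai, Dou Quan, Junpeng Zhang |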
---|---|
Format: | Article |
Language: | English |
Published: | IEEE 2023-01-01 |
Series: | IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing |
Subjects: | Remote sensing (RS); transformer; video signal processing |
Online Access: | https://ieeexplore.ieee.org/document/10163641/ |
_version_ | 1827362049610481664 |
---|---|
author | Licheng Jiao; Xin Zhang; Xu Liu; Fang Liu; Shuyuan Yang; Wenping Ma; Lingling Li; Puhua Chen; Zhixi Feng; Yuwei Guo; Xu Tang; Biao Hou; Xiangrong Zhang; Jing Bai; Dou Quan; Junpeng Zhang |
author_facet | Licheng Jiao; Xin Zhang; Xu Liu; Fang Liu; Shuyuan Yang; Wenping Ma; Lingling Li; Puhua Chen; Zhixi Feng; Yuwei Guo; Xu Tang; Biao Hou; Xiangrong Zhang; Jing Bai; Dou Quan; Junpeng Zhang |
author_sort | Licheng Jiao |
collection | DOAJ |
description | Transformers have shown excellent performance in the remote sensing field thanks to their long-range modeling capabilities. Remote sensing video (RSV) moving object detection and tracking play indispensable roles in military activities as well as urban monitoring. However, transformers in these fields are still at the exploratory stage. In this survey, we comprehensively summarize the research prospects of transformers in RSV moving object detection and tracking. The core designs of remote sensing transformers and advanced transformers are analyzed first. This analysis mainly covers the evolution of attention mechanisms for specific tasks, the design of input mappings for fitting ability, diverse feature representations, model optimization, etc. The architectural characteristics of RSV detection and tracking are then described from two aspects. One is moving object detection, covering traditional motion-based background subtraction and appearance-based deep learning models. The other is object tracking for single and multiple targets. The main research difficulties include the blurred foreground in RSV data, irregular object movement in traditional background subtraction, and severe object occlusion in object tracking. Following that, the potential significance of transformers is discussed in light of several thorny problems in RSV. Finally, we summarize ten open challenges for transformers in RSV, which may serve as a reference for promoting future research. |
first_indexed | 2024-03-08T07:19:22Z |
format | Article |
id | doaj.art-6c32c77b7fdb454e8ac081b0ca96d781 |
institution | Directory of Open Access Journals |
issn | 2151-1535 |
language | English |
last_indexed | 2024-03-08T07:19:22Z |
publishDate | 2023-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing |
spelling | doaj.art-6c32c77b7fdb454e8ac081b0ca96d781; 2024-02-03T00:01:16Z; eng; IEEE; IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing; ISSN 2151-1535; 2023-01-01; vol. 16, pp. 1-45; DOI 10.1109/JSTARS.2023.3289293; document 10163641; Transformer Meets Remote Sensing Video Detection and Tracking: A Comprehensive Survey; Authors: Licheng Jiao (https://orcid.org/0000-0003-3354-9617), Xin Zhang (https://orcid.org/0000-0002-0296-0393), Xu Liu (https://orcid.org/0000-0002-8780-5455), Fang Liu (https://orcid.org/0000-0002-5669-9354), Shuyuan Yang (https://orcid.org/0000-0002-4796-5737), Wenping Ma (https://orcid.org/0000-0001-8872-2195), Lingling Li (https://orcid.org/0000-0002-6130-2518), Puhua Chen (https://orcid.org/0000-0001-5472-1426), Zhixi Feng (https://orcid.org/0000-0002-7372-9180), Yuwei Guo (https://orcid.org/0000-0002-6095-8830), Xu Tang (https://orcid.org/0000-0003-1375-0778), Biao Hou (https://orcid.org/0000-0002-1996-186X), Xiangrong Zhang (https://orcid.org/0000-0003-0379-2042), Jing Bai (https://orcid.org/0000-0001-5412-7793), Dou Quan (https://orcid.org/0000-0001-6943-4657), Junpeng Zhang (https://orcid.org/0000-0001-8068-6767); Affiliation (all authors): Key Laboratory of Intelligent Perception and Image Understanding of the Ministry of Education of China, International Research Center of Intelligent Perception and Computation, School of Artificial Intelligence, Xidian University, Xi'an, China; https://ieeexplore.ieee.org/document/10163641/; Keywords: Remote sensing (RS); transformer; video signal processing |
title | Transformer Meets Remote Sensing Video Detection and Tracking: A Comprehensive Survey |
title_full | Transformer Meets Remote Sensing Video Detection and Tracking: A Comprehensive Survey |
title_fullStr | Transformer Meets Remote Sensing Video Detection and Tracking: A Comprehensive Survey |
title_full_unstemmed | Transformer Meets Remote Sensing Video Detection and Tracking: A Comprehensive Survey |
title_short | Transformer Meets Remote Sensing Video Detection and Tracking: A Comprehensive Survey |
title_sort | transformer meets remote sensing video detection and tracking a comprehensive survey |
topic | Remote sensing (RS); transformer; video signal processing |
url | https://ieeexplore.ieee.org/document/10163641/ |