Land Cover Classification for Polarimetric SAR Images Based on Vision Transformer


Bibliographic Details
Main Authors: Hongmiao Wang, Cheng Xing, Junjun Yin, Jian Yang
Format: Article
Language: English
Published: MDPI AG, 2022-09-01
Series: Remote Sensing
Online Access: https://www.mdpi.com/2072-4292/14/18/4656
Description
Summary: Deep learning methods have been widely studied for polarimetric synthetic aperture radar (PolSAR) land cover classification, but the scarcity of labeled PolSAR samples and the small receptive field of conventional models limit their performance. In this paper, a vision Transformer (ViT)-based classification method is proposed. The ViT architecture extracts features over the global range of the image through its self-attention blocks; this strong feature representation capability acts as a flexible receptive field, making the model suitable for PolSAR image classification at different resolutions. In addition, to address the lack of labeled data, the masked autoencoder (MAE) method is used to pre-train the proposed model on unlabeled data. Experiments are carried out on the Flevoland dataset acquired by NASA/JPL AIRSAR and the Hainan dataset acquired by the Aerial Remote Sensing System of the Chinese Academy of Sciences. The results on both datasets demonstrate the superiority of the proposed method.
ISSN: 2072-4292
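The summary above describes two technical ingredients: a vision Transformer whose self-attention gives every token a global receptive field, and masked-autoencoder pre-training on unlabeled PolSAR data. Below is a minimal PyTorch sketch of the first ingredient only, a ViT-style patch classifier; the 9-channel coherency-matrix input, the 16x16 patch size, and all hyperparameters are illustrative assumptions, not the configuration reported in the paper.

```python
# Minimal ViT-style classifier sketch for PolSAR patches (PyTorch).
# Assumptions (not from the paper): 9 real-valued channels derived from the
# coherency matrix, 16x16-pixel input patches, illustrative hyperparameters.
import torch
import torch.nn as nn


class SimplePolSARViT(nn.Module):
    def __init__(self, in_channels=9, img_size=16, patch_size=4,
                 embed_dim=64, depth=4, num_heads=4, num_classes=15):
        super().__init__()
        num_patches = (img_size // patch_size) ** 2
        # Patch embedding: a strided convolution splits the input into
        # non-overlapping patches and projects each one to embed_dim.
        self.patch_embed = nn.Conv2d(in_channels, embed_dim,
                                     kernel_size=patch_size, stride=patch_size)
        self.cls_token = nn.Parameter(torch.zeros(1, 1, embed_dim))
        self.pos_embed = nn.Parameter(torch.zeros(1, num_patches + 1, embed_dim))
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=embed_dim, nhead=num_heads, dim_feedforward=4 * embed_dim,
            batch_first=True, norm_first=True)
        # Stacked self-attention layers give every token a global receptive field.
        self.encoder = nn.TransformerEncoder(encoder_layer, num_layers=depth)
        self.head = nn.Linear(embed_dim, num_classes)

    def forward(self, x):                      # x: (B, 9, 16, 16)
        x = self.patch_embed(x)                # (B, embed_dim, 4, 4)
        x = x.flatten(2).transpose(1, 2)       # (B, 16, embed_dim) token sequence
        cls = self.cls_token.expand(x.size(0), -1, -1)
        x = torch.cat([cls, x], dim=1) + self.pos_embed
        x = self.encoder(x)
        return self.head(x[:, 0])              # classify from the class token


if __name__ == "__main__":
    model = SimplePolSARViT()
    dummy = torch.randn(2, 9, 16, 16)          # two random PolSAR patches
    print(model(dummy).shape)                  # torch.Size([2, 15])
```

For the MAE-style pre-training stage mentioned in the abstract, one would additionally mask a large fraction of the patch tokens and train an encoder-decoder to reconstruct them from unlabeled imagery before fine-tuning a classifier like the one above; that stage is omitted here for brevity.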