CCTNet: Coupled CNN and Transformer Network for Crop Segmentation of Remote Sensing Images

Semantic segmentation of remote sensing images is an efficient method for agricultural crop classification. Recent crop segmentation solutions are mainly deep-learning-based and follow two mainstream architectures: Convolutional Neural Networks (CNNs) and the Transformer. However, neither architecture alone is well suited to the crop segmentation task, for three reasons. First, ultra-high-resolution images must be cut into small patches before processing, which breaks the structure of category edges. Second, because global information is lacking, regions inside a crop field may be misclassified. Third, the patches must be spliced back together to restore the complete image, causing edge artifacts as well as small misclassified objects and holes. We therefore propose a novel architecture, the Coupled CNN and Transformer Network (CCTNet), which combines the local details (e.g., edge and texture) captured by the CNN with the global context captured by the Transformer to address these problems. In particular, two modules, the Light Adaptive Fusion Module (LAFM) and the Coupled Attention Fusion Module (CAFM), are designed to fuse these complementary strengths efficiently. In addition, three effective inference-stage methods, the Overlapping Sliding Window (OSW), Test-Time Augmentation (TTA), and Post-Processing (PP), are proposed to remove small objects and holes and restore complete images. Experimental results on the Barley Remote Sensing Dataset show that CCTNet outperforms single CNN or Transformer methods, achieving a mean Intersection over Union (mIoU) score of 72.97%. The proposed CCTNet is therefore a competitive method for crop segmentation of remote sensing images.
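
The abstract describes an inference-stage pipeline of Overlapping Sliding Window (OSW) tiling, Test-Time Augmentation (TTA), and Post-Processing (PP) that removes small misclassified objects and holes, evaluated by mean Intersection over Union (mIoU). The Python sketch below illustrates these generic steps only; it is not the authors' implementation, and the model callable, patch size, stride, and minimum-area threshold are illustrative assumptions.

# Minimal sketch (not the authors' code) of OSW tiling, flip-based TTA,
# small-object/hole removal, and mIoU scoring as described in the abstract.
# Assumes image is an H x W x C array no smaller than the patch size and
# that model(tile) returns per-class scores of shape (num_classes, h, w).
import numpy as np
from scipy import ndimage

def sliding_window_inference(model, image, num_classes, patch=512, stride=256):
    h, w = image.shape[:2]
    scores = np.zeros((num_classes, h, w), dtype=np.float32)
    counts = np.zeros((h, w), dtype=np.float32)
    # Overlapping window origins; the final position is clamped to the border.
    tops = sorted(set(list(range(0, h - patch + 1, stride)) + [h - patch]))
    lefts = sorted(set(list(range(0, w - patch + 1, stride)) + [w - patch]))
    for top in tops:
        for left in lefts:
            tile = image[top:top + patch, left:left + patch]
            # Flip-based TTA: average scores over a horizontal flip.
            pred = model(tile) + np.flip(model(np.flip(tile, axis=1)), axis=2)
            scores[:, top:top + patch, left:left + patch] += pred
            counts[top:top + patch, left:left + patch] += 1.0
    # Averaging overlapping tiles suppresses patch-border artifacts.
    return np.argmax(scores / np.maximum(counts, 1.0), axis=0)

def remove_small_regions(label_map, min_area=256):
    # Reassign connected components smaller than min_area (small objects and
    # holes) to the most common label on their one-pixel boundary ring.
    out = label_map.copy()
    for cls in np.unique(label_map):
        components, n = ndimage.label(label_map == cls)
        for i in range(1, n + 1):
            mask = components == i
            if mask.sum() < min_area:
                ring = ndimage.binary_dilation(mask) & (label_map != cls)
                if ring.any():
                    out[mask] = np.bincount(label_map[ring]).argmax()
    return out

def mean_iou(pred, target, num_classes):
    # Mean Intersection over Union over the classes present in the union.
    ious = []
    for c in range(num_classes):
        inter = np.logical_and(pred == c, target == c).sum()
        union = np.logical_or(pred == c, target == c).sum()
        if union > 0:
            ious.append(inter / union)
    return float(np.mean(ious))

A larger overlap (smaller stride) and additional flips or rotations in the TTA step trade inference time for smoother tile boundaries; the 512/256 values above are placeholders, not the paper's settings.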

Bibliographic Details
Main Authors: Hong Wang, Xianzhong Chen, Tianxiang Zhang, Zhiyong Xu, Jiangyun Li
Format: Article
Language: English
Published: MDPI AG, 2022-04-01
Series: Remote Sensing
Subjects: semantic segmentation; agricultural research; remote sensing; deep learning; CNN; Transformer
Online Access: https://www.mdpi.com/2072-4292/14/9/1956
ISSN: 2072-4292
DOI: 10.3390/rs14091956
Citation: Remote Sensing, 14(9), 1956 (2022)
Author Affiliations: School of Automation and Electrical Engineering, University of Science and Technology Beijing, Beijing 100083, China (all authors)