CCTNet: Coupled CNN and Transformer Network for Crop Segmentation of Remote Sensing Images
Semantic segmentation using remote sensing images is an efficient method for agricultural crop classification. Recent solutions for crop segmentation are mainly deep-learning-based methods built on two mainstream architectures: Convolutional Neural Networks (CNNs) and Transformers. However, these...
Main Authors: | Hong Wang, Xianzhong Chen, Tianxiang Zhang, Zhiyong Xu, Jiangyun Li |
Format: | Article |
Language: | English |
Published: | MDPI AG, 2022-04-01 |
Series: | Remote Sensing |
Subjects: | semantic segmentation, agricultural research, remote sensing, deep learning, CNN, Transformer |
Online Access: | https://www.mdpi.com/2072-4292/14/9/1956 |
_version_ | 1797503116642353152 |
author | Hong Wang, Xianzhong Chen, Tianxiang Zhang, Zhiyong Xu, Jiangyun Li
author_facet | Hong Wang, Xianzhong Chen, Tianxiang Zhang, Zhiyong Xu, Jiangyun Li
author_sort | Hong Wang |
collection | DOAJ |
description | Semantic segmentation using remote sensing images is an efficient method for agricultural crop classification. Recent solutions for crop segmentation are mainly deep-learning-based methods built on two mainstream architectures: Convolutional Neural Networks (CNNs) and Transformers. However, neither architecture alone is sufficient for the crop segmentation task, for three reasons. First, ultra-high-resolution images must be cut into small patches before processing, which breaks the edge structures of the different categories. Second, owing to the lack of global information, categories inside a crop field may be misclassified. Third, restoring the complete image requires splicing the patches back together, which causes edge artifacts, small misclassified objects, and holes. Therefore, we propose a novel architecture, the Coupled CNN and Transformer Network (CCTNet), which combines the local details (e.g., edges and texture) captured by the CNN with the global context captured by the Transformer to address these problems. In particular, two modules, the Light Adaptive Fusion Module (LAFM) and the Coupled Attention Fusion Module (CAFM), are designed to fuse these complementary strengths efficiently. Meanwhile, three effective methods, Overlapping Sliding Window (OSW), Test-Time Augmentation (TTA), and Post-Processing (PP), are embedded in the inference stage to remove small misclassified objects and holes and to restore complete images. Experimental results on the Barley Remote Sensing Dataset show that CCTNet outperforms single CNN or Transformer methods, achieving a 72.97% mean Intersection over Union (mIoU) score. Consequently, we believe that the proposed CCTNet is a competitive method for crop segmentation from remote sensing images. |
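The description names three inference-stage techniques (Overlapping Sliding Window, Test-Time Augmentation, and Post-Processing) for stitching ultra-high-resolution predictions back together. The sketch below is a minimal, hypothetical illustration of those ideas in Python with NumPy/SciPy, not the authors' released implementation; `predict_fn`, the window/stride sizes, the number of classes, and the small-object threshold are placeholders chosen for the example, and border handling is simplified.

```python
# Hypothetical sketch of overlapping sliding-window inference with flip TTA
# and a simple small-object removal pass. Not the CCTNet reference code.
import numpy as np
from scipy import ndimage  # used only for connected-component labelling


def sliding_window_predict(image, predict_fn, window=512, stride=256, num_classes=4):
    """Average per-pixel class scores over overlapping windows.

    image:      (H, W, C) array of an ultra-high-resolution tile.
    predict_fn: callable mapping a (window, window, C) patch to
                (num_classes, window, window) scores; stands in for the model.
    """
    h, w, _ = image.shape
    scores = np.zeros((num_classes, h, w), dtype=np.float32)
    counts = np.zeros((h, w), dtype=np.float32)
    # Note: the last partial row/column of windows is ignored here; real
    # pipelines pad the image or clamp the final window to the border.
    for top in range(0, max(h - window, 0) + 1, stride):
        for left in range(0, max(w - window, 0) + 1, stride):
            patch = image[top:top + window, left:left + window]
            # Horizontal-flip test-time augmentation: average the two views.
            logits = predict_fn(patch)
            logits_flip = predict_fn(patch[:, ::-1])[:, :, ::-1]
            avg = 0.5 * (logits + logits_flip)
            scores[:, top:top + window, left:left + window] += avg
            counts[top:top + window, left:left + window] += 1.0
    scores /= np.maximum(counts, 1.0)   # normalise regions covered by several windows
    return scores.argmax(axis=0)        # per-pixel class labels


def remove_small_objects(label_map, min_size=64):
    """Relabel connected components smaller than min_size pixels to background (0)."""
    cleaned = label_map.copy()
    for cls in np.unique(label_map):
        if cls == 0:
            continue
        components, count = ndimage.label(label_map == cls)
        for idx in range(1, count + 1):
            mask = components == idx
            if mask.sum() < min_size:
                cleaned[mask] = 0
    return cleaned
```

Averaging the overlapping window scores before the argmax is what suppresses seams between patches, and the post-processing pass removes the small misclassified objects and holes mentioned in the description; the same cleanup could also fill holes per class with `ndimage.binary_fill_holes`.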
first_indexed | 2024-03-10T03:45:55Z |
format | Article |
id | doaj.art-6fc6b8e76fa6448ca466b8b8266d901e |
institution | Directory Open Access Journal |
issn | 2072-4292 |
language | English |
last_indexed | 2024-03-10T03:45:55Z |
publishDate | 2022-04-01 |
publisher | MDPI AG |
record_format | Article |
series | Remote Sensing |
spelling | doaj.art-6fc6b8e76fa6448ca466b8b8266d901e2023-11-23T09:08:25ZengMDPI AGRemote Sensing2072-42922022-04-01149195610.3390/rs14091956CCTNet: Coupled CNN and Transformer Network for Crop Segmentation of Remote Sensing ImagesHong Wang0Xianzhong Chen1Tianxiang Zhang2Zhiyong Xu3Jiangyun Li4School of Automation and Electrical Engineering, University of Science and Technology Beijing, Beijing 100083, ChinaSchool of Automation and Electrical Engineering, University of Science and Technology Beijing, Beijing 100083, ChinaSchool of Automation and Electrical Engineering, University of Science and Technology Beijing, Beijing 100083, ChinaSchool of Automation and Electrical Engineering, University of Science and Technology Beijing, Beijing 100083, ChinaSchool of Automation and Electrical Engineering, University of Science and Technology Beijing, Beijing 100083, ChinaSemantic segmentation by using remote sensing images is an efficient method for agricultural crop classification. Recent solutions in crop segmentation are mainly deep-learning-based methods, including two mainstream architectures: Convolutional Neural Networks (CNNs) and Transformer. However, these two architectures are not sufficiently good for the crop segmentation task due to the following three reasons. First, the ultra-high-resolution images need to be cut into small patches before processing, which leads to the incomplete structure of different categories’ edges. Second, because of the deficiency of global information, categories inside the crop field may be wrongly classified. Third, to restore complete images, the patches need to be spliced together, causing the edge artifacts and small misclassified objects and holes. Therefore, we proposed a novel architecture named the Coupled CNN and Transformer Network (CCTNet), which combines the local details (e.g., edge and texture) by the CNN and global context by Transformer to cope with the aforementioned problems. In particular, two modules, namely the Light Adaptive Fusion Module (LAFM) and the Coupled Attention Fusion Module (CAFM), are also designed to efficiently fuse these advantages. Meanwhile, three effective methods named Overlapping Sliding Window (OSW), Testing Time Augmentation (TTA), and Post-Processing (PP) are proposed to remove small objects and holes embedded in the inference stage and restore complete images. The experimental results evaluated on the Barley Remote Sensing Dataset present that the CCTNet outperformed the single CNN or Transformer methods, achieving 72.97% mean Intersection over Union (mIoU) scores. As a consequence, it is believed that the proposed CCTNet can be a competitive method for crop segmentation by remote sensing images.https://www.mdpi.com/2072-4292/14/9/1956semantic segmentationagricultural researchremote sensingdeep learningCNNTransformer |
spellingShingle | Hong Wang Xianzhong Chen Tianxiang Zhang Zhiyong Xu Jiangyun Li CCTNet: Coupled CNN and Transformer Network for Crop Segmentation of Remote Sensing Images Remote Sensing semantic segmentation agricultural research remote sensing deep learning CNN Transformer |
title | CCTNet: Coupled CNN and Transformer Network for Crop Segmentation of Remote Sensing Images |
title_full | CCTNet: Coupled CNN and Transformer Network for Crop Segmentation of Remote Sensing Images |
title_fullStr | CCTNet: Coupled CNN and Transformer Network for Crop Segmentation of Remote Sensing Images |
title_full_unstemmed | CCTNet: Coupled CNN and Transformer Network for Crop Segmentation of Remote Sensing Images |
title_short | CCTNet: Coupled CNN and Transformer Network for Crop Segmentation of Remote Sensing Images |
title_sort | cctnet coupled cnn and transformer network for crop segmentation of remote sensing images |
topic | semantic segmentation, agricultural research, remote sensing, deep learning, CNN, Transformer |
url | https://www.mdpi.com/2072-4292/14/9/1956 |
work_keys_str_mv | AT hongwang cctnetcoupledcnnandtransformernetworkforcropsegmentationofremotesensingimages AT xianzhongchen cctnetcoupledcnnandtransformernetworkforcropsegmentationofremotesensingimages AT tianxiangzhang cctnetcoupledcnnandtransformernetworkforcropsegmentationofremotesensingimages AT zhiyongxu cctnetcoupledcnnandtransformernetworkforcropsegmentationofremotesensingimages AT jiangyunli cctnetcoupledcnnandtransformernetworkforcropsegmentationofremotesensingimages |