MU-Net: Embedding MixFormer into Unet to Extract Water Bodies from Remote Sensing Images
Water body extraction is important for water resource utilization and for flood prevention and mitigation. Remote sensing images contain rich information, but complex spatial background features and noise interference lead to problems such as inaccurate tributary extraction and inaccurate segmentat...
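Main Authors: | Yonghong Zhang, Huanyu Lu, Guangyi Ma, Huajun Zhao, Donglin Xie, Sutong Geng, Wei Tian, Kenny Thiam Choy Lim Kam Sian |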
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2023-07-01 |
Series: | Remote Sensing |
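Subjects: | attention mechanism; convolutional neural network; MixFormer; remote sensing; semantic segmentation; Transformer |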
Online Access: | https://www.mdpi.com/2072-4292/15/14/3559 |
_version_ | 1797587625001877504 |
---|---|
author | Yonghong Zhang; Huanyu Lu; Guangyi Ma; Huajun Zhao; Donglin Xie; Sutong Geng; Wei Tian; Kenny Thiam Choy Lim Kam Sian |
author_facet | Yonghong Zhang; Huanyu Lu; Guangyi Ma; Huajun Zhao; Donglin Xie; Sutong Geng; Wei Tian; Kenny Thiam Choy Lim Kam Sian |
author_sort | Yonghong Zhang |
collection | DOAJ |
description | Water body extraction is important for water resource utilization and for flood prevention and mitigation. Remote sensing images contain rich information, but complex spatial background features and noise interference lead to problems such as inaccurate tributary extraction and inaccurate segmentation when extracting water bodies. Convolutional neural networks (CNNs) have recently become popular for water body extraction. However, the local nature of convolution limits the extraction of global information, whereas the Transformer, with its self-attention mechanism, has great potential for modeling global information. This paper proposes MU-Net, a hybrid MixFormer architecture, as a novel method for automatically extracting water bodies. First, the MixFormer block is embedded into Unet. The combination of CNN and MixFormer models both the local spatial detail and the global contextual information of the image, improving the network's ability to capture semantic features of water bodies. Then, the features generated by the encoder are refined by an attention mechanism module to suppress interference from image background noise and non-water-body features, which further improves the accuracy of water body extraction. Experiments show that the method achieves higher segmentation accuracy and more robust performance than mainstream CNN- and Transformer-based semantic segmentation networks. The proposed MU-Net achieves an IoU of 90.25% on the GID dataset and 76.52% on the LoveDA dataset. The experimental results also validate the potential of MixFormer in water extraction studies. |
first_indexed | 2024-03-11T00:41:33Z |
format | Article |
id | doaj.art-f5146bec59af4b3dbfa2cd442dbfcec8 |
institution | Directory Open Access Journal |
issn | 2072-4292 |
language | English |
last_indexed | 2024-03-11T00:41:33Z |
publishDate | 2023-07-01 |
publisher | MDPI AG |
record_format | Article |
series | Remote Sensing |
spelling | doaj.art-f5146bec59af4b3dbfa2cd442dbfcec8; 2023-11-18T21:12:29Z; eng; MDPI AG; Remote Sensing; ISSN 2072-4292; 2023-07-01; vol. 15, no. 14, art. 3559; DOI 10.3390/rs15143559; MU-Net: Embedding MixFormer into Unet to Extract Water Bodies from Remote Sensing Images; Yonghong Zhang (0), Huanyu Lu (1), Guangyi Ma (2), Huajun Zhao (3), Donglin Xie (4), Sutong Geng (5), Wei Tian (6), Kenny Thiam Choy Lim Kam Sian (7); (0), (1), (3), (4), (5) School of Automation, Nanjing University of Information Science and Technology, Nanjing 210044, China; (2) School of Electronics and Information Engineering, Nanjing University of Information Science and Technology, Nanjing 210044, China; (6) School of Computer Science, Nanjing University of Information Science and Technology, Nanjing 210044, China; (7) School of Atmospheric Science and Remote Sensing, Wuxi University, Wuxi 214105, China; https://www.mdpi.com/2072-4292/15/14/3559; attention mechanism; convolutional neural network; MixFormer; remote sensing; semantic segmentation; Transformer |
title | MU-Net: Embedding MixFormer into Unet to Extract Water Bodies from Remote Sensing Images |
title_full | MU-Net: Embedding MixFormer into Unet to Extract Water Bodies from Remote Sensing Images |
title_fullStr | MU-Net: Embedding MixFormer into Unet to Extract Water Bodies from Remote Sensing Images |
title_full_unstemmed | MU-Net: Embedding MixFormer into Unet to Extract Water Bodies from Remote Sensing Images |
title_short | MU-Net: Embedding MixFormer into Unet to Extract Water Bodies from Remote Sensing Images |
title_sort | mu net embedding mixformer into unet to extract water bodies from remote sensing images |
topic | attention mechanism; convolutional neural network; MixFormer; remote sensing; semantic segmentation; Transformer |
url | https://www.mdpi.com/2072-4292/15/14/3559 |
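
The description above reports results as IoU (intersection over union) percentages but does not say how the metric is computed. As a rough illustration only, the following Python sketch applies the standard IoU definition to two small binary water masks. The function name `binary_iou`, the toy masks, and the convention for two empty masks are assumptions made for this example; the paper's own evaluation code may differ (for instance, by averaging IoU over classes or images).

```python
from typing import Sequence


def binary_iou(pred: Sequence[Sequence[int]], truth: Sequence[Sequence[int]]) -> float:
    """Intersection over union of two binary masks (1 = water, 0 = background)."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of rows")

    intersection = 0  # pixels marked water in both masks
    union = 0         # pixels marked water in at least one mask
    for pred_row, truth_row in zip(pred, truth):
        if len(pred_row) != len(truth_row):
            raise ValueError("masks must have the same number of columns")
        for p, t in zip(pred_row, truth_row):
            if p and t:
                intersection += 1
            if p or t:
                union += 1

    # Assumed convention: two masks with no water pixels agree perfectly.
    return intersection / union if union else 1.0


# Hypothetical 3x4 masks, invented for illustration.
predicted = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
]
reference = [
    [1, 1, 1, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]

iou = binary_iou(predicted, reference)
print(f"IoU = {iou:.2%}")  # 3 shared water pixels / 5 in the union
```

In this toy case three pixels are water in both masks and five are water in at least one, so the IoU is 3/5. A figure such as 90.25% would correspond to the same ratio accumulated over a full test set.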