DNAS: Decoupling Neural Architecture Search for High-Resolution Remote Sensing Image Semantic Segmentation


Bibliographic Details
Main Authors: Yu Wang, Yansheng Li, Wei Chen, Yunzhou Li, Bo Dang
Format: Article
Language: English
Published: MDPI AG 2022-08-01
Series: Remote Sensing
Online Access: https://www.mdpi.com/2072-4292/14/16/3864
Description
Summary: Deep learning methods, especially deep convolutional neural networks (DCNNs), have been widely used in high-resolution remote sensing image (HRSI) semantic segmentation. In the literature, most successful DCNNs are designed manually through a large number of experiments, which is time-consuming and depends on rich domain knowledge. Recently, neural architecture search (NAS), an approach for automatically designing network architectures, has achieved great success in a variety of computer vision tasks. For HRSI semantic segmentation, NAS faces two major challenges: (1) the task’s high complexity, caused by the pixel-by-pixel prediction that semantic segmentation demands, leads to a rapid expansion of the search space; (2) HRSI semantic segmentation often needs to exploit long-range dependencies (i.e., a large spatial context), which means the NAS technique requires a large amount of GPU memory during optimization and can be difficult to converge. With these considerations in mind, we propose a new decoupling NAS (DNAS) framework to automatically design the network architecture for HRSI semantic segmentation. In DNAS, a hierarchical search space with three levels is proposed: path-level, connection-level, and cell-level. To adapt to this hierarchical search space, we devise a new decoupling search optimization strategy that reduces memory consumption. More specifically, the search optimization strategy consists of three stages: (1) a light super-net (i.e., the specific search space) in the path-level space is trained to obtain the optimal path coding; (2) the optimal path is endowed with various cross-layer connections and trained to obtain the connection coding; (3) the super-net, initialized with the path coding and connection coding, is populated with various concrete cell operators, and the optimal cell operators are finally determined.
It is worth noting that the well-designed search space can cover various network candidates and that the optimization process can be carried out efficiently. Extensive experiments on the publicly available GID and FU datasets showed that DNAS outperformed state-of-the-art methods, including manually designed networks and NAS methods.
ISSN: 2072-4292
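
The three-stage decoupled search described in the summary can be illustrated with a small toy sketch. This is not the authors' implementation: the real method trains super-nets, whereas here each stage is an exhaustive search over a hand-written scoring function. The layer count, downsampling rates, operator names, `TARGET_PATH`, and all three scoring functions are hypothetical stand-ins chosen only to show how each stage fixes one level of the hierarchy before the next level is searched.

```python
from itertools import combinations, product

# Hypothetical toy search space; sizes and names are illustrative assumptions.
NUM_LAYERS = 4
RATES = (4, 8, 16)  # path level: one downsampling rate per layer
OPS = ("conv3x3", "sep_conv3x3", "dilated_conv3x3")  # cell level
# Connection level: optional skip links between non-adjacent layers.
SKIP_PAIRS = [(i, j) for i, j in combinations(range(NUM_LAYERS), 2) if j - i >= 2]

# Made-up "ideal" path that the toy path score rewards.
TARGET_PATH = (4, 8, 16, 8)


def path_score(path):
    """Stand-in for training a light path-level super-net."""
    return -sum(abs(r - t) for r, t in zip(path, TARGET_PATH))


def connection_score(path, conn):
    """Stand-in for stage 2: reward skips that link equal-resolution layers."""
    return sum(
        1.0 if path[i] == path[j] else -0.5
        for (i, j), on in zip(SKIP_PAIRS, conn)
        if on
    )


def preferred_op(layer, path, conn):
    """Made-up rule tying the best operator to the earlier two codings."""
    if path[layer] == 16:
        return "dilated_conv3x3"
    receives_skip = any(on and j == layer for (_, j), on in zip(SKIP_PAIRS, conn))
    return "conv3x3" if receives_skip else "sep_conv3x3"


def cell_score(path, conn, ops):
    """Stand-in for stage 3: count layers whose operator matches the rule."""
    return sum(op == preferred_op(k, path, conn) for k, op in enumerate(ops))


def decoupled_search():
    # Stage 1: search the path level alone.
    paths = list(product(RATES, repeat=NUM_LAYERS))
    best_path = max(paths, key=path_score)

    # Stage 2: fix the path, search cross-layer connections.
    conns = list(product((0, 1), repeat=len(SKIP_PAIRS)))
    best_conn = max(conns, key=lambda c: connection_score(best_path, c))

    # Stage 3: fix path and connections, search cell operators.
    op_sets = list(product(OPS, repeat=NUM_LAYERS))
    best_ops = max(op_sets, key=lambda o: cell_score(best_path, best_conn, o))

    evaluated = len(paths) + len(conns) + len(op_sets)  # decoupled cost
    joint = len(paths) * len(conns) * len(op_sets)      # joint-space size
    return best_path, best_conn, best_ops, evaluated, joint


if __name__ == "__main__":
    path, conn, ops, evaluated, joint = decoupled_search()
    print("path coding:", path)
    print("connection coding:", conn)
    print("cell operators:", ops)
    print("candidates scored:", evaluated, "vs joint space:", joint)
```

In this toy setting the decoupled procedure scores 170 candidates, while the joint space holds 52,488 combinations. That gap mirrors, in a much simplified form, why searching one level at a time can keep the optimization cost down.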