Summary: | In recent years, the development of smart transportation has accelerated research on semantic segmentation, one of the most important problems in this area. A large receptive field has long been a central concern when designing convolutional neural networks for semantic segmentation. Most recent techniques use max pooling to increase the receptive field of a network at the expense of its spatial resolution. Although this approach has improved results in object detection, semantic segmentation also requires high spatial resolution. To address this issue, this paper proposes a new deep learning model, M-Net, which provides both high spatial resolution and a sufficiently large receptive field while keeping the size of the model to a minimum. The proposed network is based on an encoder-decoder architecture. The encoder uses atrous convolution to encode features at full resolution, and, instead of heavy transposed convolutions, the decoder consists of a multipath feature extraction module that extracts multiscale context information from the encoded features. The experimental results reported in the paper demonstrate the viability of the proposed scheme.
|
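The trade-off the summary describes, receptive field versus spatial resolution, can be illustrated with a short calculation. The sketch below is not M-Net itself: the layer stacks, kernel sizes, and dilation rates are hypothetical values chosen only to show how atrous (dilated) convolution enlarges the receptive field at stride 1, while pooling enlarges it by downsampling. The helper name `receptive_field` is likewise an assumption made for this example.

```python
def receptive_field(layers):
    """Return (receptive_field, output_stride) for a stack of layers.

    Each layer is a (kernel_size, stride, dilation) tuple, applied in order.
    Uses the standard recurrence: every layer adds (k - 1) * dilation
    input-layer steps, scaled by the cumulative stride so far.
    """
    rf = 1      # receptive field measured in input pixels
    jump = 1    # cumulative stride (distance between adjacent outputs)
    for kernel, stride, dilation in layers:
        rf += (kernel - 1) * dilation * jump
        jump *= stride
    return rf, jump


# Hypothetical atrous stack: three 3x3 convolutions, dilations 1, 2, 4.
atrous_stack = [(3, 1, 1), (3, 1, 2), (3, 1, 4)]

# Hypothetical pooling stack: 3x3 convolutions interleaved with 2x2 max pooling.
pooled_stack = [(3, 1, 1), (2, 2, 1), (3, 1, 1), (2, 2, 1), (3, 1, 1)]

atrous_rf, atrous_stride = receptive_field(atrous_stack)
pooled_rf, pooled_stride = receptive_field(pooled_stack)

print(f"atrous: receptive field {atrous_rf}, output stride {atrous_stride}")
print(f"pooled: receptive field {pooled_rf}, output stride {pooled_stride}")
```

Under these assumed settings the two stacks reach receptive fields of similar size, but only the atrous stack keeps an output stride of 1, meaning the feature map stays at the input resolution.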