DMPNet: Distributed Multi-Scale Pyramid Network for Real-Time Semantic Segmentation

Bibliographic Details
Main Authors: Nadeem Atif, Saquib Mazhar, Shaik Rafi Ahamed, M. K. Bhuyan, Sultan Alfarhood, Mejdl Safran
Format: Article
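Language: English
Published: IEEE 2024-01-01
Series: IEEE Access
Subjects: Semantic segmentation; deep learning; real-time processing; autonomous driving; resource-constrained
Online Access: https://ieeexplore.ieee.org/document/10415438/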
collection DOAJ
description In semantic segmentation, an input image is partitioned into multiple meaningful segments, each corresponding to a specific object or region. Multi-scale context plays a vital role in accurately recognizing objects of different sizes and is therefore key to improving overall accuracy. To this end, we introduce a novel strategy called Distributed Multi-scale Pyramid Pooling (DMPP) to extract multi-scale context at multiple levels of the feature hierarchy. More specifically, we employ Pyramid Pooling Modules (PPM) in a distributed fashion after all three stages of the encoding phase. This enhances the feature representation capability of the network and leads to better performance. To extract context at a more granular level, we propose an Efficient Multi-scale Context Aggregation (EMCA) module, which combines small kernels with large dilation rates and large kernels with small dilation rates. This alleviates the problem of sparse sampling and leads to consistent recognition of different regions. Apart from model accuracy, small model size and efficient execution are critically important for real-time mobile applications. To achieve this, we employ a resource-friendly combination of depthwise and factorized convolutions in the EMCA module to drastically reduce the number of parameters without significantly compromising accuracy. Based on the EMCA module and DMPP, we propose a lightweight, real-time Distributed Multi-scale Pyramid Network (DMPNet) that achieves an excellent accuracy-efficiency trade-off. We also conduct extensive experiments on two driving datasets (Cityscapes and CamVid) and a general-purpose dataset (ADE20K) to show the effectiveness of the proposed method.
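The two efficiency ideas in the description can be illustrated with simple arithmetic. The sketch below is not the authors' implementation: the channel count (64), the kernel sizes (3 and 7), the dilation rates (3 and 1), and the trailing 1x1 pointwise mixing layer are all assumptions chosen for illustration, and bias terms are ignored. It shows that a small dilated kernel and a large undilated kernel can span the same extent while sampling it at different densities, and how far a depthwise, factorized convolution reduces the parameter count relative to a standard one.

```python
def effective_extent(kernel_size: int, dilation: int) -> int:
    """Span (in pixels, along one axis) covered by a dilated kernel."""
    return dilation * (kernel_size - 1) + 1


def standard_conv_params(c_in: int, c_out: int, kernel_size: int) -> int:
    """Weights in a standard k x k convolution (bias ignored)."""
    return c_in * c_out * kernel_size * kernel_size


def depthwise_factorized_params(channels: int, kernel_size: int) -> int:
    """Weights in a depthwise k x 1 conv followed by a depthwise 1 x k conv.

    Each channel has its own k-tap filter in each of the two layers.
    """
    return 2 * channels * kernel_size


def pointwise_params(c_in: int, c_out: int) -> int:
    """Weights in a 1 x 1 convolution that mixes channels (bias ignored)."""
    return c_in * c_out


# Hypothetical settings, not taken from the paper.
CHANNELS = 64

# Small kernel with large dilation vs. large kernel with small dilation:
# both span 7 pixels, but with 3 taps versus 7 taps per axis.
sparse_span = effective_extent(kernel_size=3, dilation=3)
dense_span = effective_extent(kernel_size=7, dilation=1)

# Parameter cost of a 7 x 7 layer on 64 channels, two ways.
standard = standard_conv_params(CHANNELS, CHANNELS, 7)
efficient = depthwise_factorized_params(CHANNELS, 7) + pointwise_params(CHANNELS, CHANNELS)

if __name__ == "__main__":
    print(f"span (k=3, d=3): {sparse_span}, span (k=7, d=1): {dense_span}")
    print(f"standard 7x7: {standard} weights")
    print(f"depthwise factorized + pointwise: {efficient} weights")
```

Under these assumed settings the standard layer has 200704 weights and the depthwise factorized variant with pointwise mixing has 4992, roughly a 40-fold reduction; the actual savings in DMPNet depend on its real layer configuration.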
first_indexed 2024-03-08T05:34:54Z
format Article
id doaj.art-4f47458f4b254a07b9c3330bb2d00541
institution Directory Open Access Journal
issn 2169-3536
language English
last_indexed 2024-03-08T05:34:54Z
publishDate 2024-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj.art-4f47458f4b254a07b9c3330bb2d00541
2024-02-06T00:01:05Z
eng
IEEE
IEEE Access
2169-3536
2024-01-01
Volume 12, pages 16573-16585
DOI: 10.1109/ACCESS.2024.3359425
Document number: 10415438
DMPNet: Distributed Multi-Scale Pyramid Network for Real-Time Semantic Segmentation
Nadeem Atif (https://orcid.org/0000-0003-1551-1885), Indian Institute of Technology Guwahati, Assam, Guwahati, India
Saquib Mazhar (https://orcid.org/0000-0002-3210-2648), Indian Institute of Technology Guwahati, Assam, Guwahati, India
Shaik Rafi Ahamed (https://orcid.org/0000-0003-1617-2299), Indian Institute of Technology Guwahati, Assam, Guwahati, India
M. K. Bhuyan (https://orcid.org/0000-0003-2152-5466), Indian Institute of Technology Guwahati, Assam, Guwahati, India
Sultan Alfarhood (https://orcid.org/0009-0001-1268-9613), Department of Computer Science, College of Computer and Information Sciences, King Saud University, Riyadh, Saudi Arabia
Mejdl Safran (https://orcid.org/0000-0002-7445-7121), Department of Computer Science, College of Computer and Information Sciences, King Saud University, Riyadh, Saudi Arabia
https://ieeexplore.ieee.org/document/10415438/
Semantic segmentation; deep learning; real-time processing; autonomous driving; resource-constrained
title DMPNet: Distributed Multi-Scale Pyramid Network for Real-Time Semantic Segmentation
topic Semantic segmentation
deep learning
real-time processing
autonomous driving
resource-constrained
url https://ieeexplore.ieee.org/document/10415438/