DMPNet: Distributed Multi-Scale Pyramid Network for Real-Time Semantic Segmentation
In semantic segmentation, an input image is partitioned into multiple meaningful segments, each corresponding to a specific object or region. Multi-scale context plays a vital role in accurately recognizing objects of different sizes and is therefore key to improving overall accuracy. To achieve t...
Main Authors: | Nadeem Atif, Saquib Mazhar, Shaik Rafi Ahamed, M. K. Bhuyan, Sultan Alfarhood, Mejdl Safran |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE 2024-01-01 |
Series: | IEEE Access |
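Subjects: | Semantic segmentation, deep learning, real-time processing, autonomous driving, resource-constrained |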
Online Access: | https://ieeexplore.ieee.org/document/10415438/ |
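
The abstract credits multi-scale context for recognizing objects of different sizes, and the description field of this record names pyramid pooling, applied after each encoder stage, as the mechanism. The following is a minimal pure-Python sketch of that idea, not the authors' implementation. The bin sizes `(1, 2, 3, 6)`, the stage resolutions, and all function names are assumptions chosen for illustration; the 1x1 convolutions, upsampling, and concatenation that a full pyramid pooling module would add are omitted.

```python
def adaptive_avg_pool(feature, bins):
    """Average-pool a 2D feature map (a list of rows) down to a bins x bins grid."""
    height, width = len(feature), len(feature[0])
    pooled = []
    for i in range(bins):
        r0 = (i * height) // bins                # floor of the window start
        r1 = -(-((i + 1) * height) // bins)      # ceiling of the window end
        row = []
        for j in range(bins):
            c0 = (j * width) // bins
            c1 = -(-((j + 1) * width) // bins)
            cells = [feature[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            row.append(sum(cells) / len(cells))
        pooled.append(row)
    return pooled


def pyramid_pool(feature, bin_sizes=(1, 2, 3, 6)):
    """Pool one feature map at several grid sizes (bin sizes are an assumption)."""
    return {bins: adaptive_avg_pool(feature, bins) for bins in bin_sizes}


def make_map(size):
    """Build a hypothetical size x size feature map with distinct values."""
    return [[float(r * size + c) for c in range(size)] for r in range(size)]


# "Distributed" use: pool the output of every encoder stage, not only the last.
# The three resolutions below are placeholders, not values from the paper.
stages = {"stage1": make_map(24), "stage2": make_map(12), "stage3": make_map(6)}
context = {name: pyramid_pool(fmap) for name, fmap in stages.items()}

for name, pyramid in context.items():
    print(name, {bins: len(grid) for bins, grid in pyramid.items()})
```

The coarsest grid (one bin) summarizes the whole map, while finer grids keep more spatial detail, which is how a pyramid exposes context at several scales from a single feature map.
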
_version_ | 1797323909088935936 |
---|---|
author | Nadeem Atif, Saquib Mazhar, Shaik Rafi Ahamed, M. K. Bhuyan, Sultan Alfarhood, Mejdl Safran |
author_facet | Nadeem Atif, Saquib Mazhar, Shaik Rafi Ahamed, M. K. Bhuyan, Sultan Alfarhood, Mejdl Safran |
author_sort | Nadeem Atif |
collection | DOAJ |
description | In semantic segmentation, an input image is partitioned into multiple meaningful segments, each corresponding to a specific object or region. Multi-scale context plays a vital role in accurately recognizing objects of different sizes and is therefore key to improving overall accuracy. To achieve this goal, we introduce a novel strategy called Distributed Multi-scale Pyramid Pooling (DMPP) to extract multi-scale context at multiple levels of the feature hierarchy. More specifically, we employ Pyramid Pooling Modules (PPM) in a distributed fashion after each of the three stages of the encoding phase. This enhances the feature representation capability of the network and leads to better performance. To extract context at a more granular level, we propose an Efficient Multi-scale Context Aggregation (EMCA) module, which combines small kernels with large dilation rates and large kernels with small dilation rates. This alleviates the problem of sparse sampling and leads to consistent recognition of different regions. Apart from model accuracy, small model size and efficient execution are critically important for real-time mobile applications. To achieve both, we employ a resource-friendly combination of depthwise and factorized convolutions in the EMCA module, drastically reducing the number of parameters without significantly compromising accuracy. Based on the EMCA module and DMPP, we propose a lightweight, real-time Distributed Multi-scale Pyramid Network (DMPNet) that achieves an excellent accuracy-efficiency trade-off. We conducted extensive experiments on two driving datasets (Cityscapes and CamVid) and a general-purpose dataset (ADE20K) to show the effectiveness of the proposed method. |
first_indexed | 2024-03-08T05:34:54Z |
format | Article |
id | doaj.art-4f47458f4b254a07b9c3330bb2d00541 |
institution | Directory Open Access Journal |
issn | 2169-3536 |
language | English |
last_indexed | 2024-03-08T05:34:54Z |
publishDate | 2024-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Access |
spelling | doaj.art-4f47458f4b254a07b9c3330bb2d00541; 2024-02-06T00:01:05Z; eng; IEEE; IEEE Access; 2169-3536; 2024-01-01; vol. 12, pp. 16573-16585; 10.1109/ACCESS.2024.3359425; 10415438; DMPNet: Distributed Multi-Scale Pyramid Network for Real-Time Semantic Segmentation; Nadeem Atif (https://orcid.org/0000-0003-1551-1885; Indian Institute of Technology Guwahati, Assam, Guwahati, India); Saquib Mazhar (https://orcid.org/0000-0002-3210-2648; Indian Institute of Technology Guwahati, Assam, Guwahati, India); Shaik Rafi Ahamed (https://orcid.org/0000-0003-1617-2299; Indian Institute of Technology Guwahati, Assam, Guwahati, India); M. K. Bhuyan (https://orcid.org/0000-0003-2152-5466; Indian Institute of Technology Guwahati, Assam, Guwahati, India); Sultan Alfarhood (https://orcid.org/0009-0001-1268-9613; Department of Computer Science, College of Computer and Information Sciences, King Saud University, Riyadh, Saudi Arabia); Mejdl Safran (https://orcid.org/0000-0002-7445-7121; Department of Computer Science, College of Computer and Information Sciences, King Saud University, Riyadh, Saudi Arabia); https://ieeexplore.ieee.org/document/10415438/; Semantic segmentation, deep learning, real-time processing, autonomous driving, resource-constrained |
spellingShingle | Nadeem Atif Saquib Mazhar Shaik Rafi Ahamed M. K. Bhuyan Sultan Alfarhood Mejdl Safran DMPNet: Distributed Multi-Scale Pyramid Network for Real-Time Semantic Segmentation IEEE Access Semantic segmentation deep learning real-time processing autonomous driving resource-constrained |
title | DMPNet: Distributed Multi-Scale Pyramid Network for Real-Time Semantic Segmentation |
title_full | DMPNet: Distributed Multi-Scale Pyramid Network for Real-Time Semantic Segmentation |
title_fullStr | DMPNet: Distributed Multi-Scale Pyramid Network for Real-Time Semantic Segmentation |
title_full_unstemmed | DMPNet: Distributed Multi-Scale Pyramid Network for Real-Time Semantic Segmentation |
title_short | DMPNet: Distributed Multi-Scale Pyramid Network for Real-Time Semantic Segmentation |
title_sort | dmpnet distributed multi scale pyramid network for real time semantic segmentation |
topic | Semantic segmentation, deep learning, real-time processing, autonomous driving, resource-constrained |
url | https://ieeexplore.ieee.org/document/10415438/ |
work_keys_str_mv | AT nadeematif dmpnetdistributedmultiscalepyramidnetworkforrealtimesemanticsegmentation AT saquibmazhar dmpnetdistributedmultiscalepyramidnetworkforrealtimesemanticsegmentation AT shaikrafiahamed dmpnetdistributedmultiscalepyramidnetworkforrealtimesemanticsegmentation AT mkbhuyan dmpnetdistributedmultiscalepyramidnetworkforrealtimesemanticsegmentation AT sultanalfarhood dmpnetdistributedmultiscalepyramidnetworkforrealtimesemanticsegmentation AT mejdlsafran dmpnetdistributedmultiscalepyramidnetworkforrealtimesemanticsegmentation |
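
The description field states that the EMCA module pairs small kernels with large dilation rates and large kernels with small dilation rates, and that depthwise and factorized convolutions cut the parameter count. The arithmetic behind both claims can be sketched as follows. The kernel sizes, dilation rates, and channel width are illustrative assumptions, not figures from the paper, the function names are hypothetical, and bias terms are ignored.

```python
def effective_extent(kernel, dilation):
    """Span in pixels covered by a dilated 1D kernel: k + (k - 1) * (d - 1)."""
    return kernel + (kernel - 1) * (dilation - 1)


def skipped_between_taps(dilation):
    """Pixels left unsampled between two neighbouring kernel taps."""
    return dilation - 1


def standard_conv_params(c_in, c_out, kernel):
    """Weights in a dense k x k convolution."""
    return c_in * c_out * kernel * kernel


def depthwise_factorized_params(channels, c_out, kernel):
    """Weights in a depthwise k x 1 conv, a depthwise 1 x k conv, and a 1x1 pointwise conv."""
    depthwise = channels * kernel * 2   # one k x 1 and one 1 x k filter per channel
    pointwise = channels * c_out        # 1x1 conv that mixes channels
    return depthwise + pointwise


# Assumed pairing: a 3-tap kernel at dilation 4 and a 5-tap kernel at dilation 2.
for kernel, dilation in [(3, 4), (5, 2)]:
    print(
        f"k={kernel}, d={dilation}: extent={effective_extent(kernel, dilation)}, "
        f"skipped={skipped_between_taps(dilation)}"
    )

# Assumed layer width of 128 channels and a 3 x 3 kernel.
dense = standard_conv_params(128, 128, 3)
light = depthwise_factorized_params(128, 128, 3)
print(f"dense={dense}, depthwise+factorized={light}, ratio={dense / light:.1f}")
```

Under these assumed values both branches cover the same 9-pixel span, but the larger kernel samples it more densely, which is one way to read the abstract's remark about sparse sampling.
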