Lite‐weight semantic segmentation with AG self‐attention
Abstract Due to the large computational and GPU memory cost of semantic segmentation, some works focus on designing a lightweight model to achieve a good trade‐off between computational cost and accuracy. A common approach is to combine a CNN with a vision transformer. However, these methods ignore the c...
Main Authors: | Bing Liu, Yansheng Gao, Hai Li, Zhaohao Zhong, Hongwei Zhao |
---|---|
Format: | Article |
Language: | English |
Published: | Wiley, 2024-02-01 |
Series: | IET Computer Vision |
Subjects: | computational complexity, convolutional neural nets |
Online Access: | https://doi.org/10.1049/cvi2.12225 |
_version_ | 1797320622312783872 |
---|---|
author | Bing Liu Yansheng Gao Hai Li Zhaohao Zhong Hongwei Zhao |
author_facet | Bing Liu Yansheng Gao Hai Li Zhaohao Zhong Hongwei Zhao |
author_sort | Bing Liu |
collection | DOAJ |
description | Abstract Due to the large computational and GPU memory cost of semantic segmentation, some works focus on designing a lightweight model to achieve a good trade‐off between computational cost and accuracy. A common approach is to combine a CNN with a vision transformer. However, these methods ignore the contextual information of multiple receptive fields, and existing methods often fail to compensate for the detailed information lost during the downsampling of multi‐scale features. To address these issues, we propose AG Self‐Attention, which consists of Enhanced Atrous Self‐Attention (EASA) and Gate Attention. AG Self‐Attention adds the contextual information of multiple receptive fields to the global semantic feature. Specifically, EASA uses weight‐shared atrous convolutions with different atrous rates to capture contextual information under specific receptive fields. Gate Attention introduces a gating mechanism that injects detailed information into the global semantic feature and filters it by producing a “fusion” gate and an “update” gate. To validate our insight, we conduct extensive experiments on common semantic segmentation datasets, namely ADE20K, COCO‐Stuff, PASCAL Context and Cityscapes, showing that our method achieves state‐of‐the‐art performance and a good trade‐off between computational cost and accuracy. |
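The paper itself is not reproduced in this record, but the two mechanisms named in the abstract can be illustrated with a toy 1-D NumPy sketch: a weight-shared atrous (dilated) convolution applied at several rates, and a sigmoid-gated fusion of a global feature with a detail feature. All function names, the averaging fusion, and the scalar gate weights below are assumptions made for illustration, not the authors' actual formulation.

```python
import numpy as np

def atrous_conv1d(x, w, rate):
    """1-D convolution of x with kernel w at dilation `rate` (valid span only)."""
    k = len(w)
    span = (k - 1) * rate
    out = np.zeros(len(x) - span)
    for i in range(len(out)):
        out[i] = sum(w[j] * x[i + j * rate] for j in range(k))
    return out

def multi_rate_context(x, w, rates=(1, 2, 4)):
    """Weight-shared atrous convs: the SAME kernel w at several dilation rates,
    cropped to a common length and averaged into one context feature."""
    outs = [atrous_conv1d(x, w, r) for r in rates]
    n = min(len(o) for o in outs)
    return np.mean([o[:n] for o in outs], axis=0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gate_fuse(global_feat, detail_feat, wf, wu):
    """Gate-style fusion: a 'fusion' gate filters the detail feature and an
    'update' gate decides how much of it to blend into the global feature."""
    f = sigmoid(wf * detail_feat)   # fusion gate: which details to keep
    u = sigmoid(wu * global_feat)   # update gate: how much to update
    return u * global_feat + (1.0 - u) * (f * detail_feat)
```

Sharing one kernel across dilation rates (as EASA is described as doing) keeps the parameter count constant while widening the effective receptive field, which is the lightweight trade-off the abstract emphasizes.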
first_indexed | 2024-03-08T04:45:39Z |
format | Article |
id | doaj.art-43dc2eb0e67240cca983f6a978572ab0 |
institution | Directory Open Access Journal |
issn | 1751-9632 1751-9640 |
language | English |
last_indexed | 2024-03-08T04:45:39Z |
publishDate | 2024-02-01 |
publisher | Wiley |
record_format | Article |
series | IET Computer Vision |
spelling | doaj.art-43dc2eb0e67240cca983f6a978572ab0 2024-02-08T10:33:59Z eng Wiley IET Computer Vision 1751-9632 1751-9640 2024-02-01 Vol. 18, Iss. 1, pp. 72–83 10.1049/cvi2.12225 Lite‐weight semantic segmentation with AG self‐attention Bing Liu 0 Yansheng Gao 1 Hai Li 2 Zhaohao Zhong 3 Hongwei Zhao 4 College of Computer Science and Technology, Jilin University, Changchun, China; College of Computer Science and Engineering, Changchun University of Technology, Changchun, China https://doi.org/10.1049/cvi2.12225 computational complexity; convolutional neural nets |
spellingShingle | Bing Liu Yansheng Gao Hai Li Zhaohao Zhong Hongwei Zhao Lite‐weight semantic segmentation with AG self‐attention IET Computer Vision computational complexity convolutional neural nets |
title | Lite‐weight semantic segmentation with AG self‐attention |
title_full | Lite‐weight semantic segmentation with AG self‐attention |
title_fullStr | Lite‐weight semantic segmentation with AG self‐attention |
title_full_unstemmed | Lite‐weight semantic segmentation with AG self‐attention |
title_short | Lite‐weight semantic segmentation with AG self‐attention |
title_sort | lite weight semantic segmentation with ag self attention |
topic | computational complexity convolutional neural nets |
url | https://doi.org/10.1049/cvi2.12225 |
work_keys_str_mv | AT bingliu liteweightsemanticsegmentationwithagselfattention AT yanshenggao liteweightsemanticsegmentationwithagselfattention AT haili liteweightsemanticsegmentationwithagselfattention AT zhaohaozhong liteweightsemanticsegmentationwithagselfattention AT hongweizhao liteweightsemanticsegmentationwithagselfattention |