Lite‐weight semantic segmentation with AG self‐attention
Abstract Due to the large computational and GPU memory cost of semantic segmentation, some works focus on designing a lightweight model to achieve a good trade‐off between computational cost and accuracy. A common approach is to combine a CNN with a vision transformer. However, these methods ignore the c...
Main Authors: | Bing Liu, Yansheng Gao, Hai Li, Zhaohao Zhong, Hongwei Zhao |
---|---|
Format: | Article |
Language: | English |
Published: | Wiley, 2024-02-01 |
Series: | IET Computer Vision |
Subjects: | computational complexity, convolutional neural nets |
Online Access: | https://doi.org/10.1049/cvi2.12225 |
_version_ | 1797320622312783872 |
---|---|
author | Bing Liu Yansheng Gao Hai Li Zhaohao Zhong Hongwei Zhao |
author_facet | Bing Liu Yansheng Gao Hai Li Zhaohao Zhong Hongwei Zhao |
author_sort | Bing Liu |
collection | DOAJ |
description | Abstract Due to the large computational and GPU memory cost of semantic segmentation, some works focus on designing a lightweight model to achieve a good trade‐off between computational cost and accuracy. A common approach is to combine a CNN with a vision transformer. However, these methods ignore the contextual information of multiple receptive fields, and existing methods often fail to compensate for the detailed information lost during the downsampling of multi‐scale features. To address these issues, we propose AG Self‐Attention, which consists of Enhanced Atrous Self‐Attention (EASA) and Gate Attention. AG Self‐Attention adds the contextual information of multiple receptive fields to the global semantic feature. Specifically, EASA uses weight‐shared atrous convolutions with different atrous rates to capture contextual information under specific receptive fields. Gate Attention introduces a gating mechanism that injects detailed information into the global semantic feature and filters it by producing a “fusion” gate and an “update” gate. To validate our insight, we conduct extensive experiments on common semantic segmentation datasets, namely ADE20K, COCO‐Stuff, PASCAL Context and Cityscapes, showing that our method achieves state‐of‐the‐art performance and a good trade‐off between computational cost and accuracy. |
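The paper itself is not reproduced in this record, but the two mechanisms named in the abstract can be illustrated with a toy 1-D NumPy sketch: a weight-shared atrous (dilated) convolution applied at several rates, and a sigmoid-gated fusion of a global feature with a detail feature. All function names, the averaging fusion, and the scalar gate weights below are assumptions made for illustration, not the authors' actual formulation.

```python
import numpy as np

def atrous_conv1d(x, w, rate):
    """1-D convolution of x with kernel w at dilation `rate` (valid span only)."""
    k = len(w)
    span = (k - 1) * rate
    out = np.zeros(len(x) - span)
    for i in range(len(out)):
        out[i] = sum(w[j] * x[i + j * rate] for j in range(k))
    return out

def multi_rate_context(x, w, rates=(1, 2, 4)):
    """Weight-shared atrous convs: the SAME kernel w at several dilation rates,
    cropped to a common length and averaged into one context feature."""
    outs = [atrous_conv1d(x, w, r) for r in rates]
    n = min(len(o) for o in outs)
    return np.mean([o[:n] for o in outs], axis=0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gate_fuse(global_feat, detail_feat, wf, wu):
    """Gate-style fusion: a 'fusion' gate filters the detail feature and an
    'update' gate decides how much of it to blend into the global feature."""
    f = sigmoid(wf * detail_feat)   # fusion gate: which details to keep
    u = sigmoid(wu * global_feat)   # update gate: how much to update
    return u * global_feat + (1.0 - u) * (f * detail_feat)
```

Sharing one kernel across dilation rates (as EASA is described as doing) keeps the parameter count constant while widening the effective receptive field, which is the lightweight trade-off the abstract emphasizes.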
first_indexed | 2024-03-08T04:45:39Z |
format | Article |
id | doaj.art-43dc2eb0e67240cca983f6a978572ab0 |
institution | Directory Open Access Journal |
issn | 1751-9632 1751-9640 |
language | English |
last_indexed | 2024-03-08T04:45:39Z |
publishDate | 2024-02-01 |
publisher | Wiley |
record_format | Article |
series | IET Computer Vision |
spelling | doaj.art-43dc2eb0e67240cca983f6a978572ab0 2024-02-08T10:33:59Z eng Wiley IET Computer Vision 1751-9632 1751-9640 2024-02-01 Vol. 18, Iss. 1, pp. 72–83 10.1049/cvi2.12225 Lite‐weight semantic segmentation with AG self‐attention Bing Liu 0 Yansheng Gao 1 Hai Li 2 Zhaohao Zhong 3 Hongwei Zhao 4 College of Computer Science and Technology, Jilin University, Changchun, China; College of Computer Science and Engineering, Changchun University of Technology, Changchun, China https://doi.org/10.1049/cvi2.12225 computational complexity; convolutional neural nets |
spellingShingle | Bing Liu Yansheng Gao Hai Li Zhaohao Zhong Hongwei Zhao Lite‐weight semantic segmentation with AG self‐attention IET Computer Vision computational complexity convolutional neural nets |
title | Lite‐weight semantic segmentation with AG self‐attention |
title_full | Lite‐weight semantic segmentation with AG self‐attention |
title_fullStr | Lite‐weight semantic segmentation with AG self‐attention |
title_full_unstemmed | Lite‐weight semantic segmentation with AG self‐attention |
title_short | Lite‐weight semantic segmentation with AG self‐attention |
title_sort | lite weight semantic segmentation with ag self attention |
topic | computational complexity convolutional neural nets |
url | https://doi.org/10.1049/cvi2.12225 |
work_keys_str_mv | AT bingliu liteweightsemanticsegmentationwithagselfattention AT yanshenggao liteweightsemanticsegmentationwithagselfattention AT haili liteweightsemanticsegmentationwithagselfattention AT zhaohaozhong liteweightsemanticsegmentationwithagselfattention AT hongweizhao liteweightsemanticsegmentationwithagselfattention |