An encoder‐decoder framework with dynamic convolution for weakly supervised instance segmentation
Abstract In the systems of industrial robotics and autonomous vehicles, instance segmentation is widely employed. However, manually labelling an object outline is time‐consuming. In order to reduce annotation costs, we present a weakly supervised instance segmentation method in this article. A deepl...
Main Authors: | Liangjun Zhu, Li Peng, Shuchen Ding, Zhongren Liu |
---|---|
Format: | Article |
Language: | English |
Published: | Wiley 2023-12-01 |
Series: | IET Computer Vision |
Subjects: | image segmentation, object detection |
Online Access: | https://doi.org/10.1049/cvi2.12202 |
_version_ | 1797387987177177088 |
---|---|
author | Liangjun Zhu Li Peng Shuchen Ding Zhongren Liu |
author_facet | Liangjun Zhu Li Peng Shuchen Ding Zhongren Liu |
author_sort | Liangjun Zhu |
collection | DOAJ |
description | Abstract In the systems of industrial robotics and autonomous vehicles, instance segmentation is widely employed. However, manually labelling an object outline is time‐consuming. In order to reduce annotation costs, we present a weakly supervised instance segmentation method in this article. A deeply convolutional network is first used to construct multi‐scale feature maps for each object in the input image. After that, the encoder‐decoder framework with dynamic convolution is utilised to enhance model capacity and efficiency, while avoiding the issues of anchor design, proposal selection, and RoIAlign implementation. In particular, Dynamic Heads are used in the encoder to create dynamic convolution kernels, while Instance Heads are used in the decoder to provide the global feature map. With dynamic convolution, each instance can be segmented independently, reducing interference with other instances and improving segmentation accuracy. Under the supervision of projection loss and pixel point colour pairing loss, the contours of each object are finally outlined. On the PASCAL VOC and MS COCO datasets, the proposed method is competitive with more sophisticated approaches. In the VOC dataset, segmentation performance achieved 37.6% average precision with ResNet‐101 and FPN networks. The extensively visualised results demonstrate the effectiveness of the proposed encoder‐decoder framework with dynamic convolution. |
first_indexed | 2024-03-08T22:33:12Z |
format | Article |
id | doaj.art-39e987ce06ba413b898b8575375656b1 |
institution | Directory Open Access Journal |
issn | 1751-9632 1751-9640 |
language | English |
last_indexed | 2024-03-08T22:33:12Z |
publishDate | 2023-12-01 |
publisher | Wiley |
record_format | Article |
series | IET Computer Vision |
spelling | doaj.art-39e987ce06ba413b898b8575375656b1 2023-12-17T15:35:00Z eng Wiley IET Computer Vision 1751-9632 1751-9640 2023-12-01 17 8 883 894 10.1049/cvi2.12202 An encoder‐decoder framework with dynamic convolution for weakly supervised instance segmentation Liangjun Zhu 0 Li Peng 1 Shuchen Ding 2 Zhongren Liu 3 Engineering Research Center of Internet of Things Applied Technology Jiangnan University Wuxi China Engineering Research Center of Internet of Things Applied Technology Jiangnan University Wuxi China School of Electronics and Information Engineering Suzhou University of Science and Technology Suzhou China Engineering Research Center of Internet of Things Applied Technology Jiangnan University Wuxi China Abstract In the systems of industrial robotics and autonomous vehicles, instance segmentation is widely employed. However, manually labelling an object outline is time‐consuming. In order to reduce annotation costs, we present a weakly supervised instance segmentation method in this article. A deeply convolutional network is first used to construct multi‐scale feature maps for each object in the input image. After that, the encoder‐decoder framework with dynamic convolution is utilised to enhance model capacity and efficiency, while avoiding the issues of anchor design, proposal selection, and RoIAlign implementation. In particular, Dynamic Heads are used in the encoder to create dynamic convolution kernels, while Instance Heads are used in the decoder to provide the global feature map. With dynamic convolution, each instance can be segmented independently, reducing interference with other instances and improving segmentation accuracy. Under the supervision of projection loss and pixel point colour pairing loss, the contours of each object are finally outlined. On the PASCAL VOC and MS COCO datasets, the proposed method is competitive with more sophisticated approaches. In the VOC dataset, segmentation performance achieved 37.6% average precision with ResNet‐101 and FPN networks. The extensively visualised results demonstrate the effectiveness of the proposed encoder‐decoder framework with dynamic convolution. https://doi.org/10.1049/cvi2.12202 image segmentation object detection |
spellingShingle | Liangjun Zhu Li Peng Shuchen Ding Zhongren Liu An encoder‐decoder framework with dynamic convolution for weakly supervised instance segmentation IET Computer Vision image segmentation object detection |
title | An encoder‐decoder framework with dynamic convolution for weakly supervised instance segmentation |
title_full | An encoder‐decoder framework with dynamic convolution for weakly supervised instance segmentation |
title_fullStr | An encoder‐decoder framework with dynamic convolution for weakly supervised instance segmentation |
title_full_unstemmed | An encoder‐decoder framework with dynamic convolution for weakly supervised instance segmentation |
title_short | An encoder‐decoder framework with dynamic convolution for weakly supervised instance segmentation |
title_sort | encoder decoder framework with dynamic convolution for weakly supervised instance segmentation |
topic | image segmentation object detection |
url | https://doi.org/10.1049/cvi2.12202 |
work_keys_str_mv | AT liangjunzhu anencoderdecoderframeworkwithdynamicconvolutionforweaklysupervisedinstancesegmentation AT lipeng anencoderdecoderframeworkwithdynamicconvolutionforweaklysupervisedinstancesegmentation AT shuchending anencoderdecoderframeworkwithdynamicconvolutionforweaklysupervisedinstancesegmentation AT zhongrenliu anencoderdecoderframeworkwithdynamicconvolutionforweaklysupervisedinstancesegmentation AT liangjunzhu encoderdecoderframeworkwithdynamicconvolutionforweaklysupervisedinstancesegmentation AT lipeng encoderdecoderframeworkwithdynamicconvolutionforweaklysupervisedinstancesegmentation AT shuchending encoderdecoderframeworkwithdynamicconvolutionforweaklysupervisedinstancesegmentation AT zhongrenliu encoderdecoderframeworkwithdynamicconvolutionforweaklysupervisedinstancesegmentation |
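
The description field mentions two ideas that a small example may make more concrete: a per-instance dynamic convolution kernel applied to a shared feature map, and a projection loss that supervises the resulting mask. The sketch below is illustrative only and is not taken from the article. It assumes bounding-box annotations, a 1x1 dynamic kernel, and a dice-style comparison of axis projections, as is common in box-supervised segmentation. The function names `dynamic_conv_mask`, `dice_loss`, and `projection_loss` are hypothetical, and the colour pairing loss is not shown.

```python
import math


def dynamic_conv_mask(features, kernel, bias=0.0):
    """Apply one instance-specific 1x1 kernel to a shared C x H x W feature map.

    Assumption: the kernel is a plain list of C weights; the record does not
    state the actual kernel shape or number of layers.
    """
    channels = len(features)
    height, width = len(features[0]), len(features[0][0])
    mask = []
    for y in range(height):
        row = []
        for x in range(width):
            logit = bias + sum(kernel[c] * features[c][y][x] for c in range(channels))
            row.append(1.0 / (1.0 + math.exp(-logit)))  # sigmoid -> probability
        mask.append(row)
    return mask


def dice_loss(pred, target, eps=1e-6):
    """Dice loss between two equal-length 1-D sequences with values in [0, 1]."""
    inter = sum(p * t for p, t in zip(pred, target))
    denom = sum(p * p for p in pred) + sum(t * t for t in target)
    return 1.0 - (2.0 * inter + eps) / (denom + eps)


def projection_loss(mask, box):
    """Compare the mask's max-projections on each axis with those of a box.

    Assumption: box = (x0, y0, x1, y1) in inclusive pixel coordinates.
    """
    height, width = len(mask), len(mask[0])
    x0, y0, x1, y1 = box
    proj_x = [max(mask[y][x] for y in range(height)) for x in range(width)]
    proj_y = [max(row) for row in mask]
    box_x = [1.0 if x0 <= x <= x1 else 0.0 for x in range(width)]
    box_y = [1.0 if y0 <= y <= y1 else 0.0 for y in range(height)]
    return dice_loss(proj_x, box_x) + dice_loss(proj_y, box_y)


if __name__ == "__main__":
    # Two channels on a 2x2 grid; each kernel picks out a different pixel.
    feats = [[[1.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 1.0]]]
    mask_a = dynamic_conv_mask(feats, [10.0, -10.0], bias=-5.0)
    print(projection_loss(mask_a, (0, 0, 0, 0)))
```

Because each instance gets its own kernel while the feature map is shared, the same `features` input yields a separate mask per kernel, which is the sense in which instances can be segmented independently.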