Real-Time Weakly Supervised Object Detection Using Center-of-Features Localization
We propose a high-speed convolutional neural network approach for weakly supervised localization (WSL) and weakly supervised object detection (WSOD). The proposed method, called center-of-features localization (COFL), performs localization of objects in a visual scene by combining both multi-label c...
Main Authors: | Hatem Ibrahem, Ahmed Diefy Ahmed Salem, Hyun-Soo Kang |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE 2021-01-01 |
Series: | IEEE Access |
Subjects: | Object detection, object localization, weakly supervised learning, convolutional neural networks |
Online Access: | https://ieeexplore.ieee.org/document/9371695/ |
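The abstract in the record below describes training with a classification-plus-regression loss over image labels and per-class object counts. The following is a minimal sketch of what such a combined loss could look like, not the paper's actual formulation: the choice of binary cross-entropy plus mean squared error, the `weight` parameter, and all function names are assumptions made for illustration.

```python
import numpy as np


def bce(probs, labels, eps=1e-7):
    """Mean binary cross-entropy over classes (multi-label classification)."""
    p = np.clip(probs, eps, 1.0 - eps)  # avoid log(0)
    return float(-np.mean(labels * np.log(p) + (1.0 - labels) * np.log(1.0 - p)))


def count_mse(pred_counts, true_counts):
    """Mean squared error between predicted and annotated per-class counts."""
    return float(np.mean((pred_counts - true_counts) ** 2))


def combined_loss(probs, labels, pred_counts, true_counts, weight=1.0):
    """Assumed form: classification term plus a weighted count-regression term."""
    return bce(probs, labels) + weight * count_mse(pred_counts, true_counts)


# Toy image with three classes: two objects of class 0, none of class 1, one of class 2.
labels = np.array([1.0, 0.0, 1.0])
true_counts = np.array([2.0, 0.0, 1.0])
probs = np.array([0.9, 0.2, 0.7])        # hypothetical classifier outputs
pred_counts = np.array([1.5, 0.1, 1.0])  # hypothetical count-regression outputs

loss = combined_loss(probs, labels, pred_counts, true_counts)
```

Both terms need only image-level labels and counts, which is the kind of weak supervision the abstract refers to; no bounding box appears anywhere in the loss.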
_version_ | 1819175694849015808 |
---|---|
author | Hatem Ibrahem; Ahmed Diefy Ahmed Salem; Hyun-Soo Kang |
author_facet | Hatem Ibrahem; Ahmed Diefy Ahmed Salem; Hyun-Soo Kang |
author_sort | Hatem Ibrahem |
collection | DOAJ |
description | We propose a high-speed convolutional neural network approach for weakly supervised localization (WSL) and weakly supervised object detection (WSOD). The proposed method, called center-of-features localization (COFL), localizes objects in a visual scene by combining multi-label classification with regression of the number of instances of each object class. A modified Xception network architecture serves as the main feature extractor, and a classification-plus-regression loss function is used to perform the detection task. The method requires no bounding box annotations, only image labels and per-class object counts for each image. This combination can produce a clear localization of objects in the scene through a masking technique between class activation maps (CAMs) and regression activation maps (RAMs). The proposed method was trained and tested on the PASCAL VOC2007 and VOC2012 datasets; it attained a mean average precision (mAP) of 47.0% and a correct localization (CorLoc) score of 64.1% on PASCAL VOC2007, and an mAP of 42.3% and a CorLoc of 65.5% on PASCAL VOC2012, while performing object detection at ~50 fps. These results demonstrate that the network can perform object detection accurately and in real time using only image labels and object counts, which are inexpensive to annotate compared with the bounding box annotations typically employed in fully supervised object detection methods. The network is far faster than other weakly supervised methods and some fully supervised methods while achieving fair accuracy. |
first_indexed | 2024-12-22T20:58:57Z |
format | Article |
id | doaj.art-81c3e584fb21432697ba9dc72a7ea9f5 |
institution | Directory Open Access Journal |
issn | 2169-3536 |
language | English |
last_indexed | 2024-12-22T20:58:57Z |
publishDate | 2021-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Access |
spelling | doaj.art-81c3e584fb21432697ba9dc72a7ea9f5; 2022-12-21T18:12:53Z; eng; IEEE; IEEE Access; 2169-3536; 2021-01-01; 9; 38742; 38756; 10.1109/ACCESS.2021.3064372; 9371695; Real-Time Weakly Supervised Object Detection Using Center-of-Features Localization; Hatem Ibrahem (https://orcid.org/0000-0001-8722-3300), Ahmed Diefy Ahmed Salem (https://orcid.org/0000-0002-4682-0368), Hyun-Soo Kang (https://orcid.org/0000-0002-4333-2852); School of Information and Communication Engineering, College of Electrical and Computer Engineering, Chungbuk National University, Cheongju, South Korea (all three authors); https://ieeexplore.ieee.org/document/9371695/; Object detection, object localization, weakly supervised learning, convolutional neural networks |
spellingShingle | Hatem Ibrahem; Ahmed Diefy Ahmed Salem; Hyun-Soo Kang; Real-Time Weakly Supervised Object Detection Using Center-of-Features Localization; IEEE Access; Object detection; object localization; weakly supervised learning; convolutional neural networks |
title | Real-Time Weakly Supervised Object Detection Using Center-of-Features Localization |
title_sort | real time weakly supervised object detection using center of features localization |
topic | Object detection; object localization; weakly supervised learning; convolutional neural networks |
url | https://ieeexplore.ieee.org/document/9371695/ |
work_keys_str_mv | AT hatemibrahem realtimeweaklysupervisedobjectdetectionusingcenteroffeatureslocalization AT ahmeddiefyahmedsalem realtimeweaklysupervisedobjectdetectionusingcenteroffeatureslocalization AT hyunsookang realtimeweaklysupervisedobjectdetectionusingcenteroffeatureslocalization |
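The description above says that localization comes from a masking technique between class activation maps (CAMs) and regression activation maps (RAMs), but the record gives no further detail. The sketch below shows one plausible reading, purely as an illustration: threshold a CAM into a binary mask, keep the RAM response inside it, and take the response-weighted centroid as a "center of features". The threshold value, the centroid step, and the function name `center_of_features` are assumptions, not the published procedure.

```python
import numpy as np


def center_of_features(cam, ram, threshold=0.5):
    """Return the (row, col) centroid of the RAM response inside the
    thresholded CAM region, or None if nothing survives the mask.

    Assumed procedure for illustration only.
    """
    peak = cam.max()
    cam_norm = cam / peak if peak > 0 else cam      # scale CAM to [0, 1]
    mask = cam_norm >= threshold                    # binary mask from the CAM
    masked = np.where(mask, np.maximum(ram, 0.0), 0.0)  # RAM kept inside the mask
    total = masked.sum()
    if total <= 0:
        return None
    rows, cols = np.indices(masked.shape)
    return (float((rows * masked).sum() / total),
            float((cols * masked).sum() / total))


# Toy 5x5 maps: the CAM fires on a 3x3 block and the RAM is uniform.
cam = np.zeros((5, 5))
cam[1:4, 1:4] = 1.0
ram = np.ones((5, 5))

center = center_of_features(cam, ram)
```

With a uniform RAM the centroid is simply the geometric center of the masked block; a non-uniform RAM would pull the estimate toward the strongest regression response.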