Real-Time Weakly Supervised Object Detection Using Center-of-Features Localization

We propose a high-speed convolutional neural network approach for weakly supervised localization (WSL) and weakly supervised object detection (WSOD). The proposed method, called center-of-features localization (COFL), performs localization of objects in a visual scene by combining both multi-label c...


Bibliographic Details
Main Authors: Hatem Ibrahem, Ahmed Diefy Ahmed Salem, Hyun-Soo Kang
Format: Article
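Language: English
Published: IEEE 2021-01-01
Series: IEEE Access
Subjects: Object detection; object localization; weakly supervised learning; convolutional neural networks
Online Access: https://ieeexplore.ieee.org/document/9371695/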
collection DOAJ
description We propose a high-speed convolutional neural network approach for weakly supervised localization (WSL) and weakly supervised object detection (WSOD). The proposed method, called center-of-features localization (COFL), localizes objects in a visual scene by combining multi-label classification with regression on the number of instances of each object class. A modified Xception network architecture is used as the main feature extractor, and a classification-plus-regression loss function is used to perform the detection task. The method requires no bounding box annotations, only image labels and per-class object counts for each image. This combination can produce a clear localization of objects in the scene through a masking technique between class activation maps (CAMs) and regression activation maps (RAMs). The proposed method was trained and tested on the PASCAL VOC2007 and VOC2012 datasets; it attained a mean average precision (mAP) of 47.0% and a correct localization (CorLoc) of 64.1% on PASCAL VOC2007, and a mAP of 42.3% and a CorLoc of 65.5% on PASCAL VOC2012, while performing object detection at a speed of ~50 fps. These results demonstrate that the network can perform object detection accurately in real time using only image labels and object counts, which are inexpensive to annotate compared with the bounding box annotations typically employed in fully supervised object detection methods. The network far outperforms other weakly supervised methods and some fully supervised methods in terms of processing time while achieving fair accuracy.
first_indexed 2024-12-22T20:58:57Z
format Article
id doaj.art-81c3e584fb21432697ba9dc72a7ea9f5
institution Directory Open Access Journal
issn 2169-3536
language English
last_indexed 2024-12-22T20:58:57Z
publishDate 2021-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj.art-81c3e584fb21432697ba9dc72a7ea9f5 (2022-12-21T18:12:53Z)
Language: eng
Publisher: IEEE
Journal: IEEE Access (ISSN 2169-3536)
Date: 2021-01-01
Volume: 9, pages 38742-38756
DOI: 10.1109/ACCESS.2021.3064372
Article number: 9371695
Title: Real-Time Weakly Supervised Object Detection Using Center-of-Features Localization
Authors: Hatem Ibrahem (https://orcid.org/0000-0001-8722-3300), Ahmed Diefy Ahmed Salem (https://orcid.org/0000-0002-4682-0368), Hyun-Soo Kang (https://orcid.org/0000-0002-4333-2852)
Affiliation (all authors): School of Information and Communication Engineering, College of Electrical and Computer Engineering, Chungbuk National University, Cheongju, South Korea
URL: https://ieeexplore.ieee.org/document/9371695/
Keywords: Object detection; object localization; weakly supervised learning; convolutional neural networks
topic Object detection
object localization
weakly supervised learning
convolutional neural networks
url https://ieeexplore.ieee.org/document/9371695/
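
The abstract in this record mentions a classification-plus-regression loss and a masking step between class activation maps (CAMs) and regression activation maps (RAMs), but it does not give their exact form. The Python sketch below is only an illustration of how such a combination might look. Everything in it is an assumption that does not come from the record: the function names, the use of binary cross-entropy plus mean squared count error, the `reg_weight` and `threshold` parameters, and the reading of "center of features" as an activation-weighted centroid of the masked map.

```python
import math


def classification_plus_regression_loss(class_probs, class_labels,
                                        pred_counts, true_counts,
                                        reg_weight=1.0):
    """Hypothetical combined loss (assumed form, not taken from the paper).

    Mean binary cross-entropy over the per-class presence probabilities,
    plus reg_weight times the mean squared error of the per-class counts.
    """
    eps = 1e-7  # clamp probabilities to avoid log(0)
    bce = 0.0
    for p, y in zip(class_probs, class_labels):
        p = min(max(p, eps), 1.0 - eps)
        bce -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    bce /= len(class_probs)

    mse = sum((c_hat - c) ** 2
              for c_hat, c in zip(pred_counts, true_counts)) / len(pred_counts)
    return bce + reg_weight * mse


def masked_center(cam, ram, threshold=0.5):
    """Hypothetical masking step (assumed, not taken from the paper).

    Keeps CAM activations only where the RAM value is at least `threshold`,
    then returns the activation-weighted centroid as (row, col).
    Returns None when no activation survives the mask.
    """
    total = row_sum = col_sum = 0.0
    for r, (cam_row, ram_row) in enumerate(zip(cam, ram)):
        for c, (activation, mask_value) in enumerate(zip(cam_row, ram_row)):
            if mask_value >= threshold:
                total += activation
                row_sum += r * activation
                col_sum += c * activation
    if total == 0.0:
        return None
    return (row_sum / total, col_sum / total)


if __name__ == "__main__":
    # Two classes: the first is present twice, the second is absent.
    loss = classification_plus_regression_loss(
        class_probs=[0.9, 0.1], class_labels=[1, 0],
        pred_counts=[1.5, 0.0], true_counts=[2, 0])
    print("loss:", loss)

    cam = [[0.0, 0.0, 0.0], [0.0, 1.0, 1.0], [0.0, 0.0, 0.0]]
    ram = [[0.0, 0.0, 0.0], [0.0, 1.0, 1.0], [0.0, 0.0, 0.0]]
    print("center:", masked_center(cam, ram))
```

The sketch uses plain Python lists so that it runs without dependencies. A real implementation would operate on network output tensors and would follow the definitions given in the full article rather than the assumptions made here.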