A dual‐balanced network for long‐tail distribution object detection

Bibliographic Details
Main Authors: Huiyun Gong, Yeguang Li, Jian Dong
Format: Article
Language: English
Published: Wiley 2023-08-01
Series: IET Computer Vision
Subjects:
Online Access: https://doi.org/10.1049/cvi2.12182
Description
Summary: Object detection on datasets with imbalanced (i.e. long‐tail) distributions is a challenging task. Existing re‐balancing solutions, such as re‐weighting and re‐sampling, have two main disadvantages. First, re‐balancing strategies only utilise a coarse‐grained global threshold to suppress some of the most influential categories, while overlooking locally influential categories. Second, very few studies have designed algorithms specifically for object detection under long‐tail distributions. To address these two issues, a dual‐balanced network for fine‐grained re‐balancing in object detection is proposed. The re‐balancing strategies are applied to both proposal and classification logic, corresponding to two sub‐networks: the Balance Region Proposal Network (BRPN) and the Balance Classification Network (BCN). The BRPN sub‐network equalises the number of background and foreground proposals by reducing the sampling probability of simple backgrounds. The BCN sub‐network equalises the logic between head and tail categories by globally suppressing negative gradients and locally fixing the over‐suppressed negative gradients. In addition, the authors propose a balance binary cross entropy loss to jointly re‐balance the entire network. This design can be generalised to different two‐stage object detection frameworks. In experiments, the method achieves an mAP of 26.40% on the LVIS‐v0.5 dataset, outperforming most state‐of‐the‐art (SOTA) methods.
ISSN: 1751-9632
1751-9640
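
The summary mentions suppressing negative gradients for tail categories through a global threshold. The sketch below illustrates that general idea only and is not the authors' BCN or their balance binary cross entropy loss, whose details the record does not give. The function name `suppressed_bce`, the `class_freq` values, and the `freq_threshold` parameter are all assumptions made for illustration. The local correction of over-suppressed gradients is not shown, since the summary does not specify it.

```python
import math


def sigmoid(x):
    """Logistic function mapping a score to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def suppressed_bce(logits, target, class_freq, freq_threshold):
    """Per-class binary cross entropy for one foreground proposal.

    Hypothetical illustration, not the published loss.
    logits: per-class scores for the proposal.
    target: index of the ground-truth class.
    class_freq: assumed fraction of training images containing each class.
    freq_threshold: assumed global cut-off; rarer classes count as tail.
    """
    loss = 0.0
    for c, z in enumerate(logits):
        p = sigmoid(z)
        if c == target:
            # The positive term is always kept.
            loss += -math.log(p)
        elif class_freq[c] >= freq_threshold:
            # The negative term is kept for head (frequent) classes.
            loss += -math.log(1.0 - p)
        # Otherwise the negative term for a tail class is dropped, so this
        # proposal sends no discouraging gradient to that tail class.
    return loss


# Three classes: two frequent, one rare. The proposal belongs to class 0.
freqs = [0.5, 0.2, 0.001]
plain = suppressed_bce([0.0, 0.0, 0.0], 0, freqs, freq_threshold=0.0)
balanced = suppressed_bce([0.0, 0.0, 0.0], 0, freqs, freq_threshold=0.01)
```

With a threshold of 0.0 every negative term is kept, which is ordinary binary cross entropy. Raising the threshold to 0.01 drops the negative term of the rare third class, so `balanced` is smaller than `plain` by exactly that one term.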