On-Board Small-Scale Object Detection for Unmanned Aerial Vehicles (UAVs)

Object detection is a critical task that becomes difficult when dealing with onboard detection using aerial images and computer vision technique. The main challenges with aerial images are small target sizes, low resolution, occlusion, attitude, and scale variations, which affect the performance of...

Full description

Bibliographic Details
Main Authors: Zubair Saeed, Muhammad Haroon Yousaf, Rehan Ahmed, Sergio A. Velastin, Serestina Viriri
Format: Article
Language:English
Published: MDPI AG 2023-05-01
Series:Drones
Subjects:
Online Access:https://www.mdpi.com/2504-446X/7/5/310
Description
Summary:Object detection is a critical task that becomes difficult when dealing with onboard detection using aerial images and computer vision technique. The main challenges with aerial images are small target sizes, low resolution, occlusion, attitude, and scale variations, which affect the performance of many object detectors. The accuracy of the detection and the efficiency of the inference are always trade-offs. We modified the architecture of CenterNet and used different CNN-based backbones of ResNet18, ResNet34, ResNet50, ResNet101, ResNet152, Res2Net50, Res2Net101, DLA-34, and hourglass14. A comparison of the modified CenterNet with nine CNN-based backbones is conducted and validated using three challenging datasets, i.e., VisDrone, Stanford Drone dataset (SSD), and AU-AIR. We also implemented well-known off-the-shelf object detectors, i.e., YoloV1 to YoloV7, SSD-MobileNet-V2, and Faster RCNN. The proposed approach and state-of-the-art object detectors are optimized and then implemented on cross-edge platforms, i.e., NVIDIA Jetson Xavier, NVIDIA Jetson Nano, and Neuro Compute Stick 2 (NCS2). A detailed comparison of performance between edge platforms is provided. Our modified CenterNet combination with hourglass as a backbone achieved 91.62%, 75.61%, and 34.82% mAP using the validation sets of AU-AIR, SSD, and VisDrone datasets, respectively. An FPS of 40.02 was achieved using the ResNet18 backbone. We also compared our approach with the latest cutting-edge research and found promising results for both discrete GPU and edge platforms.
ISSN:2504-446X