AMFF-Net: An Effective 3D Object Detector Based on Attention and Multi-Scale Feature Fusion

With the advent of autonomous vehicle applications, the importance of LiDAR point cloud 3D object detection cannot be overstated. Recent studies have demonstrated that methods for aggregating features from voxels can accurately and efficiently detect objects in large, complex 3D detection scenes. Ne...


Bibliographic Details
Main Authors: Guangping Li, Zuanfang Mo, Bingo Wing-Kuen Ling
Format: Article
Language: English
Published: MDPI AG 2023-11-01
Series: Sensors
Subjects: 3D object detection; LiDAR; point cloud; multi-scale feature fusion; attention mechanism
Online Access: https://www.mdpi.com/1424-8220/23/23/9319
_version_ 1827592048783392768
author Guangping Li
Zuanfang Mo
Bingo Wing-Kuen Ling
collection DOAJ
description With the advent of autonomous vehicle applications, the importance of LiDAR point cloud 3D object detection cannot be overstated. Recent studies have demonstrated that methods for aggregating features from voxels can accurately and efficiently detect objects in large, complex 3D detection scenes. Nevertheless, most of these methods do not filter background points well and have inferior detection performance for small objects. To ameliorate this issue, this paper proposes an Attention-based and Multi-scale Feature Fusion Network (AMFF-Net), which utilizes a Dual-Attention Voxel Feature Extractor (DA-VFE) and a Multi-scale Feature Fusion (MFF) Module to improve the precision and efficiency of 3D object detection. The DA-VFE considers pointwise and channelwise attention and integrates them into the Voxel Feature Extractor (VFE) to enhance key point cloud information in voxels and refine more-representative voxel features. The MFF Module consists of self-calibrated convolutions, a residual structure, and a coordinate attention mechanism, and acts as a 2D Backbone that expands the receptive field and captures more contextual information, thus better capturing small object locations, enhancing the feature-extraction capability of the network, and reducing the computational overhead. We performed evaluations of the proposed model on the nuScenes dataset with a large number of driving scenarios. The experimental results showed that the AMFF-Net achieved an mAP of 62.8%, which significantly boosted the performance of small object detection compared to the baseline network and significantly reduced the computational overhead, while the inference speed remained essentially the same. AMFF-Net also achieved advanced performance on the KITTI dataset.
first_indexed 2024-03-09T01:43:13Z
format Article
id doaj.art-97e60d445acf49d5adf6080ad62f8bfa
institution Directory Open Access Journal
issn 1424-8220
language English
last_indexed 2024-03-09T01:43:13Z
publishDate 2023-11-01
publisher MDPI AG
record_format Article
series Sensors
spelling doaj.art-97e60d445acf49d5adf6080ad62f8bfa
2023-12-08T15:25:24Z
eng
MDPI AG
Sensors
1424-8220
2023-11-01
volume 23, issue 23, article 9319
10.3390/s23239319
AMFF-Net: An Effective 3D Object Detector Based on Attention and Multi-Scale Feature Fusion
Guangping Li (School of Information Engineering, Guangdong University of Technology, Guangzhou 510006, China)
Zuanfang Mo (School of Information Engineering, Guangdong University of Technology, Guangzhou 510006, China)
Bingo Wing-Kuen Ling (School of Information Engineering, Guangdong University of Technology, Guangzhou 510006, China)
title AMFF-Net: An Effective 3D Object Detector Based on Attention and Multi-Scale Feature Fusion
topic 3D object detection
LiDAR
point cloud
multi-scale feature fusion
attention mechanism
url https://www.mdpi.com/1424-8220/23/23/9319