Muti-Frame Point Cloud Feature Fusion Based on Attention Mechanisms for 3D Object Detection
Object detection from continuous point-cloud frames is a new research direction. Currently, most studies fuse multi-frame point clouds with concatenation-based methods, which align frames using GPS, IMU, and similar information. However, this fusion method can only align...
Main Authors: | Zhenyu Zhai, Qiantong Wang, Zongxu Pan, Zhentong Gao, Wenlong Hu |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2022-10-01 |
Series: | Sensors |
Subjects: | autonomous driving; 3D object detection; point cloud sequences; attention mechanism; feature fusion |
Online Access: | https://www.mdpi.com/1424-8220/22/19/7473 |
_version_ | 1827653007761735680 |
---|---|
author | Zhenyu Zhai Qiantong Wang Zongxu Pan Zhentong Gao Wenlong Hu |
author_facet | Zhenyu Zhai Qiantong Wang Zongxu Pan Zhentong Gao Wenlong Hu |
author_sort | Zhenyu Zhai |
collection | DOAJ |
description | Object detection from continuous point-cloud frames is a new research direction. Currently, most studies fuse multi-frame point clouds with concatenation-based methods, which align frames using GPS, IMU, and similar information. However, this fusion method can only align static objects, not moving ones. In this paper, we propose a non-local-based multi-scale feature fusion method that handles both moving and static objects without GPS- or IMU-based registration. Because non-local methods are resource-consuming, we propose a novel simplified non-local block that exploits the sparsity of the point cloud: by filtering out empty units, memory consumption is decreased by 99.93%. In addition, triple attention is adopted to enhance key object information and suppress background noise, further benefiting the non-local feature fusion. Finally, we verify the method on PointPillars and CenterPoint. Experimental results show that the proposed method improves mAP by 3.9% and 4.1% over the concatenation-based fusion baselines PointPillars-2 and CenterPoint-2, respectively. In addition, the proposed network outperforms the powerful 3D-VID by 1.2% in mAP. |
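The abstract's central idea — that a non-local (self-attention) block over a pillar feature map becomes cheap once empty cells are filtered out — can be illustrated with a minimal NumPy sketch. This is not the authors' implementation; the function name and the all-zero-means-empty convention are assumptions made here for illustration. The point is that the attention affinity matrix is built only over the M occupied cells (M x M) instead of all N = H x W cells (N x N), which is where the memory saving on sparse LiDAR scans comes from.

```python
import numpy as np

def sparse_non_local(feat, eps=1e-6):
    """Simplified non-local block over a pillar pseudo-image (C, H, W).

    Cells whose feature vector is (near-)zero are treated as empty and
    skipped, so the affinity matrix is M x M over occupied cells rather
    than the dense (H*W) x (H*W) of a standard non-local block.
    """
    C, H, W = feat.shape
    flat = feat.reshape(C, H * W).T            # (N, C) with N = H*W
    occupied = np.abs(flat).sum(axis=1) > eps  # mask of non-empty cells
    x = flat[occupied]                         # (M, C), M << N on sparse scans

    # Plain scaled dot-product attention restricted to occupied cells.
    attn = x @ x.T / np.sqrt(C)                            # (M, M), not (N, N)
    attn = np.exp(attn - attn.max(axis=1, keepdims=True))  # stable softmax
    attn /= attn.sum(axis=1, keepdims=True)

    out = flat.copy()
    out[occupied] = x + attn @ x               # residual connection
    return out.T.reshape(C, H, W)
```

On a map where only a handful of pillars are occupied, the affinity matrix shrinks from N^2 to M^2 entries; empty cells pass through unchanged, which is consistent with the large memory reduction the abstract reports (though the exact 99.93% figure depends on the dataset's occupancy ratio and the authors' block design).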
first_indexed | 2024-03-09T21:10:04Z |
format | Article |
id | doaj.art-5b55a1f60c7a4ec9aacd60adcf87a8ca |
institution | Directory Open Access Journal |
issn | 1424-8220 |
language | English |
last_indexed | 2024-03-09T21:10:04Z |
publishDate | 2022-10-01 |
publisher | MDPI AG |
record_format | Article |
series | Sensors |
spelling | doaj.art-5b55a1f60c7a4ec9aacd60adcf87a8ca2023-11-23T21:49:45ZengMDPI AGSensors1424-82202022-10-012219747310.3390/s22197473Muti-Frame Point Cloud Feature Fusion Based on Attention Mechanisms for 3D Object DetectionZhenyu Zhai0Qiantong Wang1Zongxu Pan2Zhentong Gao3Wenlong Hu4Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100190, ChinaAerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100190, ChinaAerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100190, ChinaAerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100190, ChinaAerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100190, ChinaObject detection from continuous point-cloud frames is a new research direction. Currently, most studies fuse multi-frame point clouds with concatenation-based methods, which align frames using GPS, IMU, and similar information. However, this fusion method can only align static objects, not moving ones. In this paper, we propose a non-local-based multi-scale feature fusion method that handles both moving and static objects without GPS- or IMU-based registration. Because non-local methods are resource-consuming, we propose a novel simplified non-local block that exploits the sparsity of the point cloud: by filtering out empty units, memory consumption is decreased by 99.93%. In addition, triple attention is adopted to enhance key object information and suppress background noise, further benefiting the non-local feature fusion. Finally, we verify the method on PointPillars and CenterPoint. Experimental results show that the proposed method improves mAP by 3.9% and 4.1% over the concatenation-based fusion baselines PointPillars-2 and CenterPoint-2, respectively.
In addition, the proposed network outperforms the powerful 3D-VID by 1.2% in mAP.https://www.mdpi.com/1424-8220/22/19/7473autonomous driving3D object detectionpoint cloud sequencesattention mechanismfeature fusion |
spellingShingle | Zhenyu Zhai Qiantong Wang Zongxu Pan Zhentong Gao Wenlong Hu Muti-Frame Point Cloud Feature Fusion Based on Attention Mechanisms for 3D Object Detection Sensors autonomous driving 3D object detection point cloud sequences attention mechanism feature fusion |
title | Muti-Frame Point Cloud Feature Fusion Based on Attention Mechanisms for 3D Object Detection |
title_full | Muti-Frame Point Cloud Feature Fusion Based on Attention Mechanisms for 3D Object Detection |
title_fullStr | Muti-Frame Point Cloud Feature Fusion Based on Attention Mechanisms for 3D Object Detection |
title_full_unstemmed | Muti-Frame Point Cloud Feature Fusion Based on Attention Mechanisms for 3D Object Detection |
title_short | Muti-Frame Point Cloud Feature Fusion Based on Attention Mechanisms for 3D Object Detection |
title_sort | muti frame point cloud feature fusion based on attention mechanisms for 3d object detection |
topic | autonomous driving 3D object detection point cloud sequences attention mechanism feature fusion |
url | https://www.mdpi.com/1424-8220/22/19/7473 |
work_keys_str_mv | AT zhenyuzhai mutiframepointcloudfeaturefusionbasedonattentionmechanismsfor3dobjectdetection AT qiantongwang mutiframepointcloudfeaturefusionbasedonattentionmechanismsfor3dobjectdetection AT zongxupan mutiframepointcloudfeaturefusionbasedonattentionmechanismsfor3dobjectdetection AT zhentonggao mutiframepointcloudfeaturefusionbasedonattentionmechanismsfor3dobjectdetection AT wenlonghu mutiframepointcloudfeaturefusionbasedonattentionmechanismsfor3dobjectdetection |