Instance Segmentation Frustum–PointPillars: A Lightweight Fusion Algorithm for Camera–LiDAR Perception in Autonomous Driving


Bibliographic Details
Main Authors: Yongsheng Wang, Xiaobo Han, Xiaoxu Wei, Jie Luo
Format: Article
Language: English
Published: MDPI AG, 2024-01-01
Series: Mathematics
Subjects: 3D object detection; fusion perception; instance segmentation; point cloud filter
Online Access: https://www.mdpi.com/2227-7390/12/1/153
collection DOAJ
description The fusion of camera and LiDAR perception has become a research focal point in the autonomous driving field. Existing image–point cloud fusion algorithms are overly complex, and processing large amounts of 3D LiDAR point cloud data requires high computational power, which poses challenges for practical applications. To overcome the above problems, herein, we propose an Instance Segmentation Frustum (ISF)–PointPillars method. Within the framework of our method, input data are derived from both a camera and LiDAR. RGB images are processed using an enhanced 2D object detection network based on YOLOv8, thereby yielding rectangular bounding boxes and edge contours of the objects present within the scenes. Subsequently, the rectangular boxes are extended into 3D space as frustums, and the 3D points located outside them are removed. Afterward, the 2D edge contours are also extended to frustums to filter the remaining points from the preceding stage. Finally, the retained points are sent to our improved 3D object detection network based on PointPillars, and this network infers crucial information, such as object category, scale, and spatial position. In pursuit of a lightweight model, we incorporate attention modules into the 2D detector, thereby refining the focus on essential features, minimizing redundant computations, and enhancing model accuracy and efficiency. Moreover, the point filtering algorithm substantially diminishes the volume of point cloud data while concurrently reducing their dimensionality, thereby ultimately achieving lightweight 3D data. Through comparative experiments on the KITTI dataset, our method outperforms traditional approaches, achieving an average precision (AP) of 88.94% and bird’s-eye view (BEV) accuracy of 90.89% in car detection.
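The two-stage point filtering the abstract describes (a rectangular-box frustum pass followed by a tighter instance-contour pass) can be sketched as below. This is a minimal illustration under assumptions, not the paper's implementation: the 3×4 projection matrix `P`, the `(u_min, v_min, u_max, v_max)` box format, the polygon format, and all function names are hypothetical.

```python
import numpy as np

def project_to_image(points_lidar, P):
    """Project Nx3 LiDAR points into the image plane with a 3x4
    projection matrix P (camera intrinsics times LiDAR-to-camera extrinsics)."""
    pts_h = np.hstack([points_lidar, np.ones((points_lidar.shape[0], 1))])  # Nx4
    uvw = pts_h @ P.T                       # Nx3 homogeneous image coordinates
    depth = uvw[:, 2]
    uv = uvw[:, :2] / depth[:, None]        # perspective divide
    return uv, depth

def frustum_filter(points_lidar, P, box2d):
    """Stage 1: keep points whose image projection falls inside the 2D
    detection box (u_min, v_min, u_max, v_max) and lies in front of the camera."""
    uv, depth = project_to_image(points_lidar, P)
    u_min, v_min, u_max, v_max = box2d
    mask = ((depth > 0)
            & (uv[:, 0] >= u_min) & (uv[:, 0] <= u_max)
            & (uv[:, 1] >= v_min) & (uv[:, 1] <= v_max))
    return points_lidar[mask]

def contour_filter(points_lidar, P, polygon):
    """Stage 2: keep points whose projection lies inside the instance
    segmentation contour (a list of (u, v) vertices), via ray casting."""
    uv, depth = project_to_image(points_lidar, P)
    poly = np.asarray(polygon)
    n = len(poly)
    keep = []
    for (u, v), d in zip(uv, depth):
        if d <= 0:                          # behind the camera
            keep.append(False)
            continue
        inside = False
        for i in range(n):                  # count edge crossings of a ray
            (u1, v1), (u2, v2) = poly[i], poly[(i + 1) % n]
            if (v1 > v) != (v2 > v) and u < (u2 - u1) * (v - v1) / (v2 - v1) + u1:
                inside = not inside
        keep.append(inside)
    return points_lidar[np.array(keep)]
```

Running `frustum_filter` first and `contour_filter` on its output mirrors the coarse-to-fine order in the abstract: the cheap box test discards most background points before the more expensive per-point polygon test runs.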
id doaj.art-13d15a0049034a9db889729fe4c4f40c
institution Directory Open Access Journal
issn 2227-7390
doi 10.3390/math12010153
source Mathematics, Vol. 12, Iss. 1, Art. 153 (2024-01-01)
affiliation Yongsheng Wang: School of Information Engineering, Wuhan University of Technology, Wuhan 430070, China
affiliation Xiaobo Han: School of Automation, Wuhan University of Technology, Wuhan 430070, China
affiliation Xiaoxu Wei: School of Automotive Engineering, Wuhan University of Technology, Wuhan 430070, China
affiliation Jie Luo: School of Automation, Wuhan University of Technology, Wuhan 430070, China
title Instance Segmentation Frustum–PointPillars: A Lightweight Fusion Algorithm for Camera–LiDAR Perception in Autonomous Driving
topic 3D object detection
fusion perception
instance segmentation
point cloud filter