RTL3D: real‐time LIDAR‐based 3D object detection with sparse CNN

LIDAR (light detection and ranging) based real‐time 3D perception is crucial for applications such as autonomous driving. However, most convolutional neural network (CNN) based methods are time‐consuming and computation‐intensive. These drawbacks are mainly attributed to the highly variable density of LIDAR point clouds and the complexity of their pipelines. To find a balance between speed and accuracy for 3D object detection from LIDAR, the authors propose RTL3D, a computationally efficient real‐time LIDAR‐based 3D detector. In RTL3D, an effective voxel‐wise feature representation is used to organise the unstructured point cloud. By employing a sparse feature learning network (SFLN) on the voxelised 3D data, RTL3D exploits the sparsity of the point cloud and down‐samples the 3D data into a 2D representation. Based on the generated 2D feature map, an optimised dense detection network (DDN) regresses oriented bounding boxes without relying on any predefined anchor boxes. The authors also introduce an incremental data augmentation approach that greatly improves the performance of RTL3D. Experiments on the public KITTI benchmark demonstrate that RTL3D achieves performance competitive with state‐of‐the‐art works on the 3D detection task. Owing to the simplicity of its single‐stage, anchor‐free design, RTL3D runs at a real‐time inference speed of 40 FPS.
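The pipeline summarised above begins by organising the raw, unstructured point cloud into a voxel‐wise representation that a sparse CNN can consume. The Python sketch below is only a rough illustration of that voxelisation step, not the authors' implementation: the function name, the voxel size, the detection range and the per‐voxel mean feature are all assumptions made here for clarity.

```python
"""Minimal sketch of point-cloud voxelisation (illustrative only).

Not the RTL3D code: voxel_size, pc_range and the per-voxel mean
feature are assumed values chosen just to show the idea.
"""
import numpy as np


def voxelize(points, voxel_size=(0.2, 0.2, 0.4),
             pc_range=(0.0, -40.0, -3.0, 70.4, 40.0, 1.0)):
    """Group an (N, 4) LIDAR cloud (x, y, z, intensity) into occupied voxels.

    Returns integer voxel coordinates and a simple per-voxel feature
    (the mean of the points falling inside each voxel). Only occupied
    voxels are materialised, i.e. the output is a sparse representation.
    """
    points = np.asarray(points, dtype=np.float32)
    lo = np.array(pc_range[:3], dtype=np.float32)
    hi = np.array(pc_range[3:], dtype=np.float32)
    vs = np.array(voxel_size, dtype=np.float32)

    # Keep only points inside the detection range.
    mask = np.all((points[:, :3] >= lo) & (points[:, :3] < hi), axis=1)
    pts = points[mask]

    # Integer voxel index of every remaining point.
    coords = np.floor((pts[:, :3] - lo) / vs).astype(np.int32)

    # One row per occupied voxel; `inverse` maps each point to its voxel.
    uniq, inverse = np.unique(coords, axis=0, return_inverse=True)
    inverse = inverse.reshape(-1)

    # Mean (x, y, z, intensity) over the points in each occupied voxel.
    feats = np.zeros((len(uniq), pts.shape[1]), dtype=np.float32)
    np.add.at(feats, inverse, pts)
    counts = np.bincount(inverse, minlength=len(uniq)).astype(np.float32)
    feats /= counts[:, None]

    return uniq, feats


if __name__ == "__main__":
    # Synthetic cloud roughly matching the assumed detection range.
    cloud = np.random.rand(10000, 4).astype(np.float32)
    cloud[:, 0] *= 70.0                      # x in [0, 70) m
    cloud[:, 1] = cloud[:, 1] * 80.0 - 40.0  # y in [-40, 40) m
    cloud[:, 2] = cloud[:, 2] * 4.0 - 3.0    # z in [-3, 1) m
    coords, feats = voxelize(cloud)
    print(coords.shape, feats.shape)  # (#occupied voxels, 3) and (#occupied voxels, 4)
```

Because only occupied voxels are materialised, a downstream sparse network such as the SFLN can skip the empty space that dominates a LIDAR scan, which is the kind of sparsity the abstract says RTL3D exploits to reach real‐time speed.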

Bibliographic Details
Main Authors: Lin Yan, Kai Liu, Evgeny Belyaev, Meiyu Duan
Format: Article
Language: English
Published: Wiley, 2020-08-01
Series: IET Computer Vision, vol. 14, no. 5, pp. 224-232
ISSN: 1751-9632; 1751-9640
Author Affiliations: Lin Yan, Kai Liu and Meiyu Duan: School of Computer Science, Xidian University, 2 South Taibai Road, Xi'an, Shaanxi, People's Republic of China; Evgeny Belyaev: ITMO University, St Petersburg, Russia
Subjects: real-time LIDAR-based 3D object detection; convolutional neural network; LIDAR point cloud; real-time LIDAR-based 3D detector; sparse feature learning network; voxelised 3D data
Online Access: https://doi.org/10.1049/iet-cvi.2019.0508