Spatial Attention Frustum: A 3D Object Detection Method Focusing on Occluded Objects

Accurately perceiving occluded objects is a challenging problem for autonomous vehicles. Human vision can quickly locate important object regions in complex scenes while only roughly analysing or ignoring other regions, a capability known as the visual attention mechanism. How...

Bibliographic Details
Main Authors:	Xinglei He, Xiaohan Zhang, Yichun Wang, Hongzeng Ji, Xiuhui Duan, Fen Guo
Format:	Article
Language:	English
Published:	MDPI AG 2022-03-01
Series:	Sensors
Subjects:	visual attention mechanism; occluded object detection; multi-sensor fusion; 3D object detection; autonomous vehicles
Online Access:	https://www.mdpi.com/1424-8220/22/6/2366

author	Xinglei He Xiaohan Zhang Yichun Wang Hongzeng Ji Xiuhui Duan Fen Guo
collection	DOAJ
description	Accurately perceiving occluded objects is a challenging problem for autonomous vehicles. Human vision can quickly locate important object regions in complex scenes while only roughly analysing or ignoring other regions, a capability known as the visual attention mechanism. However, the perception system of an autonomous vehicle cannot know which part of the point cloud lies in the region of interest. It is therefore worthwhile to explore how the visual attention mechanism can be used in the perception system of autonomous driving. In this paper, we propose the spatial attention frustum model to address object occlusion in 3D object detection. The spatial attention frustum suppresses unimportant features and allocates limited neural computing resources to critical parts of the scene, thereby providing greater relevance and easier processing for higher-level perceptual reasoning tasks. To ensure that our method maintains good reasoning ability when faced with occluded objects that present only a partial structure, we propose a local feature aggregation module that captures more complex local features of the point cloud. Finally, we discuss the projection constraint between the 3D bounding box and the 2D bounding box and propose a joint anchor box projection loss function, which helps to improve the overall performance of our method. Results on the KITTI dataset show that the proposed method effectively improves the detection accuracy of occluded objects. Our method achieves 89.46%, 79.91% and 75.53% detection accuracy at the easy, moderate and hard difficulty levels of the car category, including a 6.97% performance improvement at the hard level, which has a high degree of occlusion. Our one-stage method does not rely on an additional refinement stage, yet its accuracy is comparable to that of two-stage methods.
first_indexed	2024-03-09T12:40:15Z
format	Article
id	doaj.art-dd130f7dbf4d4fab803db2e136980012
institution	Directory of Open Access Journals
issn	1424-8220
language	English
last_indexed	2024-03-09T12:40:15Z
publishDate	2022-03-01
publisher	MDPI AG
record_format	Article
series	Sensors
spelling	doaj.art-dd130f7dbf4d4fab803db2e136980012; 2023-11-30T22:20:07Z; eng; MDPI AG; Sensors; ISSN 1424-8220; 2022-03-01; vol. 22, no. 6, article 2366; DOI 10.3390/s22062366; Spatial Attention Frustum: A 3D Object Detection Method Focusing on Occluded Objects; Xinglei He, Xiaohan Zhang, Yichun Wang, Hongzeng Ji, Xiuhui Duan, Fen Guo (all: School of Mechanical Engineering, Beijing Institute of Technology, Beijing 100081, China); https://www.mdpi.com/1424-8220/22/6/2366; visual attention mechanism; occluded object detection; multi-sensor fusion; 3D object detection; autonomous vehicles
title	Spatial Attention Frustum: A 3D Object Detection Method Focusing on Occluded Objects
topic	visual attention mechanism; occluded object detection; multi-sensor fusion; 3D object detection; autonomous vehicles
url	https://www.mdpi.com/1424-8220/22/6/2366

Similar Items