Learning temporal variations for 4D point cloud segmentation
LiDAR-based 3D scene perception is a fundamental and important task for autonomous driving. Most state-of-the-art methods on LiDAR-based 3D recognition tasks focus on single-frame 3D point cloud data, ignoring temporal information. We argue that the temporal information across the frames provides crucial knowledge for 3D scene perception, especially in the driving scenario. In this paper, we focus on spatial and temporal variations to better explore temporal information across 3D frames. We design a temporal variation-aware interpolation module and a temporal voxel-point refinement module to capture the temporal variation in the 4D point cloud. The temporal variation-aware interpolation generates local features from the previous and current frames by capturing spatial coherence and temporal variation information. The temporal voxel-point refinement module builds a temporal graph on the 3D point cloud sequences and captures the temporal variation with a graph convolution module, transforming coarse voxel-level predictions into fine point-level predictions. With our proposed modules, we achieve superior performance on SemanticKITTI, SemanticPOSS and nuScenes.
Main Authors: | Shi, Hanyu; Wei, Jiacheng; Wang, Hao; Liu, Fayao; Lin, Guosheng
---|---
Other Authors: | School of Computer Science and Engineering
Format: | Journal Article
Language: | English
Published: | 2024
Subjects: | Computer and Information Science; 4D point cloud; Semantic segmentation
Online Access: | https://hdl.handle.net/10356/179442
_version_ | 1826112085895938048 |
---|---|
author | Shi, Hanyu Wei, Jiacheng Wang, Hao Liu, Fayao Lin, Guosheng |
author2 | School of Computer Science and Engineering |
author_facet | School of Computer Science and Engineering Shi, Hanyu Wei, Jiacheng Wang, Hao Liu, Fayao Lin, Guosheng |
author_sort | Shi, Hanyu |
collection | NTU |
description | LiDAR-based 3D scene perception is a fundamental and important task for autonomous driving. Most state-of-the-art methods on LiDAR-based 3D recognition tasks focus on single-frame 3D point cloud data, ignoring temporal information. We argue that the temporal information across the frames provides crucial knowledge for 3D scene perception, especially in the driving scenario. In this paper, we focus on spatial and temporal variations to better explore temporal information across 3D frames. We design a temporal variation-aware interpolation module and a temporal voxel-point refinement module to capture the temporal variation in the 4D point cloud. The temporal variation-aware interpolation generates local features from the previous and current frames by capturing spatial coherence and temporal variation information. The temporal voxel-point refinement module builds a temporal graph on the 3D point cloud sequences and captures the temporal variation with a graph convolution module, transforming coarse voxel-level predictions into fine point-level predictions. With our proposed modules, we achieve superior performance on SemanticKITTI, SemanticPOSS and nuScenes. |
first_indexed | 2024-10-01T03:01:32Z |
format | Journal Article |
id | ntu-10356/179442 |
institution | Nanyang Technological University |
language | English |
last_indexed | 2024-10-01T03:01:32Z |
publishDate | 2024 |
record_format | dspace |
spelling | ntu-10356/1794422024-07-31T04:53:41Z Learning temporal variations for 4D point cloud segmentation Shi, Hanyu Wei, Jiacheng Wang, Hao Liu, Fayao Lin, Guosheng School of Computer Science and Engineering School of Electrical and Electronic Engineering Computer and Information Science 4D point cloud Semantic segmentation LiDAR-based 3D scene perception is a fundamental and important task for autonomous driving. Most state-of-the-art methods on LiDAR-based 3D recognition tasks focus on single-frame 3D point cloud data, ignoring temporal information. We argue that the temporal information across the frames provides crucial knowledge for 3D scene perception, especially in the driving scenario. In this paper, we focus on spatial and temporal variations to better explore temporal information across 3D frames. We design a temporal variation-aware interpolation module and a temporal voxel-point refinement module to capture the temporal variation in the 4D point cloud. The temporal variation-aware interpolation generates local features from the previous and current frames by capturing spatial coherence and temporal variation information. The temporal voxel-point refinement module builds a temporal graph on the 3D point cloud sequences and captures the temporal variation with a graph convolution module, transforming coarse voxel-level predictions into fine point-level predictions. With our proposed modules, we achieve superior performance on SemanticKITTI, SemanticPOSS and nuScenes. Agency for Science, Technology and Research (A*STAR) This research is supported by the Agency for Science, Technology and Research (A*STAR) under its MTC Programmatic Funds (Grant No. M23L7b0021). 2024-07-31T04:53:41Z 2024-07-31T04:53:41Z 2024 Journal Article Shi, H., Wei, J., Wang, H., Liu, F. & Lin, G. (2024). Learning temporal variations for 4D point cloud segmentation. International Journal of Computer Vision. 
https://dx.doi.org/10.1007/s11263-024-02149-w 0920-5691 https://hdl.handle.net/10356/179442 10.1007/s11263-024-02149-w 2-s2.0-85196535409 en M23L7b0021 International Journal of Computer Vision © 2024 The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature. All rights reserved. |
spellingShingle | Computer and Information Science 4D point cloud Semantic segmentation Shi, Hanyu Wei, Jiacheng Wang, Hao Liu, Fayao Lin, Guosheng Learning temporal variations for 4D point cloud segmentation |
title | Learning temporal variations for 4D point cloud segmentation |
title_full | Learning temporal variations for 4D point cloud segmentation |
title_fullStr | Learning temporal variations for 4D point cloud segmentation |
title_full_unstemmed | Learning temporal variations for 4D point cloud segmentation |
title_short | Learning temporal variations for 4D point cloud segmentation |
title_sort | learning temporal variations for 4d point cloud segmentation |
topic | Computer and Information Science 4D point cloud Semantic segmentation |
url | https://hdl.handle.net/10356/179442 |
work_keys_str_mv | AT shihanyu learningtemporalvariationsfor4dpointcloudsegmentation AT weijiacheng learningtemporalvariationsfor4dpointcloudsegmentation AT wanghao learningtemporalvariationsfor4dpointcloudsegmentation AT liufayao learningtemporalvariationsfor4dpointcloudsegmentation AT linguosheng learningtemporalvariationsfor4dpointcloudsegmentation |
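The abstract describes a temporal voxel-point refinement module that builds a graph across frames and applies a graph convolution to capture temporal variation. The paper's implementation is not included in this record, so the following is only a minimal illustrative sketch under assumed details: each current-frame point is linked to its k nearest previous-frame points, the edge feature is the feature difference (the "temporal variation"), and a max over neighbours aggregates the edges. The function name `temporal_graph_conv` and all shapes are hypothetical, not the authors' API.

```python
import numpy as np

def temporal_graph_conv(prev_xyz, prev_feat, cur_xyz, cur_feat, k=3):
    """Toy temporal graph convolution (illustrative, not the paper's code).

    For each current-frame point: find its k nearest neighbours in the
    previous frame, form edge features from the feature difference
    (the 'temporal variation'), concatenate with the point's own feature,
    and aggregate over neighbours with a max.
    """
    # pairwise squared distances between current and previous points: (N_cur, N_prev)
    d2 = ((cur_xyz[:, None, :] - prev_xyz[None, :, :]) ** 2).sum(-1)
    idx = np.argsort(d2, axis=1)[:, :k]                      # (N_cur, k) neighbour indices
    neigh = prev_feat[idx]                                   # (N_cur, k, C) neighbour features
    variation = neigh - cur_feat[:, None, :]                 # temporal variation per edge
    center = np.repeat(cur_feat[:, None, :], k, axis=1)      # (N_cur, k, C)
    edge = np.concatenate([center, variation], axis=-1)      # (N_cur, k, 2C)
    return edge.max(axis=1)                                  # (N_cur, 2C) aggregated feature
```

In the paper this refined per-point feature would then be used to upgrade coarse voxel-level predictions to point-level ones; here the learned weights and the voxel branch are omitted to keep the sketch self-contained.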