Learning temporal variations for 4D point cloud segmentation

LiDAR-based 3D scene perception is a fundamental task for autonomous driving. Most state-of-the-art methods for LiDAR-based 3D recognition operate on single-frame point cloud data, ignoring temporal information. We argue that temporal information across frames provides crucial cues for 3D scene perception, especially in driving scenarios. In this paper, we focus on spatial and temporal variations to better exploit temporal information across 3D frames. We design a temporal variation-aware interpolation module and a temporal voxel-point refinement module to capture the temporal variation in the 4D point cloud. The temporal variation-aware interpolation generates local features from the previous and current frames by capturing spatial coherence and temporal variation information. The temporal voxel-point refinement module builds a temporal graph on the 3D point cloud sequences and captures the temporal variation with a graph convolution module, transforming coarse voxel-level predictions into fine point-level predictions. With the proposed modules, we achieve superior performance on SemanticKITTI, SemanticPOSS and nuScenes.
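The abstract's first module combines spatial coherence (nearby points in the previous frame) with temporal variation (how features change between frames). The paper's actual architecture is not reproduced in this record, so the following is only a minimal numpy sketch of that general idea, under stated assumptions: the function name `temporal_variation_interpolation` is hypothetical, neighbors are found by a brute-force k-NN with inverse-distance weights, and the "variation" is simply the difference between the interpolated previous-frame feature and the current point's feature.

```python
import numpy as np

def temporal_variation_interpolation(curr_xyz, curr_feat, prev_xyz, prev_feat, k=3):
    """Sketch: for each current-frame point, gather k nearest previous-frame
    points, interpolate their features with normalized inverse-distance
    weights (spatial coherence), and append the temporal variation
    (interpolated feature minus the current feature) as extra channels."""
    n, c = curr_feat.shape
    out = np.zeros((n, c * 2))
    for i in range(n):
        d = np.linalg.norm(prev_xyz - curr_xyz[i], axis=1)  # distances to previous frame
        nn = np.argsort(d)[:k]                              # brute-force k nearest neighbors
        w = 1.0 / (d[nn] + 1e-8)
        w = w / w.sum()                                     # normalized inverse-distance weights
        interp = (w[:, None] * prev_feat[nn]).sum(axis=0)   # spatially coherent local feature
        variation = interp - curr_feat[i]                   # temporal variation channels
        out[i] = np.concatenate([interp, variation])
    return out
```

A learned version would replace the fixed inverse-distance weights with a small MLP over the spatial offsets and feature differences; this sketch keeps the weighting closed-form so the behavior is easy to verify.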

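The second module builds a temporal graph across consecutive frames and refines coarse voxel-level predictions with a graph convolution. Again, this is not the authors' implementation; it is an illustrative numpy sketch under assumptions: the name `temporal_graph_refine` is hypothetical, edges connect each current point to its k nearest previous-frame points, one "graph-conv step" is a mean aggregation over those edges, and the 50/50 mixing weight between a point's own coarse logits and its neighbors' message is arbitrary.

```python
import numpy as np

def temporal_graph_refine(xyz_t, logits_t, xyz_prev, logits_prev, k=2):
    """Sketch: build a temporal k-NN graph from each current-frame point to
    the previous frame, then refine its coarse (voxel-level) logits by
    averaging them with the aggregated logits of its temporal neighbors —
    a single, parameter-free graph-convolution step."""
    refined = np.empty_like(logits_t)
    for i in range(xyz_t.shape[0]):
        d = np.linalg.norm(xyz_prev - xyz_t[i], axis=1)
        nn = np.argsort(d)[:k]                         # temporal edges to the previous frame
        neighbor_msg = logits_prev[nn].mean(axis=0)    # aggregate messages along the edges
        refined[i] = 0.5 * logits_t[i] + 0.5 * neighbor_msg  # mix self and neighbor evidence
    return refined
```

In a trained network the aggregation and mixing would use learned edge weights, and the refinement would map voxel-level logits down to individual points; the fixed mean update here only demonstrates how a temporal graph can smooth predictions across frames.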
Bibliographic Details
Main Authors: Shi, Hanyu, Wei, Jiacheng, Wang, Hao, Liu, Fayao, Lin, Guosheng
Other Authors: School of Computer Science and Engineering; School of Electrical and Electronic Engineering
Format: Journal Article
Language:English
Published: 2024
Journal: International Journal of Computer Vision (ISSN 0920-5691)
Subjects: Computer and Information Science; 4D point cloud; Semantic segmentation
DOI: 10.1007/s11263-024-02149-w
Online Access: https://hdl.handle.net/10356/179442
Funding: This research is supported by the Agency for Science, Technology and Research (A*STAR) under its MTC Programmatic Funds (Grant No. M23L7b0021).
Citation: Shi, H., Wei, J., Wang, H., Liu, F. & Lin, G. (2024). Learning temporal variations for 4D point cloud segmentation. International Journal of Computer Vision. © 2024 The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature.