Learning temporal variations for 4D point cloud segmentation
LiDAR-based 3D scene perception is a fundamental and important task for autonomous driving. Most state-of-the-art methods on LiDAR-based 3D recognition tasks focus on single-frame 3D point cloud data, ignoring temporal information. We argue that the temporal information across the frames provides crucial knowledge for 3D scene perception, especially in the driving scenario. In this paper, we focus on spatial and temporal variations to better explore temporal information across 3D frames. We design a temporal variation-aware interpolation module and a temporal voxel-point refinement module to capture the temporal variation in the 4D point cloud. The temporal variation-aware interpolation generates local features from the previous and current frames by capturing spatial coherence and temporal variation information. The temporal voxel-point refinement module builds a temporal graph on the 3D point cloud sequences and captures the temporal variation with a graph convolution module, transforming coarse voxel-level predictions into fine point-level predictions. With our proposed modules, we achieve superior performance on SemanticKITTI, SemanticPOSS and nuScenes.
Main Authors: | Shi, Hanyu; Wei, Jiacheng; Wang, Hao; Liu, Fayao; Lin, Guosheng
---|---
Other Authors: | School of Computer Science and Engineering
Format: | Journal Article
Language: | English
Published: | 2024
Subjects: | Computer and Information Science; 4D point cloud; Semantic segmentation
Online Access: | https://hdl.handle.net/10356/179442
_version_ | 1826112085895938048 |
---|---|
author | Shi, Hanyu Wei, Jiacheng Wang, Hao Liu, Fayao Lin, Guosheng |
author2 | School of Computer Science and Engineering |
author_facet | School of Computer Science and Engineering Shi, Hanyu Wei, Jiacheng Wang, Hao Liu, Fayao Lin, Guosheng |
author_sort | Shi, Hanyu |
collection | NTU |
description | LiDAR-based 3D scene perception is a fundamental and important task for autonomous driving. Most state-of-the-art methods on LiDAR-based 3D recognition tasks focus on single-frame 3D point cloud data, ignoring temporal information. We argue that the temporal information across the frames provides crucial knowledge for 3D scene perception, especially in the driving scenario. In this paper, we focus on spatial and temporal variations to better explore temporal information across 3D frames. We design a temporal variation-aware interpolation module and a temporal voxel-point refinement module to capture the temporal variation in the 4D point cloud. The temporal variation-aware interpolation generates local features from the previous and current frames by capturing spatial coherence and temporal variation information. The temporal voxel-point refinement module builds a temporal graph on the 3D point cloud sequences and captures the temporal variation with a graph convolution module, transforming coarse voxel-level predictions into fine point-level predictions. With our proposed modules, we achieve superior performance on SemanticKITTI, SemanticPOSS and nuScenes. |
first_indexed | 2024-10-01T03:01:32Z |
format | Journal Article |
id | ntu-10356/179442 |
institution | Nanyang Technological University |
language | English |
last_indexed | 2024-10-01T03:01:32Z |
publishDate | 2024 |
record_format | dspace |
spelling | ntu-10356/1794422024-07-31T04:53:41Z Learning temporal variations for 4D point cloud segmentation Shi, Hanyu Wei, Jiacheng Wang, Hao Liu, Fayao Lin, Guosheng School of Computer Science and Engineering School of Electrical and Electronic Engineering Computer and Information Science 4D point cloud Semantic segmentation LiDAR-based 3D scene perception is a fundamental and important task for autonomous driving. Most state-of-the-art methods on LiDAR-based 3D recognition tasks focus on single-frame 3D point cloud data, ignoring temporal information. We argue that the temporal information across the frames provides crucial knowledge for 3D scene perception, especially in the driving scenario. In this paper, we focus on spatial and temporal variations to better explore temporal information across 3D frames. We design a temporal variation-aware interpolation module and a temporal voxel-point refinement module to capture the temporal variation in the 4D point cloud. The temporal variation-aware interpolation generates local features from the previous and current frames by capturing spatial coherence and temporal variation information. The temporal voxel-point refinement module builds a temporal graph on the 3D point cloud sequences and captures the temporal variation with a graph convolution module, transforming coarse voxel-level predictions into fine point-level predictions. With our proposed modules, we achieve superior performance on SemanticKITTI, SemanticPOSS and nuScenes. Agency for Science, Technology and Research (A*STAR) This research is supported by the Agency for Science, Technology and Research (A*STAR) under its MTC Programmatic Funds (Grant No. M23L7b0021). 2024-07-31T04:53:41Z 2024-07-31T04:53:41Z 2024 Journal Article Shi, H., Wei, J., Wang, H., Liu, F. & Lin, G. (2024). Learning temporal variations for 4D point cloud segmentation. International Journal of Computer Vision. 
https://dx.doi.org/10.1007/s11263-024-02149-w 0920-5691 https://hdl.handle.net/10356/179442 10.1007/s11263-024-02149-w 2-s2.0-85196535409 en M23L7b0021 International Journal of Computer Vision © 2024 The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature. All rights reserved. |
spellingShingle | Computer and Information Science 4D point cloud Semantic segmentation Shi, Hanyu Wei, Jiacheng Wang, Hao Liu, Fayao Lin, Guosheng Learning temporal variations for 4D point cloud segmentation |
title | Learning temporal variations for 4D point cloud segmentation |
title_full | Learning temporal variations for 4D point cloud segmentation |
title_fullStr | Learning temporal variations for 4D point cloud segmentation |
title_full_unstemmed | Learning temporal variations for 4D point cloud segmentation |
title_short | Learning temporal variations for 4D point cloud segmentation |
title_sort | learning temporal variations for 4d point cloud segmentation |
topic | Computer and Information Science 4D point cloud Semantic segmentation |
url | https://hdl.handle.net/10356/179442 |
work_keys_str_mv | AT shihanyu learningtemporalvariationsfor4dpointcloudsegmentation AT weijiacheng learningtemporalvariationsfor4dpointcloudsegmentation AT wanghao learningtemporalvariationsfor4dpointcloudsegmentation AT liufayao learningtemporalvariationsfor4dpointcloudsegmentation AT linguosheng learningtemporalvariationsfor4dpointcloudsegmentation |
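The abstract describes a temporal voxel-point refinement module that builds a graph across frames and applies a graph convolution to capture temporal variation. The paper's implementation is not included in this record, so the following is only a minimal illustrative sketch under assumed details: each current-frame point is linked to its k nearest previous-frame points, the edge feature is the feature difference (the "temporal variation"), and a max over neighbours aggregates the edges. The function name `temporal_graph_conv` and all shapes are hypothetical, not the authors' API.

```python
import numpy as np

def temporal_graph_conv(prev_xyz, prev_feat, cur_xyz, cur_feat, k=3):
    """Toy temporal graph convolution (illustrative, not the paper's code).

    For each current-frame point: find its k nearest neighbours in the
    previous frame, form edge features from the feature difference
    (the 'temporal variation'), concatenate with the point's own feature,
    and aggregate over neighbours with a max.
    """
    # pairwise squared distances between current and previous points: (N_cur, N_prev)
    d2 = ((cur_xyz[:, None, :] - prev_xyz[None, :, :]) ** 2).sum(-1)
    idx = np.argsort(d2, axis=1)[:, :k]                      # (N_cur, k) neighbour indices
    neigh = prev_feat[idx]                                   # (N_cur, k, C) neighbour features
    variation = neigh - cur_feat[:, None, :]                 # temporal variation per edge
    center = np.repeat(cur_feat[:, None, :], k, axis=1)      # (N_cur, k, C)
    edge = np.concatenate([center, variation], axis=-1)      # (N_cur, k, 2C)
    return edge.max(axis=1)                                  # (N_cur, 2C) aggregated feature
```

In the paper this refined per-point feature would then be used to upgrade coarse voxel-level predictions to point-level ones; here the learned weights and the voxel branch are omitted to keep the sketch self-contained.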