TR-Net: A Transformer-Based Neural Network for Point Cloud Processing

The point cloud is a versatile geometric representation that can be applied across computer vision tasks. Owing to the unordered nature of point clouds, it is challenging to design a deep neural network for point cloud analysis. Furthermore, most existing frameworks for point cloud processing either hardl...


Bibliographic Details
Main Authors: Luyao Liu, Enqing Chen, Yingqiang Ding
Format: Article
Language: English
Published: MDPI AG 2022-06-01
Series: Machines
Subjects:
Online Access: https://www.mdpi.com/2075-1702/10/7/517
author Luyao Liu
Enqing Chen
Yingqiang Ding
author_facet Luyao Liu
Enqing Chen
Yingqiang Ding
author_sort Luyao Liu
collection DOAJ
description The point cloud is a versatile geometric representation that can be applied across computer vision tasks. Owing to the unordered nature of point clouds, it is challenging to design a deep neural network for point cloud analysis. Furthermore, most existing frameworks for point cloud processing either barely consider local neighboring information or ignore context-aware and spatially-aware features. To address these problems, we propose a novel transformer-based point cloud processing architecture named TR-Net, which reformulates point cloud processing as a set-to-set translation problem. TR-Net operates directly on raw point clouds without any data transformation or annotation, which reduces computing-resource consumption and memory usage. First, a neighborhood embedding backbone is designed to effectively extract local neighboring information from the point cloud. Then, an attention-based sub-network is constructed to learn a semantically rich and discriminative representation from the embedded features. Finally, effective global features are obtained by feeding the features extracted by the attention-based sub-network into a residual backbone. For different downstream tasks, we build different decoders. Extensive experiments on public datasets show that our approach outperforms other state-of-the-art methods. For example, TR-Net achieves 93.1% overall accuracy on the ModelNet40 dataset and an mIoU of 85.3% on the ShapeNet dataset for part segmentation.
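The abstract describes a three-stage pipeline: a neighborhood embedding backbone, an attention-based sub-network, and a residual backbone whose output is pooled into a global feature. The paper's actual layers are learned; the minimal NumPy sketch below only illustrates the data flow under stated assumptions — hand-crafted k-NN neighborhood statistics stand in for the learned embedding, a single-head self-attention with identity projections stands in for the attention sub-network, and all function names (`neighborhood_embedding`, `self_attention`, `global_feature`) are hypothetical, not from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def knn_indices(points, k):
    # Pairwise squared distances, then the k nearest neighbors of each point
    # (each point's own index is included, at distance zero).
    d2 = ((points[:, None, :] - points[None, :, :]) ** 2).sum(-1)
    return np.argsort(d2, axis=1)[:, :k]

def neighborhood_embedding(points, k=8):
    # Stand-in for the learned neighborhood embedding backbone: describe each
    # point by its coordinates plus mean/std of relative neighbor offsets.
    idx = knn_indices(points, k)
    rel = points[idx] - points[:, None, :]          # (N, k, 3) relative coords
    return np.concatenate([points, rel.mean(1), rel.std(1)], axis=1)  # (N, 9)

def self_attention(feats):
    # Single-head scaled dot-product self-attention over the point set;
    # identity Q/K/V projections keep the sketch dependency-free.
    scores = feats @ feats.T / np.sqrt(feats.shape[1])
    scores -= scores.max(axis=1, keepdims=True)     # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ feats

def global_feature(points, k=8):
    # Pipeline: neighborhood embedding -> attention -> residual add -> max pool.
    f = neighborhood_embedding(points, k)
    g = f + self_attention(f)                       # residual connection
    return g.max(axis=0)                            # permutation-invariant pool

cloud = rng.normal(size=(128, 3))                   # toy raw point cloud
vec = global_feature(cloud)
print(vec.shape)                                    # (9,)
```

Because every stage is permutation-equivariant and the final max pool is permutation-invariant, shuffling the input points leaves the global feature unchanged — the property that makes set-style architectures suitable for unordered point clouds.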
first_indexed 2024-03-09T10:16:22Z
format Article
id doaj.art-6bb5c4807171438d9f6c5c5701c2034c
institution Directory Open Access Journal
issn 2075-1702
language English
last_indexed 2024-03-09T10:16:22Z
publishDate 2022-06-01
publisher MDPI AG
record_format Article
series Machines
spelling doaj.art-6bb5c4807171438d9f6c5c5701c2034c 2023-12-01T22:22:17Z eng MDPI AG Machines 2075-1702 2022-06-01 vol. 10, iss. 7, art. 517 10.3390/machines10070517 TR-Net: A Transformer-Based Neural Network for Point Cloud Processing Luyao Liu, Enqing Chen, Yingqiang Ding (School of Information Engineering, Zhengzhou University, No. 100 Science Avenue, Zhengzhou 450001, China) https://www.mdpi.com/2075-1702/10/7/517 point cloud; deep learning; classification; part segmentation; transformer
spellingShingle Luyao Liu
Enqing Chen
Yingqiang Ding
TR-Net: A Transformer-Based Neural Network for Point Cloud Processing
Machines
point cloud
deep learning
classification
part segmentation
transformer
title TR-Net: A Transformer-Based Neural Network for Point Cloud Processing
title_full TR-Net: A Transformer-Based Neural Network for Point Cloud Processing
title_fullStr TR-Net: A Transformer-Based Neural Network for Point Cloud Processing
title_full_unstemmed TR-Net: A Transformer-Based Neural Network for Point Cloud Processing
title_short TR-Net: A Transformer-Based Neural Network for Point Cloud Processing
title_sort tr net a transformer based neural network for point cloud processing
topic point cloud
deep learning
classification
part segmentation
transformer
url https://www.mdpi.com/2075-1702/10/7/517
work_keys_str_mv AT luyaoliu trnetatransformerbasedneuralnetworkforpointcloudprocessing
AT enqingchen trnetatransformerbasedneuralnetworkforpointcloudprocessing
AT yingqiangding trnetatransformerbasedneuralnetworkforpointcloudprocessing