TR-Net: A Transformer-Based Neural Network for Point Cloud Processing
The point cloud is a versatile geometric representation that can be applied to many computer vision tasks. Because a point cloud is unordered, it is challenging to design a deep neural network for point cloud analysis. Furthermore, most existing frameworks for point cloud processing either barely consider local neighboring information or ignore context-aware and spatially aware features. To address these problems, we propose a novel transformer-based point cloud processing architecture named TR-Net, which reformulates point cloud processing as a set-to-set translation problem. TR-Net operates directly on raw point clouds without any data transformation or annotation, reducing computing-resource consumption and memory usage. First, a neighborhood embedding backbone is designed to effectively extract local neighboring information from the point cloud. Then, an attention-based sub-network is constructed to learn a semantically rich and discriminative representation from the embedded features. Finally, effective global features are obtained by feeding the output of the attention-based sub-network into a residual backbone. Different decoders are built for different downstream tasks. Extensive experiments on public datasets show that our approach outperforms other state-of-the-art methods. For example, TR-Net achieves 93.1% overall accuracy on the ModelNet40 dataset and a mIoU of 85.3% on the ShapeNet dataset for part segmentation.
Main Authors: | Luyao Liu, Enqing Chen, Yingqiang Ding |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2022-06-01 |
Series: | Machines |
Subjects: | point cloud; deep learning; classification; part segmentation; transformer |
Online Access: | https://www.mdpi.com/2075-1702/10/7/517 |
author | Luyao Liu; Enqing Chen; Yingqiang Ding |
collection | DOAJ |
description | The point cloud is a versatile geometric representation that can be applied to many computer vision tasks. Because a point cloud is unordered, it is challenging to design a deep neural network for point cloud analysis. Furthermore, most existing frameworks for point cloud processing either barely consider local neighboring information or ignore context-aware and spatially aware features. To address these problems, we propose a novel transformer-based point cloud processing architecture named TR-Net, which reformulates point cloud processing as a set-to-set translation problem. TR-Net operates directly on raw point clouds without any data transformation or annotation, reducing computing-resource consumption and memory usage. First, a neighborhood embedding backbone is designed to effectively extract local neighboring information from the point cloud. Then, an attention-based sub-network is constructed to learn a semantically rich and discriminative representation from the embedded features. Finally, effective global features are obtained by feeding the output of the attention-based sub-network into a residual backbone. Different decoders are built for different downstream tasks. Extensive experiments on public datasets show that our approach outperforms other state-of-the-art methods. For example, TR-Net achieves 93.1% overall accuracy on the ModelNet40 dataset and a mIoU of 85.3% on the ShapeNet dataset for part segmentation. |
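The abstract describes a three-stage pipeline: a neighborhood embedding that captures local structure, an attention-based sub-network with residual connections, and order-invariant global feature pooling. The following is a minimal NumPy sketch of that general idea, not the authors' implementation (layer sizes, the k-NN embedding, and the single-head attention are illustrative assumptions); it mainly demonstrates why such a design is permutation-invariant, which is what makes it suitable for unordered point clouds.

```python
import numpy as np

def knn_indices(points, k):
    # Pairwise squared distances between the N points (shape (N, 3)),
    # then the k nearest neighbors of each point, excluding itself.
    d = ((points[:, None, :] - points[None, :, :]) ** 2).sum(-1)
    return np.argsort(d, axis=1)[:, 1:k + 1]

def neighborhood_embedding(points, k=4):
    # Concatenate each point with the offsets to its k nearest neighbors
    # and max-pool over the neighborhood (an EdgeConv-style local feature).
    idx = knn_indices(points, k)
    offsets = points[idx] - points[:, None, :]                  # (N, k, 3)
    feats = np.concatenate(
        [np.repeat(points[:, None, :], k, axis=1), offsets], axis=-1
    )                                                           # (N, k, 6)
    return feats.max(axis=1)                                    # (N, 6)

def self_attention(x):
    # Single-head scaled dot-product self-attention over the point set.
    scores = x @ x.T / np.sqrt(x.shape[1])
    w = np.exp(scores - scores.max(axis=1, keepdims=True))
    w /= w.sum(axis=1, keepdims=True)
    return w @ x

def global_feature(points, k=4):
    h = neighborhood_embedding(points, k)
    h = h + self_attention(h)   # residual connection around attention
    return h.max(axis=0)        # symmetric pooling -> order-invariant vector

pts = np.random.default_rng(0).normal(size=(64, 3))
g = global_feature(pts)
# Permutation invariance: shuffling the input points leaves the
# global descriptor unchanged, so no canonical ordering is needed.
perm = np.random.default_rng(1).permutation(64)
assert np.allclose(g, global_feature(pts[perm]))
```

The key design point mirrored from the abstract is that every stage is either permutation-equivariant (k-NN embedding, attention) or permutation-invariant (the final max-pool), so the network can consume raw, unordered point sets directly.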
first_indexed | 2024-03-09T10:16:22Z |
format | Article |
id | doaj.art-6bb5c4807171438d9f6c5c5701c2034c |
institution | Directory Open Access Journal |
issn | 2075-1702 |
language | English |
last_indexed | 2024-03-09T10:16:22Z |
publishDate | 2022-06-01 |
publisher | MDPI AG |
record_format | Article |
series | Machines |
doi | 10.3390/machines10070517 |
affiliation | School of Information Engineering, Zhengzhou University, No. 100 Science Avenue, Zhengzhou 450001, China (Luyao Liu, Enqing Chen, Yingqiang Ding) |
title | TR-Net: A Transformer-Based Neural Network for Point Cloud Processing |
topic | point cloud; deep learning; classification; part segmentation; transformer |
url | https://www.mdpi.com/2075-1702/10/7/517 |