NGD-Transformer: Navigation Geodesic Distance Positional Encoding with Self-Attention Pooling for Graph Transformer on 3D Triangle Mesh
Following the significant success of the transformer in NLP and computer vision, this paper attempts to extend it to 3D triangle meshes. The aim is to determine the shape’s global representation using the transformer and to capture the inherent manifold information. To this end, this paper proposes a nov...
Main Authors: | Jiafu Zhuang, Xiaofeng Liu, Wei Zhuang |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2022-10-01 |
Series: | Symmetry |
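Subjects: | graph transformer; 3D triangle mesh; positional encoding; 3D shape segmentation; 3D shape classification |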
Online Access: | https://www.mdpi.com/2073-8994/14/10/2050 |
_version_ | 1797469842286051328 |
---|---|
author | Jiafu Zhuang; Xiaofeng Liu; Wei Zhuang |
author_facet | Jiafu Zhuang; Xiaofeng Liu; Wei Zhuang |
author_sort | Jiafu Zhuang |
collection | DOAJ |
description | Following the significant success of the transformer in NLP and computer vision, this paper attempts to extend it to 3D triangle meshes. The aim is to determine the shape’s global representation using the transformer and to capture the inherent manifold information. To this end, this paper proposes a novel learning framework named Navigation Geodesic Distance Transformer (NGD-Transformer) for 3D meshes. Specifically, the approach combines farthest point sampling with the Voronoi segmentation algorithm to generate uniform, non-overlapping manifold patches. However, the number of vertices differs from patch to patch. Therefore, self-attention graph pooling is employed to rank the vertices on each patch and select the most representative nodes, which are then reordered according to their scores to generate tokens and their raw feature embeddings. To better exploit the manifold properties of the mesh, this paper further proposes a novel positional encoding called navigation geodesic distance positional encoding (NGD-PE), which encodes the geodesic distance between vertices in a relative and spatially symmetric manner. Subsequently, the raw feature embeddings and positional encodings are summed to form the input embeddings, which are fed to the graph transformer encoder to determine the global representation of the shape. Experiments were conducted on several datasets, and the results show the excellent performance of the proposed method. |
first_indexed | 2024-03-09T19:26:45Z |
format | Article |
id | doaj.art-5a84b694a65f40a1af44bf6009455f2a |
institution | Directory of Open Access Journals |
issn | 2073-8994 |
language | English |
last_indexed | 2024-03-09T19:26:45Z |
publishDate | 2022-10-01 |
publisher | MDPI AG |
record_format | Article |
series | Symmetry |
spelling | doaj.art-5a84b694a65f40a1af44bf6009455f2a; 2023-11-24T02:51:24Z; eng; MDPI AG; Symmetry; 2073-8994; 2022-10-01; 14; 10; 2050; 10.3390/sym14102050; NGD-Transformer: Navigation Geodesic Distance Positional Encoding with Self-Attention Pooling for Graph Transformer on 3D Triangle Mesh; Jiafu Zhuang [0]; Xiaofeng Liu [1]; Wei Zhuang [2]; [0] School of Physics and Information Engineering, Quanzhou Normal University, Quanzhou 362000, China; [1] School of Physics and Information Engineering, Quanzhou Normal University, Quanzhou 362000, China; [2] Department of Social Statistics, University of Manchester, Oxford Rd., Manchester M13 9PL, UK; Following the significant success of the transformer in NLP and computer vision, this paper attempts to extend it to 3D triangle meshes. The aim is to determine the shape’s global representation using the transformer and to capture the inherent manifold information. To this end, this paper proposes a novel learning framework named Navigation Geodesic Distance Transformer (NGD-Transformer) for 3D meshes. Specifically, the approach combines farthest point sampling with the Voronoi segmentation algorithm to generate uniform, non-overlapping manifold patches. However, the number of vertices differs from patch to patch. Therefore, self-attention graph pooling is employed to rank the vertices on each patch and select the most representative nodes, which are then reordered according to their scores to generate tokens and their raw feature embeddings. To better exploit the manifold properties of the mesh, this paper further proposes a novel positional encoding called navigation geodesic distance positional encoding (NGD-PE), which encodes the geodesic distance between vertices in a relative and spatially symmetric manner. Subsequently, the raw feature embeddings and positional encodings are summed to form the input embeddings, which are fed to the graph transformer encoder to determine the global representation of the shape. Experiments were conducted on several datasets, and the results show the excellent performance of the proposed method.; https://www.mdpi.com/2073-8994/14/10/2050; graph transformer; 3D triangle mesh; positional encoding; 3D shape segmentation; 3D shape classification |
spellingShingle | Jiafu Zhuang; Xiaofeng Liu; Wei Zhuang; NGD-Transformer: Navigation Geodesic Distance Positional Encoding with Self-Attention Pooling for Graph Transformer on 3D Triangle Mesh; Symmetry; graph transformer; 3D triangle mesh; positional encoding; 3D shape segmentation; 3D shape classification |
title | NGD-Transformer: Navigation Geodesic Distance Positional Encoding with Self-Attention Pooling for Graph Transformer on 3D Triangle Mesh |
title_sort | ngd transformer navigation geodesic distance positional encoding with self attention pooling for graph transformer on 3d triangle mesh |
topic | graph transformer; 3D triangle mesh; positional encoding; 3D shape segmentation; 3D shape classification |
url | https://www.mdpi.com/2073-8994/14/10/2050 |
work_keys_str_mv | AT jiafuzhuang ngdtransformernavigationgeodesicdistancepositionalencodingwithselfattentionpoolingforgraphtransformeron3dtrianglemesh AT xiaofengliu ngdtransformernavigationgeodesicdistancepositionalencodingwithselfattentionpoolingforgraphtransformeron3dtrianglemesh AT weizhuang ngdtransformernavigationgeodesicdistancepositionalencodingwithselfattentionpoolingforgraphtransformeron3dtrianglemesh |
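
The patch-generation step described in the abstract (farthest point sampling followed by Voronoi segmentation, then keeping a fixed number of top-scoring vertices per patch) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the mesh is given as a weighted adjacency dictionary, approximates geodesic distance by shortest edge-path length (Dijkstra), and takes per-vertex scores as given, whereas the paper learns them with self-attention graph pooling. All function names and the toy data are hypothetical.

```python
import heapq
import math


def dijkstra(adj, source):
    """Shortest edge-path distance from `source` to every vertex.

    Assumption: edge-path length stands in for true geodesic distance.
    """
    dist = {v: math.inf for v in adj}
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale queue entry
        for v, w in adj[u]:
            nd = d + w
            if nd < dist[v]:
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def farthest_point_sampling(adj, num_seeds, start):
    """Greedily pick seeds, each one farthest from the seeds chosen so far."""
    seeds = [start]
    seed_dist = {start: dijkstra(adj, start)}
    nearest = dict(seed_dist[start])  # distance to the closest seed so far
    while len(seeds) < num_seeds:
        nxt = max(adj, key=lambda v: nearest[v])
        seeds.append(nxt)
        seed_dist[nxt] = dijkstra(adj, nxt)
        for v in adj:
            nearest[v] = min(nearest[v], seed_dist[nxt][v])
    return seeds, seed_dist


def voronoi_patches(adj, seeds, seed_dist):
    """Assign each vertex to its nearest seed, giving non-overlapping patches."""
    patches = {s: [] for s in seeds}
    for v in adj:
        owner = min(seeds, key=lambda s: seed_dist[s][v])
        patches[owner].append(v)
    return patches


def top_k(patch, scores, k):
    """Keep the k highest-scoring vertices, ordered by descending score."""
    return sorted(patch, key=lambda v: scores[v], reverse=True)[:k]


# Toy "mesh": a path of 7 vertices with unit edge lengths.
n = 7
adj = {i: [] for i in range(n)}
for i in range(n - 1):
    adj[i].append((i + 1, 1.0))
    adj[i + 1].append((i, 1.0))

seeds, seed_dist = farthest_point_sampling(adj, num_seeds=2, start=0)
patches = voronoi_patches(adj, seeds, seed_dist)

# Placeholder scores; in the paper these come from self-attention graph pooling.
scores = {0: 0.1, 1: 0.9, 2: 0.5, 3: 0.7, 4: 0.2, 5: 0.8, 6: 0.4}
tokens = {s: top_k(p, scores, k=3) for s, p in patches.items()}
```

In this toy run the two patches have different sizes (four and three vertices), which mirrors the inconsistency the abstract mentions; selecting the top `k` vertices per patch yields equally sized token lists that a transformer encoder can consume.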