NGD-Transformer: Navigation Geodesic Distance Positional Encoding with Self-Attention Pooling for Graph Transformer on 3D Triangle Mesh

Following the significant success of the transformer in NLP and computer vision, this paper attempts to extend it to 3D triangle mesh. The aim is to determine the shape’s global representation using the transformer and capture the inherent manifold information. To this end, this paper proposes a nov...


Bibliographic Details
Main Authors: Jiafu Zhuang, Xiaofeng Liu, Wei Zhuang
Format: Article
Language: English
Published: MDPI AG 2022-10-01
Series: Symmetry
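Subjects: graph transformer; 3d triangle mesh; positional encoding; 3d shape segmentation; 3d shape classification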
Online Access: https://www.mdpi.com/2073-8994/14/10/2050
author Jiafu Zhuang
Xiaofeng Liu
Wei Zhuang
collection DOAJ
description Following the significant success of the transformer in NLP and computer vision, this paper extends it to 3D triangle meshes. The aim is to learn a global representation of a shape with the transformer while capturing the mesh's inherent manifold information. To this end, the paper proposes a novel learning framework named Navigation Geodesic Distance Transformer (NGD-Transformer) for 3D meshes. Specifically, the approach combines farthest point sampling with the Voronoi segmentation algorithm to generate uniform, non-overlapping manifold patches. Because the number of vertices differs from patch to patch, self-attention graph pooling is employed to rank the vertices on each patch and select the most representative nodes, which are then reordered by score to generate tokens and their raw feature embeddings. To better exploit the manifold properties of the mesh, the paper further proposes a novel positional encoding called navigation geodesic distance positional encoding (NGD-PE), which encodes the geodesic distance between vertices in a relative and spatially symmetric manner. The raw feature embeddings and positional encodings are then summed to form the input embeddings fed to the graph transformer encoder, which determines the global representation of the shape. Experiments on several datasets show the excellent performance of the proposed method.
first_indexed 2024-03-09T19:26:45Z
format Article
id doaj.art-5a84b694a65f40a1af44bf6009455f2a
institution Directory Open Access Journal
issn 2073-8994
language English
last_indexed 2024-03-09T19:26:45Z
publishDate 2022-10-01
publisher MDPI AG
record_format Article
series Symmetry
spelling doaj.art-5a84b694a65f40a1af44bf6009455f2a
2023-11-24T02:51:24Z
eng
MDPI AG
Symmetry
2073-8994
2022-10-01
14
10
2050
10.3390/sym14102050
NGD-Transformer: Navigation Geodesic Distance Positional Encoding with Self-Attention Pooling for Graph Transformer on 3D Triangle Mesh
Jiafu Zhuang (School of Physics and Information Engineering, Quanzhou Normal University, Quanzhou 362000, China)
Xiaofeng Liu (School of Physics and Information Engineering, Quanzhou Normal University, Quanzhou 362000, China)
Wei Zhuang (Department of Social Statistics, University of Manchester, Oxford Rd., Manchester M13 9PL, UK)
https://www.mdpi.com/2073-8994/14/10/2050
graph transformer; 3d triangle mesh; positional encoding; 3d shape segmentation; 3d shape classification
title NGD-Transformer: Navigation Geodesic Distance Positional Encoding with Self-Attention Pooling for Graph Transformer on 3D Triangle Mesh
topic graph transformer
3d triangle mesh
positional encoding
3d shape segmentation
3d shape classification
url https://www.mdpi.com/2073-8994/14/10/2050
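
Illustrative note on the patch-generation step: the description says the method combines farthest point sampling with Voronoi segmentation to obtain non-overlapping patches. The sketch below shows one common way such a step can be written. It is not the authors' implementation. It assumes that geodesic distance is approximated by shortest paths along mesh edges (Dijkstra), that patches are assigned per vertex, and that ties go to the earlier seed; the function names build_adjacency, dijkstra, and fps_voronoi are hypothetical.

```python
import heapq
import math


def build_adjacency(vertices, faces):
    """Weighted vertex graph of a triangle mesh; edge weight = Euclidean edge length."""
    adj = {i: {} for i in range(len(vertices))}
    for a, b, c in faces:
        for u, v in ((a, b), (b, c), (c, a)):
            w = math.dist(vertices[u], vertices[v])
            adj[u][v] = w
            adj[v][u] = w
    return adj


def dijkstra(adj, source):
    """Shortest edge-path distance from source to every vertex.

    Assumption: this edge-path length stands in for true geodesic distance.
    """
    dist = {v: math.inf for v in adj}
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in adj[u].items():
            nd = d + w
            if nd < dist[v]:
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def fps_voronoi(vertices, faces, num_patches, start=0):
    """Pick seeds by farthest point sampling, then label each vertex by nearest seed."""
    adj = build_adjacency(vertices, faces)
    seeds = [start]
    seed_dists = [dijkstra(adj, start)]
    # nearest[v] = distance from v to the closest seed chosen so far
    nearest = dict(seed_dists[0])
    while len(seeds) < num_patches:
        nxt = max(nearest, key=nearest.get)  # vertex farthest from all seeds
        seeds.append(nxt)
        d = dijkstra(adj, nxt)
        seed_dists.append(d)
        for v in nearest:
            nearest[v] = min(nearest[v], d[v])
    # Voronoi assignment: every vertex gets exactly one patch label,
    # so the patches cannot overlap (ties go to the earlier seed).
    labels = {
        v: min(range(len(seeds)), key=lambda k: seed_dists[k][v])
        for v in adj
    }
    return seeds, labels


if __name__ == "__main__":
    # Unit square split into two triangles.
    verts = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (0.0, 1.0, 0.0)]
    tris = [(0, 1, 2), (0, 2, 3)]
    print(fps_voronoi(verts, tris, num_patches=2))
```

Because each vertex receives a single label, patch sizes generally differ, which matches the description's remark that the vertex number of the patches is inconsistent and motivates the pooling step that follows.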