Re-Introducing BN Into Transformers for Vision Tasks
In recent years, Transformer-based models have exhibited significant advancements over previous models in natural language processing and vision tasks. This powerful methodology has also been extended to the 3D point cloud domain, where it can mitigate the inherent difficulties posed by the irregula...
Main Authors: | Xue-Song Tang, Xian-Lin Xie |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE 2023-01-01 |
Series: | IEEE Access |
Subjects: | Point clouds, normalization, transformers, 3D feature extraction |
Online Access: | https://ieeexplore.ieee.org/document/10145769/ |
_version_ | 1827924852921597952 |
---|---|
author | Xue-Song Tang, Xian-Lin Xie |
author_facet | Xue-Song Tang, Xian-Lin Xie |
author_sort | Xue-Song Tang |
collection | DOAJ |
description | In recent years, Transformer-based models have exhibited significant advancements over previous models in natural language processing and vision tasks. This powerful methodology has also been extended to the 3D point cloud domain, where it can mitigate the inherent difficulties posed by the irregular and disorderly nature of the point clouds. However, the attention mechanism within the Transformer presents challenges for utilizing Batch Normalization (BN), as statistical information cannot be extracted efficiently from the data set. Thus, this study proposes a novel residual structure, ResBN, which can effectively handle 3D data. Additionally, to replace BN in the transformer for 2D image processing, we introduce the Patch Normalization (PN) technique. ResBN and PN are evaluated on 3D point cloud and 2D image datasets respectively through statistical experiments, demonstrating their efficacy in enhancing classification performance. |
first_indexed | 2024-03-13T05:16:45Z |
format | Article |
id | doaj.art-ff72bdaedd7841f7ae35256bc49d432d |
institution | Directory Open Access Journal |
issn | 2169-3536 |
language | English |
last_indexed | 2024-03-13T05:16:45Z |
publishDate | 2023-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Access |
spelling | doaj.art-ff72bdaedd7841f7ae35256bc49d432d; 2023-06-15T23:00:49Z; eng; IEEE; IEEE Access; 2169-3536; 2023-01-01; vol. 11; pp. 58462-58469; 10.1109/ACCESS.2023.3283612; 10145769; Re-Introducing BN Into Transformers for Vision Tasks; Xue-Song Tang (https://orcid.org/0000-0002-7594-2241), Xian-Lin Xie; College of Information Science and Technology, Donghua University, Shanghai, China (both authors); https://ieeexplore.ieee.org/document/10145769/; Point clouds, normalization, transformers, 3D feature extraction |
spellingShingle | Xue-Song Tang, Xian-Lin Xie; Re-Introducing BN Into Transformers for Vision Tasks; IEEE Access; Point clouds, normalization, transformers, 3D feature extraction |
title | Re-Introducing BN Into Transformers for Vision Tasks |
title_full | Re-Introducing BN Into Transformers for Vision Tasks |
title_fullStr | Re-Introducing BN Into Transformers for Vision Tasks |
title_full_unstemmed | Re-Introducing BN Into Transformers for Vision Tasks |
title_short | Re-Introducing BN Into Transformers for Vision Tasks |
title_sort | re introducing bn into transformers for vision tasks |
topic | Point clouds, normalization, transformers, 3D feature extraction |
url | https://ieeexplore.ieee.org/document/10145769/ |
work_keys_str_mv | AT xuesongtang reintroducingbnintotransformersforvisiontasks AT xianlinxie reintroducingbnintotransformersforvisiontasks |