Summary: | The self-attention mechanism excels at capturing long-range dependencies in data. To enable self-attention in point cloud tasks to attend to both local and global contexts, we design a separable self-attention mechanism for point clouds that decomposes the construction of the attention map into two steps: Intra-patch Attention and Inter-patch Attention. The former computes the attention map over the tokens corresponding to the points within each local patch of the point cloud, mining fine-grained local semantic relationships, while the latter constructs the attention map among all patches, mining long-range interaction information. The two attention mechanisms operate in parallel, capturing fine-grained local patterns while accounting for the global scene. Equipped with the Intra-patch Attention and Inter-patch Attention modules, we construct a hierarchical end-to-end point cloud analysis architecture called Separable Transformer, and exhaustive experiments demonstrate that its performance is highly competitive with state-of-the-art methods.
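To make the decomposition concrete, the following is a minimal sketch of the two parallel attention branches described above, written with standard PyTorch multi-head attention. The tensor shapes, the max-pooling of each patch into a single token, and the additive fusion of the two branches are assumptions for illustration only and are not taken from the paper.

```python
import torch
import torch.nn as nn


class SeparablePointAttention(nn.Module):
    """Illustrative sketch: parallel intra-patch and inter-patch attention."""

    def __init__(self, dim: int, num_heads: int = 4):
        super().__init__()
        # Intra-patch branch: attention among the K points inside each patch.
        self.intra_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        # Inter-patch branch: attention among the P patch-level tokens.
        self.inter_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, P, K, C) point features already grouped into P local patches.
        B, P, K, C = x.shape

        # Intra-patch attention: fold patches into the batch dimension so each
        # patch attends only over its own K points (fine-grained local context).
        local = x.reshape(B * P, K, C)
        local, _ = self.intra_attn(local, local, local)
        local = local.reshape(B, P, K, C)

        # Inter-patch attention: pool each patch to one token (assumed: max-pool),
        # then attend across all P patch tokens (long-range context).
        tokens = x.max(dim=2).values            # (B, P, C)
        tokens, _ = self.inter_attn(tokens, tokens, tokens)

        # Fuse the two parallel branches (assumed: broadcast the global patch
        # tokens back to every point in their patch and add).
        return local + tokens.unsqueeze(2)


if __name__ == "__main__":
    feats = torch.randn(2, 32, 16, 64)          # 2 clouds, 32 patches, 16 points, 64-dim
    out = SeparablePointAttention(dim=64)(feats)
    print(out.shape)                            # torch.Size([2, 32, 16, 64])
```

In this sketch, both branches see the same input and their outputs are summed, reflecting the parallel design stated in the summary; the actual Separable Transformer may use different grouping, pooling, and fusion choices.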