TorchSparse++: Efficient Training and Inference Framework for Sparse Convolution on GPUs
Sparse convolution plays a pivotal role in emerging workloads, including point cloud processing in AR/VR, autonomous driving, and graph understanding in recommendation systems. Since the computation pattern is sparse and irregular, specialized high-performance kernels are required. Existing GPU libr...
Main Authors: | Tang, Haotian; Yang, Shang; Liu, Zhijian; Hong, Ke; Yu, Zhongming; Li, Xiuyu; Dai, Guohao; Wang, Yu; Han, Song |
---|---|
Format: | Article |
Language: | English |
Published: | ACM, 56th Annual IEEE/ACM International Symposium on Microarchitecture, 2024 |
Online Access: | https://hdl.handle.net/1721.1/153260 |
Similar Items
- OpSparse: A Highly Optimized Framework for Sparse General Matrix Multiplication on GPUs
  by: Zhaoyang Du, et al.
  Published: (2022-01-01)
- SparseRT: Accelerating Unstructured Sparsity on GPUs for Deep Learning Inference
  by: Wang, Ziheng
  Published: (2022)
- Communication Optimization of Iterative Sparse Matrix-Vector Multiply on GPUs and FPGAs
  by: Rafique, Abid, et al.
  Published: (2015)
- Leveraging Memory Copy Overlap for Efficient Sparse Matrix-Vector Multiplication on GPUs
  by: Guangsen Zeng, et al.
  Published: (2023-08-01)
- Searching Efficient 3D Architectures with Sparse Point-Voxel Convolution
  by: Tang, Haotian, et al.
  Published: (2022)