Benchmarking GPU Tensor Cores on General Matrix Multiplication Kernels through CUTLASS

GPUs are widely used to accelerate big data analytics, scientific computing and machine intelligence. In particular, matrix multiplication and convolution are two principal operations that account for a large proportion of steps in modern data analysis and deep neural networks. These performance-criti...

Bibliographic Details
Main Authors: Xuanteng Huang, Xianwei Zhang, Panfei Yang, Nong Xiao
Format: Article
Language: English
Published: MDPI AG 2023-12-01
Series: Applied Sciences
Online Access: https://www.mdpi.com/2076-3417/13/24/13022