PERFORMANCE ENHANCEMENT OF CUDA APPLICATIONS BY OVERLAPPING DATA TRANSFER AND KERNEL EXECUTION
The CPU-GPU combination is a widely used heterogeneous computing system in which the CPU and GPU have different address spaces. Since the GPU cannot directly access CPU memory, the input data must be available in GPU memory before the GPU function is invoked. On completion of the GPU function, t...
Main Authors: | , |
---|---|
Format: | Article |
Language: | English |
Published: | Polish Association for Knowledge Promotion, 2021-09-01 |
Series: | Applied Computer Science |
Subjects: | |
Online Access: | http://www.acs.pollub.pl/pdf/v17n3/1.pdf |
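
The summary above describes copying input data to GPU memory before a kernel runs, and the title refers to overlapping those transfers with kernel execution. The following is a minimal sketch of one common way to do this with CUDA streams, not necessarily the method used in the article. The kernel `scale`, the `CHECK` macro, the array size, and the number of streams are all hypothetical choices made for illustration. The sketch assumes the array length divides evenly by the number of streams and that the device supports concurrent copy and execution.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: report a CUDA error and exit from main.
#define CHECK(call)                                                   \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            fprintf(stderr, "CUDA error: %s\n",                       \
                    cudaGetErrorString(err_));                        \
            return 1;                                                 \
        }                                                             \
    } while (0)

// Hypothetical kernel for illustration: doubles each element in place.
__global__ void scale(float *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= 2.0f;
}

int main() {
    const int N = 1 << 20;           // total elements (assumed value)
    const int NSTREAMS = 4;          // number of streams (assumed value)
    const int CHUNK = N / NSTREAMS;  // elements handled per stream
    const size_t chunkBytes = CHUNK * sizeof(float);

    float *h = nullptr, *d = nullptr;
    // Pinned (page-locked) host memory lets cudaMemcpyAsync run
    // asynchronously with respect to the host and other streams.
    CHECK(cudaMallocHost((void **)&h, N * sizeof(float)));
    CHECK(cudaMalloc((void **)&d, N * sizeof(float)));
    for (int i = 0; i < N; ++i) h[i] = 1.0f;

    cudaStream_t streams[NSTREAMS];
    for (int k = 0; k < NSTREAMS; ++k) CHECK(cudaStreamCreate(&streams[k]));

    // Each stream copies its chunk in, processes it, and copies it back.
    // Operations within one stream run in order; operations in different
    // streams may overlap, so one chunk's transfer can proceed while
    // another chunk's kernel is executing.
    for (int k = 0; k < NSTREAMS; ++k) {
        int off = k * CHUNK;
        CHECK(cudaMemcpyAsync(d + off, h + off, chunkBytes,
                              cudaMemcpyHostToDevice, streams[k]));
        scale<<<(CHUNK + 255) / 256, 256, 0, streams[k]>>>(d + off, CHUNK);
        CHECK(cudaGetLastError());
        CHECK(cudaMemcpyAsync(h + off, d + off, chunkBytes,
                              cudaMemcpyDeviceToHost, streams[k]));
    }
    CHECK(cudaDeviceSynchronize());  // wait for all streams to finish

    // Verify on the host: every element should have been doubled.
    int mismatches = 0;
    for (int i = 0; i < N; ++i)
        if (h[i] != 2.0f) ++mismatches;
    printf("mismatches: %d\n", mismatches);

    for (int k = 0; k < NSTREAMS; ++k) CHECK(cudaStreamDestroy(streams[k]));
    CHECK(cudaFree(d));
    CHECK(cudaFreeHost(h));
    return mismatches == 0 ? 0 : 1;
}
```

Compiled with `nvcc`, the sketch can be compared against a single-stream version (one copy in, one kernel, one copy out) under a profiler to see whether transfers and kernels actually overlap on a given device. How much is gained depends on the hardware and on the ratio of transfer time to compute time.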