A Checkpoint/Restart Scheme for CUDA Programs with Complex Computation States
Checkpoint/restart has been an effective mechanism to achieve fault tolerance for many long-running scientific applications. The common approach is to save computation states in memory and secondary storage for execution resumption. However, as the GPU plays a much bigger role in high performance co...
Main Authors: | Hai Jiang, Yulu Zhang, Jeff Jennes, Kuan-Ching Li |
---|---|
Format: | Article |
Language: | English |
Published: | Springer, 2013-11-01 |
Series: | International Journal of Networked and Distributed Computing (IJNDC) |
Subjects: | |
Online Access: | https://www.atlantis-press.com/article/9665.pdf |
Similar Items
- GSGP-CUDA — A CUDA framework for Geometric Semantic Genetic Programming
  by: Leonardo Trujillo, et al.
  Published: (2022-06-01)
- The Design and Implementation of an Improved Lightweight BLASTP on CUDA GPU
  by: Xue Sun, et al.
  Published: (2021-12-01)
- Optimizing Raytracing Algorithm Using CUDA
  by: Sayed Ahmadreza Razian, et al.
  Published: (2017-11-01)
- Basic concepts of CUDA technology
  by: Andrey Maksimovich Kazennov
  Published: (2010-09-01)
- Analysis of Fast Fourier Transformations algorithm for CUDA Architecture
  by: Beatričė Andziulienė, et al.
  Published: (2012-12-01)