2021 IEEE 27th International Symposium on On-Line Testing and Robust System Design (IOLTS), 2021
DOI: 10.1109/iolts52814.2021.9486693

FPGA Checkpointing for Scientific Computing

Abstract: The use of FPGAs in computational workloads is becoming increasingly popular due to the flexibility of these devices in comparison to ASICs, and their low power consumption compared to GPUs and CPUs. However, scientific applications run for long periods of time and the hardware is always subject to failures due to either soft or hard errors. Thus, it is important to protect these long-running jobs with fault tolerance mechanisms. Checkpoint-Restart is a popular technique in high-performance computing that allo…

Cited by 2 publications (1 citation statement)
References: 18 publications
“…FPGA support provided within OmpSs 21 delivers fast checkpointing on the FPGA via on‐chip memory and high‐level kernel code annotations. However, their work leverages this mechanism only for fault tolerance and does not consider or support preemptive scheduling.…”
Section: Background and Related Work (mentioning)
confidence: 99%