2018
DOI: 10.1007/978-3-030-10549-5_62

FINJ: A Fault Injection Tool for HPC Systems

Abstract: We present FINJ, a high-level fault injection tool for High-Performance Computing (HPC) systems, with a focus on the management of complex experiments. FINJ provides support for custom workloads and allows generation of anomalous conditions through the use of fault-triggering executable programs. FINJ can also be integrated seamlessly with most other lower-level fault injection tools, allowing users to create and monitor a variety of highly-complex and diverse fault conditions in HPC systems that would be diff…

Cited by 10 publications (20 citation statements)
References 15 publications
“…Fault Programs. All the fault programs used to reproduce anomalous conditions on Antarex are available at the FINJ Github repository [15]. As in [17], each program can also operate in a low-intensity mode, thus doubling the number of possible fault conditions.…”
Section: Features Of the Dataset (mentioning)
confidence: 99%
“…This number does not include the metrics collected by the procinterrupts plugin, which were found to be irrelevant after preliminary testing. All the scripts used to process the data are available on the FINJ Github repository [15].…”
Section: Creation Of Features (mentioning)
confidence: 99%
“…-FINJ Fault injection is a tool for high performance computing [14] in Python that is implemented as an oriented object, with high level programming language, that is used on many operating systems majors.…”
Section: Swifi Tools (mentioning)
confidence: 99%
“…To perform variations of HPC systems and observe their behavior, Netti et al have present FINJ and implemented in Python in a high-level fault injection tool for fault injection and monitoring in the systems [14]. FINJ has been designed implemented to an open source of easy use in Python tool at HPC systems.…”
Section: Analysis Of Literature (mentioning)
confidence: 99%