2023
DOI: 10.21203/rs.3.rs-2749012/v1
Preprint

A Scalable, Distributed Monitoring Framework for HPC Clusters Using Redfish-Nagios Integration

Abstract: Current monitoring tools for high-performance computing (HPC) systems are often inefficient in terms of scalability and interfacing with modern data center management APIs. This inefficiency leads to a lack of effective management of the infrastructure of modern data centers. Nagios is one of the widely used industry-standard tools for data center infrastructure monitoring, which mainly includes monitoring of nodes and associated hardware and software components. However, current Nagios monitoring has special …

Citations: Cited by 1 publication (1 citation statement)
References: References 20 publications
“…Additionally, the relevant scientific literature includes articles that report technical contributions concerning the effective problem of network and infrastructure monitoring. As an example, Ali et al [59] proposed Redfish-Nagios, a scalable out-of-band monitoring tool for modern high-performance computing (HPC) systems. More precisely, the described approach conforms to the Redfish telemetry model [60], which allows for a more efficient monitoring of next-generation scalable HPC systems.…”
Section: Network Monitoring
Citation type: mentioning
confidence: 99%