2022
DOI: 10.1016/j.micpro.2022.104586

LIMITLESS — LIght-weight MonItoring Tool for LargE Scale Systems

Cited by 6 publications (4 citation statements)
References 18 publications
“…A prominent example, currently deployed on production systems at LRZ and extended in the projects DEEP-SEA [37] and RE-GALE [38], is the Data Center Data Base (DCDB) [39], [40], which is capable of routinely tracking millions of sensors on large scale production systems, such as SuperMUC-NG, using technologies from the IoT space combined with a federation of time series databases built on top of Cassandra. Similarly, the ADMIRE project [41] is building an entirely new measurement infrastructure relying on the Prometheus time-series database (TSDB) connected to a node-level aggregating push gateway coupled with LIMITLESS [42] for node-level monitoring and high-speed spatial reduction based on a tree-based overlay network (TBON). Although such comprehensive monitoring capabilities are not yet commonly used in all parallel systems, previous experiments have demonstrated the usefulness of malleability at the application level.…”
Section: B Monitoring and Modeling (mentioning)
confidence: 99%
“…In addition, this period can be modified online without the necessity of restarting the system or the monitor and it is possible to have one different for each node. The monitor overhead in the compute nodes is low (less than 1% on CPU consumption and a memory footprint of 3890 KB in resident memory) (Cascajo et al., 2021, 2022), which means that the monitoring has a minimal impact on the applications that are being executed in the same compute nodes.…”
Section: System Monitor (mentioning)
confidence: 99%
“…The second alternative was implemented in Cascajo et al. (2021) and uses the generated application models to predict the future performance of the applications in order to support the decision-making process related to the application scheduling. This alternative does not produce good results if the accuracy of the predictor is poor.…”
Section: Building Synthetic Micro-benchmarks: Application Clones (mentioning)
confidence: 99%