2018
DOI: 10.1007/978-3-319-77935-5_22
On the Timed Analysis of Big-Data Applications

Abstract: Apache Spark is one of the best-known frameworks for executing big-data batch applications over a cluster of (virtual) machines. Defining the cluster (i.e., the number of machines and CPUs) to attain guarantees on the execution times (deadlines) of the application is indeed a trade-off between the cost of the infrastructure and the time needed to execute the application. Sizing the computational resources, in order to prevent cost overruns, can benefit from the use of formal models as a means to capture the ex…
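To make the cost-versus-deadline trade-off described in the abstract concrete, the following is a minimal back-of-envelope sketch in Python, not the paper's formal (temporal-logic) model: it assumes a Spark job described only by per-stage task counts and average task durations, assumes tasks run in waves over a fixed number of cores, and checks the resulting estimate against a deadline. All stage names, counts, and times below are illustrative assumptions.

# Back-of-envelope deadline check for a Spark-like job (illustrative only,
# not the formal model analysed in the paper).
import math

def stage_time(num_tasks: int, avg_task_time: float, cores: int) -> float:
    # Tasks in a stage run in waves of at most `cores` tasks at a time.
    waves = math.ceil(num_tasks / cores)
    return waves * avg_task_time

def job_time(stages, cores: int) -> float:
    # Assume stages execute sequentially along the DAG's critical path.
    return sum(stage_time(n, t, cores) for n, t in stages)

# Hypothetical job: (number of tasks, average task time in seconds) per stage.
stages = [(200, 1.5), (64, 4.0), (16, 2.0)]
deadline = 120.0  # seconds

for cores in (8, 16, 32, 64):
    est = job_time(stages, cores)
    verdict = "meets" if est <= deadline else "misses"
    print(f"{cores:>3} cores -> estimated {est:6.1f}s ({verdict} the {deadline:.0f}s deadline)")

Running the sketch with these made-up numbers shows how adding cores shortens the estimated execution time, which is the kind of sizing question the paper addresses with a formal, verifiable model rather than a rough estimate.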

Cited by 5 publications (3 citation statements); references 15 publications.
“…The work [16] presents a formal model for Spark applications based on temporal logic. The model takes into account the DAG that forms the program, information about the execution environment, such as the number of CPU cores available, the number of tasks of the program and the average execution time of the tasks.…”
Section: Related Work (mentioning)
confidence: 99%
“…Based on this specification, necessary and sufficient conditions are extracted to verify whether the outputs of aggregations in a Spark program are deterministic. The work [37] presents a formal model for Spark applications based on temporal logic. The model considers the DAG that forms the program, information about the execution environment, such as the number of CPU cores available, the number of program tasks, and the average execution time of the tasks.…”
mentioning
confidence: 99%
“…Then, the model is used to check time constraints and make predictions about the program's execution time. Both works ([11] and [37]) aim to evaluate Spark programs for specific properties. The abstraction level of our model is higher than that of the work of Marconi et al., so our model is not suited, as it is, to evaluate cluster behavior.…”
mentioning
confidence: 99%