2019
DOI: 10.1007/978-3-030-34356-9_20

Towards High Performance Data Analytics for Climate Change

Abstract: The continuous increase in the data produced by simulations, experiments and edge components in the last few years has forced a shift in the scientific research process, leading to the definition of a fourth paradigm in Science, concerning data-intensive computing. This data deluge, in fact, introduces various challenges related to big data volumes, formats heterogeneity and the speed in the data production and gathering that must be handled to effectively support scientific discovery. To this end, High Perfor…

Citations: cited by 3 publications (5 citation statements)
References: 22 publications
“…Consequently, given the proposed runtime system, it is best to rely on a higher number of threads rather than on MPI processes, that should only be exploited to scale over multiple nodes when it comes to larger scale scenarios. Overall, the proposed HPDA runtime system and deployment mechanisms have proven to scale effectively over a large number of threads and nodes in a supercomputing environment, overcoming by one order of magnitude the scalability limits that affected previous releases [92]. Moreover, they gave us better insight into the framework behaviour alongside its new runtime system, and also helped us identify aspects that need to be further improved and optimized in the future, thus bringing important feedback to the software roadmap.…”
Section: F Discussion; type: mentioning
confidence: 93%
“…The undertaken tests focus on the execution of single operators in order to get a better understanding of the runtime system behaviour at the level of intra-task execution (i.e., the fragment-level parallelism). The benchmark moves beyond the scalability limits of a few hundreds of cores that have already been assessed in previous work [90], [92] with former versions of the framework. Moreover, initial experiments targeting inter-task behaviour at the level of the workflow (already performed on former versions of the framework as reported in [93]) will be carried out in future work after the full characterization of the framework scalability at the level of single operator.…”
Section: Experimental Evaluation and Results; type: mentioning
confidence: 99%
“…The cube objects store the metadata and a reference towards the actual data managed by the server-side components. Moreover, the class implements a set of methods for materializing these virtual datacubes into (client-side) Python objects containing the whole data (according to the Ophidia data model [18]). The cube class also provides methods for translating these objects into common scientific Python formats, such as Xarray Datasets [19] or Pandas Dataframes [20].…”
Section: Software Architecture; type: mentioning
confidence: 99%
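
As a rough illustration of the materialization step this statement describes, the sketch below flattens a small in-memory cube into long-format records, the row layout that a tabular structure such as a Pandas DataFrame would hold. `ToyCube` and `to_records` are hypothetical names chosen for illustration; they are not the framework's API, and the example uses only the Python standard library.

```python
from dataclasses import dataclass
from itertools import product


@dataclass
class ToyCube:
    """Hypothetical stand-in for a materialized datacube (not the real API)."""
    dims: dict      # dimension name -> list of coordinate values
    measure: str    # name of the measured variable
    values: list    # flat list of values, row-major over the dimensions

    def to_records(self):
        """Return one dict per cell: its coordinates plus the measured value."""
        names = list(self.dims)
        # product() enumerates coordinates in row-major order, matching `values`.
        coords = product(*(self.dims[name] for name in names))
        return [
            {**dict(zip(names, coord)), self.measure: value}
            for coord, value in zip(coords, self.values)
        ]


# Assumed toy data: 2 latitudes x 2 time steps of a variable called "tas".
cube = ToyCube(
    dims={"lat": [10.0, 20.0], "time": ["2000-01", "2000-02"]},
    measure="tas",
    values=[280.1, 281.4, 290.3, 291.0],
)
records = cube.to_records()
```

A list of records in this shape could then be handed to a tabular library; that conversion is left out so the sketch stays dependency-free.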
“…This abstraction is particularly suited for scientific data as these are inherently multi-dimensional. The Ophidia data model (widely discussed in [18]) implements the datacube abstraction and strategies to split and partition data horizontally into fragments across multiple computing nodes on the server-side. Towards enabling scalable analytics, the server-side components provide parallel access and processing to the distributed fragments composing a datacube.…”
Section: Data Distribution and Execution Parallelism; type: mentioning
confidence: 99%
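
The horizontal partitioning described in this statement can be pictured with a minimal sketch: rows of a small array are split into contiguous fragments, and an operator is applied to each fragment by a pool of threads. The function names (`split_into_fragments`, `reduce_fragment`, `parallel_reduce`) and the max-over-time operator are assumptions for illustration, not the framework's implementation, and the example runs in a single process rather than across computing nodes.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_fragments(rows, n_fragments):
    """Split rows horizontally into at most n_fragments contiguous chunks."""
    if not rows:
        return []
    size = -(-len(rows) // n_fragments)  # ceiling division
    return [rows[i:i + size] for i in range(0, len(rows), size)]


def reduce_fragment(fragment):
    """Apply a per-row operator (here: max along the row) to one fragment."""
    return [max(series) for series in fragment]


def parallel_reduce(rows, n_fragments=2, n_threads=2):
    """Process fragments concurrently and reassemble results in row order."""
    fragments = split_into_fragments(rows, n_fragments)
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        # pool.map preserves the order of the input fragments.
        partial = list(pool.map(reduce_fragment, fragments))
    return [value for part in partial for value in part]


# Assumed toy data: 5 rows (e.g. grid points), each a short time series.
rows = [[1, 5, 3], [4, 2, 6], [7, 0, 1], [2, 2, 9], [8, 3, 4]]
result = parallel_reduce(rows, n_fragments=2, n_threads=2)
```

In CPython, threads do not run pure-Python computation in parallel because of the global interpreter lock, so this only shows the structure of fragment-level processing, not a performance pattern.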