Towards High Performance Data Analytics for Climate Change

Fiore, Sandro; Elia, Donatello; Palazzo, Cosimo; Antonio, Fabrizio; D'Anca, Alessandro; Foster, Ian; Aloisio, Giovanni

doi:10.1007/978-3-030-34356-9_20

Cited by 3 publications

(5 citation statements)

References 22 publications

“…Consequently, given the proposed runtime system, it is best to rely on a higher number of threads rather than on MPI processes, that should only be exploited to scale over multiple nodes when it comes to larger scale scenarios. Overall, the proposed HPDA runtime system and deployment mechanisms have proven to scale effectively over a large number of threads and nodes in a supercomputing environment, overcoming by one order of magnitude the scalability limits that affected previous releases [92]. Moreover, they gave us better insight into the framework behaviour alongside its new runtime system, and also helped us identify aspects that need to be further improved and optimized in the future, thus bringing important feedback to the software roadmap.…”

Section: F Discussion (mentioning)

confidence: 93%

“…The undertaken tests focus on the execution of single operators in order to get a better understanding of the runtime system behaviour at the level of intra-task execution (i.e., the fragment-level parallelism). The benchmark moves beyond the scalability limits of a few hundreds of cores that have already been assessed in previous work [90], [92] with former versions of the framework. Moreover, initial experiments targeting inter-task behaviour at the level of the workflow (already performed on former versions of the framework as reported in [93]) will be carried out in future work after the full characterization of the framework scalability at the level of single operator.…”

Section: Experimental Evaluation and Results (mentioning)

confidence: 99%

“…Each fragment is composed of a set of multi-dimensional binary arrays following a data store implementation based on a NoSQL approach. A more detailed and rigorous description of the storage model is provided in [92].…”

Section: Thus (mentioning)

confidence: 99%

Towards HPC and Big Data Analytics Convergence: Design and Experimental Evaluation of a HPDA Framework for eScience at Scale

2021

Self Cite

Over the last two decades, scientific discovery has increasingly been driven by the large availability of data from a multitude of sources, including high-resolution simulations, observations and instruments, as well as an enormous network of sensors and edge components. In such a dynamic and growing landscape where data continue to expand, advances in Science have become intertwined with the capacity of analysis tools to effectively handle and extract valuable information from this ocean of data. In view of the exascale era of supercomputers that is rapidly approaching, it is of the utmost importance to design novel solutions that can take full advantage of the upcoming computing infrastructures. The convergence of High Performance Computing (HPC) and data-intensive analytics is key to delivering scalable High Performance Data Analytics (HPDA) solutions for scientific and engineering applications. The aim of this paper is threefold: reviewing some of the most relevant challenges towards HPDA at scale, presenting a HPDA-enabled version of the Ophidia framework, and validating the scalability of the proposed framework through an experimental performance evaluation carried out in the context of the Centre of Excellence in Simulation of Weather and Climate in Europe (ESiWACE). The experimental results show that the proposed solution is capable of scaling over several thousand cores and hundreds of cluster nodes. The proposed work is a contribution in support of scientific large-scale applications along the wider convergence path of HPC and Big Data followed by the scientific research community.

“…The cube objects store the metadata and a reference towards the actual data managed by the server-side components. Moreover, the class implements a set of methods for materializing these virtual datacubes into (client-side) Python objects containing the whole data (according to the Ophidia data model [18]). The cube class also provides methods for translating these objects into common scientific Python formats, such as Xarray Datasets [19] or Pandas Dataframes [20].…”

Section: Software Architecture (mentioning)

confidence: 99%

“…This abstraction is particularly suited for scientific data as these are inherently multi-dimensional. The Ophidia data model (widely discussed in [18]) implements the datacube abstraction and strategies to split and partition data horizontally into fragments across multiple computing nodes on the server-side. Towards enabling scalable analytics, the server-side components provide parallel access and processing to the distributed fragments composing a datacube.…”

Section: Data Distribution and Execution Parallelism (mentioning)

confidence: 99%
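
The horizontal partitioning described in the statement above can be illustrated with a minimal sketch. This is not the Ophidia API or its storage model: the function names (`split_into_fragments`, `reduce_fragment`, `parallel_reduce`), the list-of-rows cube layout, and the use of a local thread pool in place of distributed server-side nodes are all assumptions made for illustration only.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List

# Hypothetical in-memory stand-ins (not Ophidia types):
Row = List[float]      # one array of values, e.g. a time series for one grid point
Fragment = List[Row]   # a horizontal slice of the cube: a contiguous block of rows


def split_into_fragments(cube: List[Row], n_fragments: int) -> List[Fragment]:
    """Partition the rows of `cube` into at most `n_fragments` contiguous blocks."""
    if n_fragments < 1:
        raise ValueError("n_fragments must be >= 1")
    size, extra = divmod(len(cube), n_fragments)
    fragments: List[Fragment] = []
    start = 0
    for i in range(n_fragments):
        # The first `extra` fragments take one additional row each.
        stop = start + size + (1 if i < extra else 0)
        if stop > start:  # skip empty fragments when there are more fragments than rows
            fragments.append(cube[start:stop])
        start = stop
    return fragments


def reduce_fragment(fragment: Fragment) -> List[float]:
    """Apply a row-wise maximum to every row of a single fragment."""
    return [max(row) for row in fragment]


def parallel_reduce(cube: List[Row], n_fragments: int, n_workers: int = 4) -> List[float]:
    """Process fragments concurrently and concatenate the results in row order."""
    fragments = split_into_fragments(cube, n_fragments)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # Executor.map yields results in the same order as the input fragments.
        partial_results = list(pool.map(reduce_fragment, fragments))
    return [value for part in partial_results for value in part]


if __name__ == "__main__":
    cube = [[1.0, 5.0, 2.0], [3.0, 0.0, 4.0], [9.0, 9.0, 1.0], [2.0, 2.0, 2.0], [7.0, 8.0, 6.0]]
    print(parallel_reduce(cube, n_fragments=2))
```

Because each fragment is processed independently, the same pattern extends to fragments held on different nodes; the thread pool here only stands in for that distribution and says nothing about how the framework itself schedules work.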