Proceedings of the 2011 ACM International Workshop on Many Task Computing on Grids and Supercomputers 2011
DOI: 10.1145/2132876.2132882

Parallel high-resolution climate data analysis using Swift

Abstract: Advances in software parallelism and high-performance systems have resulted in an order of magnitude increase in the volume of output data produced by the Community Earth System Model (CESM). As the volume of data produced by CESM increases, the single-threaded script-based software packages traditionally used to post-process model output data have become a bottleneck in the analysis process. This paper presents a parallel version of the CESM atmosphere model data analysis workflow implemented using the Swift …

Cited by 9 publications (4 citation statements)
References 10 publications

“…Each package requires different types of climatologies and plot types which creates unique performance characteristics for each of the packages. While previous efforts have enabled parallelism in the diagnostic packages (Woitaszek et al, 2011; Jacob et al, 2012), this approach resulted in poor performance for multiple file operations, and it had a steep learning curve for users. In order to create the climatology files in parallel and to reduce the expensive disk I/O operations, we developed the tool PyAverager (Paul et al, 2015; Mickelson et al, 2018).…”
Section: Diagnostics (mentioning)
confidence: 99%
“…package requires different types of climatologies and plot types which creates unique performance characteristics for each of the packages. While previous efforts have enabled parallelism in the workflow (Woitaszek et al, November 2011; Jacob et al, 2012), this work presented its own set of issues. Specifically, this approach resulted in poor performance for multiple file operations, and it had a steep learning curve for users.…”
Section: Diagnostics (mentioning)
confidence: 99%
“…Woitaszek et al [14] gained performance by parallelizing a post-processing workflow for climate data using the Swift scripting language, with parallelism only scaling to 32 processes. On the other hand, our work is focused primarily on I/O performance scaling to thousands of processes.…”
Section: Related Work (mentioning)
confidence: 99%
“…Also, each individual file will be spatially split into more partitions. This will increase both the ns and pf terms in Equation 14. Thus using spatial parallelism will result in worse file access patterns as the number of processes grows.…”
Section: Analysis Of Models (mentioning)
confidence: 99%