Proceedings of the 28th International Conference on Advances in Geographic Information Systems 2020
DOI: 10.1145/3397536.3422346

STARE-based Integrative Analysis of Diverse Data Using Dask Parallel Programming (Demo Paper)

Abstract: Scaling up volume and variety in Big Earth Science Data is particularly difficult when combining low-level, ungridded data, such as swath observations obtained with, for example, Moderate Resolution Imaging Spectroradiometers (MODIS). A unified way to index and combine data with different geo-spatiotemporal layouts and incomparable native array formatting is required for scalable integrative analyses based on data at its full instrument resolution, that is, without extra interpolation (or extrapolation) onto a…
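To make the pattern described in the abstract concrete, here is a minimal, hypothetical sketch (not the authors' implementation): each swath granule is mapped to a STARE-like spatiotemporal index so that observations with different native layouts can be aligned by index cell, and the per-granule work is expressed as a Dask task graph. The helper names load_granule and stare_index are placeholders invented for this illustration (a real deployment would use the instrument readers and a real STARE encoding such as pystare); only the Dask and NumPy calls are taken as given.

```python
import numpy as np
import dask
from dask import delayed

def load_granule(path):
    # Placeholder: in practice this would read a MODIS swath granule
    # (lat, lon, value) at full instrument resolution.
    rng = np.random.default_rng(abs(hash(path)) % 2**32)
    lat = rng.uniform(-90, 90, 10_000)
    lon = rng.uniform(-180, 180, 10_000)
    val = rng.normal(size=10_000)
    return lat, lon, val

def stare_index(lat, lon, level=10):
    # Placeholder for a real STARE encoding: bin lat/lon into a coarse
    # integer cell ID so that the example runs end to end.
    nlat = ((lat + 90) / 180 * (1 << level)).astype(np.int64)
    nlon = ((lon + 180) / 360 * (1 << level)).astype(np.int64)
    return (nlat << level) | nlon

@delayed
def index_and_aggregate(path):
    # Index one granule and compute the mean observation per index cell.
    lat, lon, val = load_granule(path)
    sid = stare_index(lat, lon)
    order = np.argsort(sid)
    sid, val = sid[order], val[order]
    cells, starts = np.unique(sid, return_index=True)
    sums = np.add.reduceat(val, starts)
    counts = np.diff(np.append(starts, val.size))
    return dict(zip(cells.tolist(), (sums / counts).tolist()))

@delayed
def merge(partials):
    # Combine per-granule means by averaging overlapping cells (illustrative only).
    out = {}
    for part in partials:
        for cell, mean in part.items():
            out.setdefault(cell, []).append(mean)
    return {cell: sum(v) / len(v) for cell, v in out.items()}

granules = [f"granule_{i:03d}.hdf" for i in range(8)]  # hypothetical paths
result = merge([index_and_aggregate(g) for g in granules])
cell_means = dask.compute(result)[0]  # executes the task graph in parallel
```

Because each granule is indexed and reduced independently, the task graph fans out across granules and only small per-cell summaries are merged, which is the kind of layout that lets an analysis stay at full instrument resolution without regridding.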

Cited by 9 publications (4 citation statements) | References 6 publications
“…Examples of using Dask to distribute computations can be found in Earth and Climate sciences [13,14]. Also in those fields datasets contain many years of data and can reach sizes of multiple TBs, with non-trivial multidimensional schemas similar to what can be found in HEP.…”
Section: Related Work
confidence: 99%
“…Also in those fields datasets contain many years of data and can reach sizes of multiple TBs, with non-trivial multidimensional schemas similar to what can be found in HEP. In one of the cited works, data processing through Dask allows parallelising part of the data analysis workflow, the scalability results are shown up to 32 cores with a speedup value of around 9 in the best cases [13]. The same community made an effort to develop an ecosystem for distributed data analysis, recognising that previous approaches led to fragmentation and unproductivity [15].…”
Section: Related Work
confidence: 99%
“…Around 2015, the Dask project was started, giving the possibility to parallelise the existing Python data science software packages in a familiar way for users [76]. Examples of using Dask to distribute computations can be found in Earth and Climate sciences [108,109]. Also in those fields datasets contain many years of data and can reach sizes of multiple TBs, with nontrivial multidimensional schemas similar to what can be found in HEP.…”
Section: State of the Art
confidence: 99%
“…Also in those fields datasets contain many years of data and can reach sizes of multiple TBs, with nontrivial multidimensional schemas similar to what can be found in HEP. In one of the cited works, data processing through Dask allows parallelising part of the data analysis workflow, the scalability results are shown up to 32 cores with a speedup value of around 9 in the best cases [108]. The same community made an effort to develop an ecosystem for distributed data analysis, recognising that previous approaches led to fragmentation and unproductivity [110].…”
Section: State of the Art
confidence: 99%
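The quoted statements report scaling of the Dask-based workflow up to 32 cores with a speedup of roughly 9. As a rough illustration of how such a strong-scaling measurement is commonly set up with Dask (a generic harness, not the benchmark from the cited work), one can time the same task graph on local clusters of increasing size and report speedup relative to a single worker; build_workload below is a placeholder for any workload, e.g. the per-granule aggregation sketched above.

```python
import time
import dask
from dask.distributed import Client, LocalCluster

def build_workload(n_tasks=64):
    @dask.delayed
    def busy(i):
        # CPU-bound placeholder task.
        return sum(k * k for k in range(200_000))
    return [busy(i) for i in range(n_tasks)]

def time_run(n_workers):
    # Time the same workload on a cluster with the given number of workers.
    with LocalCluster(n_workers=n_workers, threads_per_worker=1) as cluster:
        with Client(cluster):
            tasks = build_workload()
            t0 = time.perf_counter()
            dask.compute(*tasks)
            return time.perf_counter() - t0

if __name__ == "__main__":
    baseline = time_run(1)
    for n in (2, 4, 8):
        elapsed = time_run(n)
        print(f"{n:2d} workers: {elapsed:6.2f}s  speedup {baseline / elapsed:4.1f}x")
```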