2018 IEEE International Parallel and Distributed Processing Symposium (IPDPS)
DOI: 10.1109/ipdps.2018.00015

Cataloging the Visible Universe Through Bayesian Inference at Petascale

Abstract: Astronomical catalogs derived from wide-field imaging surveys are an important tool for understanding the Universe. We construct an astronomical catalog from 55 TB of imaging data using Celeste, a Bayesian variational inference code written entirely in the high-productivity programming language Julia. Using over 1.3 million threads on 650,000 Intel Xeon Phi cores of the Cori Phase II supercomputer, Celeste achieves a peak rate of 1.54 DP PFLOP/s. Celeste is able to jointly optimize parameters for 188M stars an…
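The method behind the paper is Bayesian variational inference implemented in Julia. As a rough illustration of that technique only — the toy model, data, and hyperparameters below are hypothetical and are not Celeste's — here is ELBO gradient ascent for a one-parameter Gaussian model:

```julia
# Minimal sketch of Gaussian variational inference in Julia; the model,
# data, and learning rate are illustrative, not Celeste's.
#
# Model: y_i ~ Normal(mu, sigma^2), prior mu ~ Normal(0, tau^2).
# Variational family: q(mu) = Normal(m, s^2). We run gradient ascent on
# (m, log s) using the closed-form ELBO gradients for this model.
using Random

function fit_vi(y; sigma=1.0, tau=5.0, lr=5e-4, iters=20_000)
    n = length(y)
    m, logs = 0.0, 0.0                   # variational mean, log std dev
    for _ in 1:iters
        s2 = exp(2logs)
        grad_m    = sum(y .- m) / sigma^2 - m / tau^2
        grad_logs = 1 - s2 * (n / sigma^2 + 1 / tau^2)
        m    += lr * grad_m
        logs += lr * grad_logs
    end
    return m, exp(logs)
end

Random.seed!(1)
y = 2.5 .+ randn(1_000)                  # synthetic data, true mu = 2.5
m, s = fit_vi(y)
println("q(mu) ≈ Normal($m, $s)")        # m ≈ mean(y), s ≈ sigma / sqrt(n)
```

For this conjugate model the optimum of q matches the exact posterior, which makes the sketch easy to verify; Celeste's actual model is vastly larger and non-conjugate.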

Cited by 12 publications (8 citation statements) · References 14 publications
Order By: Relevance
“…In this case, Julia codes can be as performant as other codes written using multiple libraries and languages: the typical case uses Python for most of the code and some optimized library (Numba, Fortran codes wrapped using f2py) for the most performance-critical routines. As an application of this use case we mention the Celeste project, which was able to load and process 178 TB of data from the SDSS catalogue in 14.6 minutes across 8192 nodes (Regier et al. 2018).
• Existing codes are monolithic and difficult to use interactively, and the expense of rewriting code in Julia can be rewarded by the possibility to run the code interactively, either in Julia's command line or in Jupyter notebooks.…”
Section: Discussion
confidence: 99%
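The pattern this citing paper describes — a Python driver plus a compiled kernel via Numba or f2py — is exactly what Julia lets a project skip: the hot loop can stay in the same language. A hedged sketch with a made-up kernel, not code from Celeste:

```julia
# Hypothetical performance-critical routine written directly in Julia
# instead of being pushed out to Numba or f2py-wrapped Fortran.
# @inbounds removes bounds checks; @simd lets the compiler vectorize.
function weighted_ssr(x::Vector{Float64}, y::Vector{Float64}, w::Vector{Float64})
    acc = 0.0
    @inbounds @simd for i in eachindex(x, y, w)
        r = x[i] - y[i]
        acc += w[i] * r * r
    end
    return acc
end

x, y, w = rand(10^6), rand(10^6), rand(10^6)
weighted_ssr(x, y, w)   # first call JIT-compiles; later calls run at native speed
```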
“…The runtime supports distributed parallel as well as multi-threaded execution. It has been demonstrated to perform at peta-scale on a high-performance computing platform [17], and it has strong support for scientific machine learning [18,19,20,21]. The language implementation is open source, available under the MIT license.…”
Section: The Julia Language
confidence: 99%
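Both execution models this quote mentions live in Julia's standard library: `Threads` for shared memory and `Distributed` for multi-process runs. A minimal sketch of each (the workloads themselves are hypothetical):

```julia
using Distributed

# Shared-memory parallelism: iterations are split across the threads the
# process was started with (e.g. `julia -t 8`).
function threaded_square!(out::Vector{Float64}, xs::Vector{Float64})
    Threads.@threads for i in eachindex(xs, out)
        out[i] = xs[i]^2
    end
    return out
end

xs = rand(1_000)
threaded_square!(similar(xs), xs)

# Distributed-memory parallelism: `pmap` farms tasks out to worker
# processes (added via `addprocs(4)` or by launching `julia -p 4`).
pmap(x -> x^2, 1:100)
```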
“…Loop parallelization is the de-facto standard method for performing shared-memory data-parallel computation. Parallel computing frameworks such as OpenMP [1] have enabled the acceleration of advances in many scientific and engineering fields such as astronomical physics [2], climate analytics [3], and machine learning [4]. A major challenge in enabling efficient loop parallelization is to deal with the inherent imbalance in workloads [5].…”
Section: Introduction
confidence: 99%
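The workload imbalance this last quote raises has a standard remedy in Julia: dynamic task scheduling, the analogue of OpenMP's `schedule(dynamic)`. A sketch that spawns one task per small chunk so idle threads pick up remaining work (the cost function and chunk size are invented for illustration):

```julia
# Iteration cost varies with i, so a static split across threads would
# leave some threads idle. One task per small chunk lets Julia's
# scheduler keep all threads busy.
function uneven_work(i)
    acc = 0.0
    for k in 1:(i % 97) * 10_000    # cost depends on i: deliberate imbalance
        acc += sin(k)
    end
    return acc
end

function dynamic_sum(n; chunk=32)
    tasks = [Threads.@spawn begin
                 s = 0.0
                 for i in lo:min(lo + chunk - 1, n)
                     s += uneven_work(i)
                 end
                 s
             end for lo in 1:chunk:n]
    return sum(fetch, tasks)
end

println(dynamic_sum(10_000))
```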