Pritish Jetley scite author profile

ChaNGa is an N-body cosmology simulation application implemented using Charm++. In this paper, we present the parallel design of ChaNGa and address many challenges arising due to the high dynamic ranges of clustered datasets. We propose optimizations based on adaptive techniques. We evaluate the performance of ChaNGa on highly clustered datasets: a z ∼ 0 snapshot of a 2 billion particle realization of a 25 Mpc volume, and a 52 million particle multi-resolution realization of a dwarf galaxy. For the 25 Mpc volume, we show strong scaling on up to 128K cores of Blue Waters. We also demonstrate scaling up to 128K cores of a multi-stepping run of the 2 billion particle simulation. While the scaling of the multi-stepping run is not as good as single stepping, the throughput at 128K cores is greater by a factor of 2. We also demonstrate strong scaling on up to 512K cores of Blue Waters for two large, uniform datasets with 12 and 24 billion particles.

Massively parallel cosmological simulations with ChaNGa

Jetley, Gioachin, Mendes, et al. 2008

Cosmological simulators are an important component in the study of the formation of galaxies and large scale structures, and can help answer many important questions about the universe. Despite their utility, existing parallel simulators do not scale effectively on modern machines containing thousands of processors. In this paper we present ChaNGa, a recently released production simulator based on the Charm++ infrastructure. To achieve scalable performance, ChaNGa employs various optimizations that maximize the overlap between computation and communication. We present experimental results of ChaNGa simulations on machines with thousands of processors, including the IBM Blue Gene/L and the Cray XT3. The paper goes on to highlight efforts toward even more efficient and scalable cosmological simulations. In particular, novel load balancing schemes that base decisions on certain characteristics of tree-based particle codes are discussed. Further, the multistepping capabilities of ChaNGa are presented, as are solutions to the load imbalance that such multiphase simulations face. We outline key requirements for an effective practical implementation and conclude by discussing preliminary results from simulations run with our multiphase load balancer.

Scaling Hierarchical N-body Simulations on GPU Clusters

Jetley, Wesolowski, Gioachin, et al. 2010

This paper focuses on the use of GPGPU-based clusters for hierarchical N-body simulations. Whereas the behavior of these hierarchical methods has been studied in the past on CPU-based architectures, we investigate key performance issues in the context of clusters of GPUs. These include kernel organization and efficiency, the balance between tree traversal and force computation work, grain size selection through the tuning of offloaded work request sizes, and the reduction of sequential bottlenecks. The effects of various application parameters are studied and experiments done to quantify gains in performance. Our studies are carried out in the context of a production-quality parallel cosmological simulator called ChaNGa. We highlight the re-engineering of the application to make it more suitable for GPU-based environments. Finally, we present performance results from experiments on the NCSA Lincoln GPU cluster, including a note on GPU use in multistepped simulations.

An Adaptive Framework for Large-Scale State Space Search

Sun, Zheng, Jetley, et al. 2011

State space search problems abound in the artificial intelligence, planning and optimization literature. Solving such problems is generally NP-hard. Therefore, a brute-force approach to state space search must be employed. It is instructive to solve them on large parallel machines with significant computational power. However, writing efficient and scalable parallel programs has traditionally been a challenging undertaking. In this paper, we analyze several performance characteristics common to all parallel state space search applications. In particular, we focus on the issues of grain size, the prioritized execution of tasks and the balancing of load among processors in the system. We demonstrate the techniques that are used to scale such applications to large processor counts. We have incorporated these techniques into a general search engine framework that is designed to solve a broad class of state space search problems. We demonstrate the efficiency and scalability of our design using three example applications, and present scaling results up to 16,384 processors.

Architectural Constraints to Attain 1 Exaflop/s for Three Scientific Application Classes

Bhatelé, Jetley, Gahvari, et al. 2011


Adaptive techniques for clustered N-body cosmological simulations