Specialized computational chemistry packages have permanently reshaped the landscape of chemical and materials science by providing tools to support and guide experimental efforts and for the prediction of atomistic and electronic properties. In this regard, electronic structure packages have played a special role by using first-principle-driven methodologies to model complex chemical and materials processes. Over the past few decades, the rapid development of computing technologies and the tremendous increase in computational power have offered a unique chance to study complex transformations using sophisticated and predictive many-body techniques that describe correlated behavior of electrons in molecular and condensed phase systems at different levels of theory. In enabling these simulations, novel parallel algorithms have been able to take advantage of computational resources to address the polynomial scaling of electronic structure methods. In this paper, we briefly review the NWChem computational chemistry suite, including its history, design principles, parallel tools, current capabilities, outreach, and outlook.
The polyhedral model provides powerful abstractions to optimize loop nests with regular accesses. Affine transformations in this model capture a complex sequence of execution-reordering loop transformations that can improve performance by parallelization as well as locality enhancement. Although a significant body of research has addressed affine scheduling and partitioning, the problem of automatically finding good affine transforms for communication-optimized coarse-grained parallelization together with locality optimization for the general case of arbitrarily-nested loop sequences remains a challenging problem. We propose an automatic transformation framework to optimize arbitrarily-nested loop sequences with affine dependences for parallelism and locality simultaneously. The approach finds good tiling hyperplanes by embedding a powerful and versatile cost function into an Integer Linear Programming formulation. These tiling hyperplanes are used for communication-minimized coarse-grained parallelization as well as for locality optimization. The approach enables the minimization of inter-tile communication volume in the processor space, and minimization of reuse distances for local execution at each node. Programs requiring one-dimensional versus multi-dimensional time schedules (with scheduling-based approaches) are all handled with the same algorithm. Synchronization-free parallelism, permutable loops or pipelined parallelism at various levels can be detected. Preliminary studies of the framework show promising results.
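As a small illustration of the tiling idea mentioned above, the following sketch compares an untiled matrix multiplication loop nest with a hand-tiled version. It assumes simple rectangular tiles chosen by hand; it does not perform the ILP-based search for tiling hyperplanes that the abstract describes, and the function names and the `tile` parameter are hypothetical.

```python
def matmul_naive(a, b):
    """Reference triple loop: c[i][j] += a[i][k] * b[k][j]."""
    n, m, p = len(a), len(b), len(b[0])
    c = [[0] * p for _ in range(n)]
    for i in range(n):
        for j in range(p):
            for k in range(m):
                c[i][j] += a[i][k] * b[k][j]
    return c


def matmul_tiled(a, b, tile=2):
    """Same computation, iterated tile by tile (rectangular tiles, assumed size)."""
    n, m, p = len(a), len(b), len(b[0])
    c = [[0] * p for _ in range(n)]
    # Outer loops walk over tile origins; inner loops stay inside one tile,
    # so each tile reuses a small block of a, b, and c before moving on.
    for ii in range(0, n, tile):
        for jj in range(0, p, tile):
            for kk in range(0, m, tile):
                for i in range(ii, min(ii + tile, n)):
                    for j in range(jj, min(jj + tile, p)):
                        for k in range(kk, min(kk + tile, m)):
                            c[i][j] += a[i][k] * b[k][j]
    return c


if __name__ == "__main__":
    a = [[1, 2, 3], [4, 5, 6]]
    b = [[1, 0], [0, 1], [2, 2]]
    print(matmul_tiled(a, b) == matmul_naive(a, b))
```

Tiling only reorders the iterations, so both functions compute the same result; the reordering is legal here because all three loops of matrix multiplication are permutable.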
Layered metal dichalcogenide materials are a family of semiconductors with a wide range of energy band gaps and properties, and the potential to create exciting new physics and technology applications. However, obtaining high crystal quality thin films over a large area remains a challenge. Here we show that chemical vapor deposition (CVD) can be used to achieve large area electronic grade single crystal Molybdenum Disulfide (MoS₂) thin films with the highest mobility reported in CVD grown films so far. Growth temperature and choice of substrate were found to critically impact the quality of film grown, and high temperature growth on (0001) orientated sapphire yielded highly oriented single crystal MoS₂ films for the first time. Films grown under optimal conditions were found to be of high structural quality from high-resolution X-ray diffraction, transmission electron microscopy, and Raman measurements, approaching the quality of reference geological MoS₂. Photoluminescence and electrical measurements confirmed the growth of optically active MoS₂ with a low background carrier concentration, and high mobility. The CVD method reported here for the growth of high quality MoS₂ thin films paves the way towards growth of a variety of layered 2D chalcogenide semiconductors and their heterostructures.
Irregular and dynamic parallel applications pose significant challenges to achieving scalable performance on large-scale multicore clusters. These applications often require ongoing, dynamic load balancing in order to maintain efficiency. Scalable dynamic load balancing on large clusters is a challenging problem which can be addressed with distributed dynamic load balancing systems. Work stealing is a popular approach to distributed dynamic load balancing; however its performance on large-scale clusters is not well understood. Prior work on work stealing has largely focused on shared memory machines. In this work we investigate the design and scalability of work stealing on modern distributed memory systems. We demonstrate high efficiency and low overhead when scaling to 8,192 processors for three benchmark codes: a producer-consumer benchmark, the unbalanced tree search benchmark, and a multiresolution analysis kernel.
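The work-stealing policy discussed above can be sketched with a sequential, single-process simulation. This is a minimal sketch under stated assumptions, not the distributed-memory implementation the abstract evaluates: the function `run_work_stealing`, the `expand` callback, and the toy task tree are all hypothetical, and for simplicity an idle worker picks a victim only among workers that currently hold work.

```python
import random
from collections import deque


def run_work_stealing(num_workers, root_tasks, expand, seed=0):
    """Simulate work stealing round by round.

    Each worker owns a deque. The owner pops the newest task from the tail;
    an idle worker (thief) steals the oldest task from the head of a victim.
    Returns (tasks processed per worker, number of steals).
    """
    rng = random.Random(seed)
    queues = [deque() for _ in range(num_workers)]
    queues[0].extend(root_tasks)  # all work starts on worker 0
    processed = [0] * num_workers
    steals = 0
    while any(queues):
        for w in range(num_workers):
            if queues[w]:
                task = queues[w].pop()  # owner: LIFO end
            else:
                # Simplification (assumption): only non-empty victims are considered.
                victims = [v for v in range(num_workers) if v != w and queues[v]]
                if not victims:
                    continue
                task = queues[rng.choice(victims)].popleft()  # thief: FIFO end
                steals += 1
            queues[w].extend(expand(task))  # newly spawned tasks stay local
            processed[w] += 1
    return processed, steals


def expand(n):
    """Toy unbalanced tree: node n spawns children 0..n-1 (2**n nodes in total)."""
    return list(range(n))


if __name__ == "__main__":
    processed, steals = run_work_stealing(4, [6], expand)
    print(processed, steals)
```

Stealing from the opposite end of the deque tends to hand thieves older tasks, which in a tree-shaped computation are usually the larger subtrees, so each steal moves more work.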
In this paper we outline the extension of the recently introduced sub-system embedding subalgebras coupled cluster (SES-CC) formalism to the unitary CC formalism. In analogy to the standard single-reference SES-CC formalism, its unitary CC extension allows one to include the dynamical (outside the active space) correlation effects in an SES-induced complete active space (CAS) effective Hamiltonian. In contrast to the standard single-reference SES-CC theory, the unitary CC approach results in a Hermitian form of the effective Hamiltonian. Additionally, for the double unitary CC formalism (DUCC) the corresponding CAS eigenvalue problem provides a rigorous separation of external cluster amplitudes that describe dynamical correlation effects (and are used to define the effective Hamiltonian) from those corresponding to the internal (inside the active space) excitations that define the components of eigenvectors associated with the energy of the entire system. The proposed formalism can be viewed as an efficient way of downfolding the many-electron Hamiltonian to the low-energy model represented by a particular choice of CAS. In principle, this technique can be extended to any type of complete active space representing an arbitrary energy window of a quantum system. The Hermitian character of low-dimensional effective Hamiltonians makes them an ideal target for several types of full configuration interaction (FCI) type eigensolvers. As an example, we also discuss the algebraic form of the perturbative expansions of the effective DUCC Hamiltonians corresponding to composite unitary CC theories and discuss possible algorithms for hybrid classical and quantum computing.
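The structure described above can be written schematically as follows. The notation is an assumption for illustration and is not taken from the abstract itself: $\sigma_{\mathrm{ext}}$ and $\sigma_{\mathrm{int}}$ denote anti-Hermitian external and internal cluster operators, $|\Phi\rangle$ the reference determinant, $P$ the projector onto $|\Phi\rangle$, and $Q_{\mathrm{int}}$ the projector onto excited determinants inside the active space.

```latex
% Double unitary CC ansatz: external and internal excitations are separated
|\Psi\rangle = e^{\sigma_{\mathrm{ext}}}\, e^{\sigma_{\mathrm{int}}}\, |\Phi\rangle ,
\qquad \sigma_{\mathrm{ext}}^{\dagger} = -\sigma_{\mathrm{ext}}, \quad
\sigma_{\mathrm{int}}^{\dagger} = -\sigma_{\mathrm{int}} .

% Effective Hamiltonian in the active space; Hermitian because
% e^{\sigma_ext} is unitary and P + Q_int is a Hermitian projector
H^{\mathrm{eff}} = (P + Q_{\mathrm{int}})\, e^{-\sigma_{\mathrm{ext}}}\, H\, e^{\sigma_{\mathrm{ext}}}\, (P + Q_{\mathrm{int}}) .

% CAS eigenvalue problem whose eigenvalue is the energy of the entire system
H^{\mathrm{eff}}\, e^{\sigma_{\mathrm{int}}} |\Phi\rangle = E\, e^{\sigma_{\mathrm{int}}} |\Phi\rangle .
```

In this form only $\sigma_{\mathrm{ext}}$ enters the definition of $H^{\mathrm{eff}}$, while $\sigma_{\mathrm{int}}$ appears only in the eigenvector, which is the separation of external and internal amplitudes that the abstract refers to.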