Accelerating the discovery of advanced materials is essential for human welfare and sustainable, clean energy. In this paper, we introduce the Materials Project (www.materialsproject.org), a core program of the Materials Genome Initiative that uses high-throughput computing to uncover the properties of all known inorganic materials. This open dataset can be accessed through multiple channels for both interactive exploration and data mining. The Materials Project also seeks to create open-source platforms for developing robust, sophisticated materials analyses. Future efforts will enable users to perform "rapid-prototyping" of new materials in silico, and provide researchers with new avenues for cost-effective, data-driven materials design.
We present the Python Materials Genomics (pymatgen) library, a robust, open-source Python library for materials analysis. A key enabler in high-throughput computational materials science efforts is a robust set of software tools to perform initial setup for the calculations (e.g., generation of structures and necessary input files) and post-calculation analysis to derive useful material properties from raw calculated data. The pymatgen library aims to meet these needs by (1) defining core Python objects for materials data representation, (2) providing a well-tested set of structure and thermodynamic analyses relevant to many applications, and (3) establishing an open platform for researchers to collaboratively develop sophisticated analyses of materials data obtained both from first principles calculations and experiments. The pymatgen library also provides convenient tools to obtain useful materials data via the Materials Project's REpresentational State Transfer (REST) Application Programming Interface (API). As an example, using pymatgen's interface to the Materials Project's REST API and phasediagram package, we demonstrate how the phase and electrochemical stability of a recently synthesized material, Li4SnS4, can be analyzed using a minimum of computing resources. We find that Li4SnS4 is a stable phase in the Li-Sn-S phase diagram (consistent with the fact that it can be synthesized), but the narrow range of lithium chemical potentials for which it is predicted to be stable suggests that it is not intrinsically stable against typical electrodes used in lithium-ion batteries.
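To make the idea of a core object for materials data representation concrete, the following is a minimal standalone sketch of how a flat chemical formula such as Li4SnS4 might be turned into element amounts and atomic fractions. It is not pymatgen's implementation and does not use pymatgen; the names `parse_formula` and `atomic_fractions` are hypothetical, and the sketch assumes formulas without parentheses or charges.

```python
import re

# One element symbol followed by an optional amount, e.g. "Li4" or "Sn".
_TOKEN = re.compile(r"([A-Z][a-z]?)(\d+(?:\.\d+)?)?")


def parse_formula(formula):
    """Map each element symbol in a flat formula to its amount.

    Hypothetical helper for illustration only; parentheses and charges
    are not supported.
    """
    # Anything left after removing all valid tokens is unparseable.
    if not formula or _TOKEN.sub("", formula):
        raise ValueError(f"cannot parse formula: {formula!r}")
    amounts = {}
    for symbol, amount in _TOKEN.findall(formula):
        # A missing amount means one atom of that element.
        amounts[symbol] = amounts.get(symbol, 0.0) + (float(amount) if amount else 1.0)
    return amounts


def atomic_fractions(amounts):
    """Normalize element amounts so that they sum to one."""
    total = sum(amounts.values())
    return {symbol: n / total for symbol, n in amounts.items()}


composition = parse_formula("Li4SnS4")
print(composition)
print(atomic_fractions(composition))
```

Atomic fractions of this kind are the coordinates at which a compound is placed in a compositional phase diagram such as Li-Sn-S.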
Cloud computing has seen tremendous growth, particularly for commercial web applications. The on-demand, pay-as-you-go model creates a flexible and cost-effective means to access compute resources. For these reasons, the scientific computing community has shown increasing interest in exploring cloud computing. However, the underlying implementation and performance of clouds are very different from those at traditional supercomputing centers. It is therefore critical to evaluate the performance of HPC applications in today's cloud environments to understand the tradeoffs inherent in migrating to the cloud. This work represents the most comprehensive evaluation to date comparing conventional HPC platforms to Amazon EC2, using real applications representative of the workload at a typical supercomputing center. Overall results indicate that EC2 is six times slower than a typical mid-range Linux cluster, and twenty times slower than a modern HPC system. The interconnect on the EC2 cloud platform severely limits performance and causes significant variability.
In this paper, we describe the Materials Application Programming Interface (API), a simple, flexible and efficient interface to programmatically query and interact with the Materials Project database based on the REpresentational State Transfer (REST) pattern for the web. Since its creation in Aug 2012, the Materials API has been the Materials Project's de facto platform for data access, supporting not only the Materials Project's many collaborative efforts but also enabling new applications and analyses. We will highlight some of these analyses enabled by the Materials API, particularly those requiring consolidation of data on a large number of materials, such as data mining of structural and property trends, and generation of phase diagrams. We will conclude with a discussion of the role of the API in building a community that is developing novel applications and analyses based on Materials Project data.
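As a rough illustration of what programmatic access through a REST-style interface can look like, the sketch below builds (but does not send) an HTTP GET request for one property of one material. Everything specific here is an assumption made for illustration: the base URL is a placeholder, and the path layout, the `X-API-KEY` header name, and the function `build_property_request` are hypothetical rather than taken from the Materials API documentation.

```python
from urllib.parse import quote
from urllib.request import Request

# Placeholder base URL (assumption); substitute the documented endpoint.
BASE_URL = "https://example.org/rest/v1"


def build_property_request(identifier, prop, api_key):
    """Build a GET request for one property of one material.

    The resource is addressed entirely by its URL, as in the REST pattern.
    The path layout and header name are assumptions, not the real API.
    """
    url = "{}/materials/{}/{}".format(
        BASE_URL, quote(identifier, safe=""), quote(prop, safe="")
    )
    return Request(url, headers={"X-API-KEY": api_key}, method="GET")


request = build_property_request("Li4SnS4", "energy", "YOUR_API_KEY")
print(request.get_method(), request.full_url)
```

Because each resource is identified by a URL, a query over many materials reduces to generating one such request per identifier, which suits the consolidation of data across a large number of materials.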
We thank K. Yelick, G. Karpen, and M. Maxon for valuable discussions, guidance, and support. We thank D. Skinner and the Outreach, Software and Programming Group at NERSC for their ongoing efforts and support to help deliver scientific data and high-performance computing to science communities. We thank R. E. Lance-Rubel for her patience, support, and advice.