Today's computational, experimental, and observational sciences rely on computations that involve many related tasks. The success of a scientific mission often hinges on the computer automation of these workflows. In April 2015, the US Department of Energy (DOE) invited a diverse group of domain and computer scientists from national laboratories supported by the Office of Science and the National Nuclear Security Administration, from industry, and from academia to review the workflow requirements of DOE's science and national security missions, to assess the current state of the art in science workflows, to understand the impact of emerging extreme-scale computing systems on those workflows, and to develop requirements for automated workflow management in future and existing environments. This article summarizes the opinions of over 50 leading researchers who attended this workshop. We highlight use cases, computing systems, and workflow needs, and conclude by summarizing the remaining challenges that this community sees as inhibiting large-scale scientific workflows from becoming a mainstream tool for extreme-scale science.
Mass Spectrometric Imaging (MSI) allows the generation of 2D ion density maps that help visualize molecules present in sections of tissues and cells. The combination of spatial resolution and mass resolution results in very large and complex data sets. New capabilities are necessary for efficient analysis and interpretation of these data. This work details the development and application of the capability to process, visualize, query, and analyze spatial mass spectrometry data. Applications include the generation of 2D maps for selected spectra, the manipulation of the heat maps, and the identification of spectral peaks. Heat maps are generated by projecting the sum of the intensity vs. time spectra of each pixel for a selected m/z value or range. These capabilities take the form of a new interactive software toolkit, MSI QuickView. This software approach is a significant advance over the previous state-of-the-art methods, which required converting the RAW data in one software package, assembling the data manually, and visualizing it in another package.
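The heat-map projection described above can be illustrated with a minimal sketch. It is not the MSI QuickView implementation: it assumes the spectra have already been assembled into a dense array of shape (rows, cols, n_bins) on a shared m/z axis, and the function name `ion_image` and the synthetic data are hypothetical.

```python
import numpy as np


def ion_image(mz, intensities, mz_lo, mz_hi):
    """Sum the intensities inside [mz_lo, mz_hi] for every pixel.

    mz          -- 1D array (n_bins,), m/z axis shared by all pixels (assumption)
    intensities -- 3D array (rows, cols, n_bins), one spectrum per pixel (assumption)
    Returns a 2D array (rows, cols) that can be rendered as a heat map.
    """
    mask = (mz >= mz_lo) & (mz <= mz_hi)        # select the m/z window
    return intensities[:, :, mask].sum(axis=2)  # project onto the image plane


# Tiny synthetic data set: a 2 x 3 pixel image with 4 m/z bins per pixel.
mz = np.array([100.0, 150.0, 200.0, 250.0])
cube = np.arange(2 * 3 * 4, dtype=float).reshape(2, 3, 4)

# Heat map for the m/z range 140-210, which covers the bins at 150 and 200.
heat = ion_image(mz, cube, 140.0, 210.0)
print(heat)
```

Passing the same value for `mz_lo` and `mz_hi` selects a single m/z bin, which matches the "value or range" wording above. A real data set would likely need per-pixel binning or interpolation onto a common m/z axis before this step.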
This paper describes a prototype grid infrastructure, called the "eMinerals minigrid", for molecular simulation scientists, which is based on an integration of shared compute and data resources. We describe the key components, namely the use of Condor pools, Linux/Unix clusters with PBS and IBM's LoadLeveler job handling tools, the use of Globus for security handling, the use of Condor-G tools for wrapping Globus job submit commands, Condor's DAGMan tool for handling workflow, the Storage Resource Broker for handling data, and the CCLRC dataportal and associated tools for both archiving data with metadata and making data available to other workers.