Biomolecular simulations, which were once limited by batch queues or compute capacity, are now limited by data analysis and management. In this paper we introduce a new management system for large biomolecular simulation and computational chemistry data sets. The system can be easily deployed on distributed servers to create a mini-grid at the researcher's site. The system offers not only a simple data deposition mechanism but also a way to register data into the system without moving the data from their original location. Any registered data set can be searched and downloaded using a set of defined metadata for molecular dynamics and quantum mechanics, and visualized through a dynamic Web interface.
The official evaluation of Textractor for the i2b2 medication extraction challenge demonstrated satisfactory performance. This system was among the 10 best performing systems in this challenge.
Background: Few environments have been developed or deployed to widely share biomolecular simulation data or to enable collaborative networks to facilitate data exploration and reuse. As the amount and complexity of data generated by these simulations are dramatically increasing and the methods are being more widely applied, the need for new tools to manage and share these data has become obvious. In this paper we present the results of a process aimed at assessing the needs of the community for data representation standards to guide the implementation of future repositories for biomolecular simulations. Results: We introduce a list of common data elements, inspired by previous work, and updated according to feedback from the community collected through a survey and personal interviews. These data elements integrate the concepts for multiple types of computational methods, including quantum chemistry and molecular dynamics. The identified core data elements were organized into a logical model to guide the design of new databases and application programming interfaces. Finally, a set of dictionaries was implemented to be used via SQL queries or locally via a Java API built upon the Apache Lucene text-search engine. Conclusions: The model and its associated dictionaries provide a simple yet rich representation of the concepts related to biomolecular simulations, which should guide future developments of repositories and more complex terminologies and ontologies. The model remains extensible through the decomposition of virtual experiments into tasks and parameter sets, and via the use of extended attributes. The benefits of a common logical model for biomolecular simulations were illustrated through various use cases, including data storage, indexing, and presentation. All the models and dictionaries introduced in this paper are available for download at http://ibiomes.chpc.utah.edu/mediawiki/index.php/Downloads.
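The decomposition described in the abstract above (a virtual experiment made of tasks, each carrying parameter sets, with free-form extended attributes for anything the core model does not cover) can be pictured with a minimal sketch. The class names, field names, and sample values below are hypothetical illustrations and are not taken from the published logical model or its dictionaries.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ParameterSet:
    """A named group of method parameters (hypothetical structure)."""
    name: str
    values: dict[str, str] = field(default_factory=dict)


@dataclass
class Task:
    """One computational step of an experiment."""
    method: str    # e.g. "molecular dynamics" or "quantum chemistry"
    software: str  # free-text software name, illustrative only
    parameter_sets: list[ParameterSet] = field(default_factory=list)


@dataclass
class Experiment:
    """A virtual experiment decomposed into tasks."""
    title: str
    tasks: list[Task] = field(default_factory=list)
    # Extended attributes hold metadata outside the core data elements.
    extended_attributes: dict[str, str] = field(default_factory=dict)

    def methods(self) -> set[str]:
        """Return the distinct computational methods used by the tasks."""
        return {task.method for task in self.tasks}


def find_by_method(experiments: list[Experiment], method: str) -> list[Experiment]:
    """Return the experiments that contain at least one task using `method`."""
    return [exp for exp in experiments if method in exp.methods()]


# Illustrative data only; no real data set is described here.
experiment = Experiment(
    title="Hypothetical solvated peptide study",
    tasks=[
        Task("quantum chemistry", "ExampleQM",
             [ParameterSet("basis", {"basis_set": "6-31G*"})]),
        Task("molecular dynamics", "ExampleMD",
             [ParameterSet("thermostat", {"temperature_K": "300"})]),
    ],
    extended_attributes={"project_code": "demo-001"},
)

md_experiments = find_by_method([experiment], "molecular dynamics")
```

Keeping method-specific settings in parameter sets, rather than as fixed fields on the experiment, is one way a single record could describe both quantum chemistry and molecular dynamics steps without changing the schema.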
As the amount of data generated by biomolecular simulations dramatically increases, new tools need to be developed to help manage this data at the individual investigator or small research group level. In this paper, we introduce iBIOMES Lite, a lightweight tool for biomolecular simulation data indexing and summarization. The main goal of iBIOMES Lite is to provide a simple interface to summarize computational experiments in a setting where the user might have limited privileges and limited access to IT resources. A command-line interface allows the user to summarize, publish, and search local simulation data sets. Published data sets are accessible via static hypertext markup language (HTML) pages that summarize the simulation protocols and also display data analysis graphically. The publication process is customized via extensible markup language (XML) descriptors while the HTML summary template is customized through extensible stylesheet language (XSL). iBIOMES Lite was tested on different platforms and at several national computing centers using various data sets generated through classical and quantum molecular dynamics, quantum chemistry, and QM/MM. The associated parsers currently support AMBER, GROMACS, Gaussian, and NWChem data set publication. The code is available at .
Biomolecular simulations aim to simulate the structure, dynamics, interactions, and energetics of complex biomolecular systems. With recent advances in hardware, it is now possible to use more complex and accurate models and to reach time scales that are biologically significant. Molecular simulations have become a standard tool for toxicology and pharmacology research, but organizing and sharing data, both within the same organization and among different ones, remains a substantial challenge. In this paper we review our recent work leading to the development of a comprehensive informatics infrastructure to facilitate the organization and exchange of biomolecular simulation data. Our efforts include the design of data models and dictionary tools that allow the standardization of the metadata used to describe biomolecular simulations; the development of a thesaurus and ontology for computational reasoning when searching for biomolecular simulations in distributed environments; and the development of systems based on these models to manage and share the data at a large scale (iBIOMES) and within smaller groups of researchers at laboratory scale (iBIOMES Lite).