The ongoing growth in computing power enables researchers to perform so many simulations that the results can no longer be analyzed with paper and pencil. Simple approaches to processing data, such as ordering the calculations in directories and using a script to create a spreadsheet or a small database, have to be redesigned for every new project. Sharing intermediate data with collaborators can be cumbersome, and publishing on the Internet requires specially tailored infrastructure to be set up.

Due to the diverse and changing landscape of electronic structure codes and methods, there is no unique way of storing, collecting and presenting results. However, there are many partial solutions: VMDF (paper D), a tool to filter and analyze aggregated sets of electronic structure data, is a first step towards user-friendly analysis of data. The Inorganic Crystal Structure Database (ICSD) [1, 2] collects very specific data and makes it accessible through a web interface; AflowLib (Ab-initio Electronic Structure Library) [3] provides access to structure properties of many compounds on the Internet.

What is missing is a system that is Open Source Software, that is generic enough to support different codes and different abstraction levels, and that enables users to analyze their own results and share data with collaborators.

The approach of the Computational Materials Repository (CMR) is to convert data to an internal format that maintains the original variable names without insisting on any semantics. Imported data can be implicitly grouped by user criteria and therefore maintain their natural connection in the database as well. Automatic data analysis is enabled through agents that analyze and group data based on predefined rules.
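The ideas in the preceding paragraph (records that keep their original variable names, grouping by a user criterion, and rule-based agents) can be illustrated with a minimal Python sketch. This is not CMR's actual API or storage format: the record layout, the helper names `group_by` and `tagging_agent`, the code names, and all values are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical imported records. Each one keeps the variable names used by
# the code that produced it; no common semantics are imposed on the keys.
records = [
    {"code": "code_a", "formula": "H2O", "energy": -1.0, "xc": "PBE"},
    {"code": "code_a", "formula": "H2O", "energy": -1.5, "xc": "LDA"},
    {"code": "code_b", "formula": "CO2", "E_total": -2.0},
]


def group_by(recs, key):
    """Group records by the value stored under `key`; skip records lacking it."""
    groups = defaultdict(list)
    for rec in recs:
        if key in rec:
            groups[rec[key]].append(rec)
    return dict(groups)


def tagging_agent(recs, rule, tag):
    """Attach `tag` to every record for which the predefined `rule` holds."""
    for rec in recs:
        if rule(rec):
            rec.setdefault("tags", []).append(tag)
    return recs


# Group by a user-chosen criterion, then let an "agent" tag matching records.
by_formula = group_by(records, "formula")
tagging_agent(records, lambda r: r.get("xc") == "PBE", "pbe-reference")
```

Because the keys are left untouched, a rule has to tolerate records in which a given name is absent, which is why the sketch uses `dict.get` and skips records without the grouping key.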
Small projects can be handled without database software, while bigger projects can use a database to improve performance. CMR enables one to create templates for the collection and analysis of data independently of the electronic structure code, simplifies screenings involving a large number of calculations, allows automatic analysis of data based on taxonomy, tags and keywords, provides the ability to share data with collaborators, and maintains the link from the derived to the original data.

Resumé

The ongoing growth in computing power makes it possible for researchers to perform so many simulations that the results can no longer be analyzed with paper and pencil. Simple approaches to processing data, such as collecting calculations in directories and using a script to generate a spreadsheet or a small database, must be redesigned for every new project. Sharing intermediate data with collaborators can be cumbersome, and publication on the Internet requires specially tailored infrastructure to be set up.

Owing to the diverse and changing landscape of codes and methods for electronic structure calculations, there is no unique way of storing, collecting and presenting results. However, there are many partial solutions: VMDF (paper D) is a tool for filtering and analysis of aggreger...
In this paper we introduce the representative object, which uncovers the inherent schema(s)
tsimmis

1 Overview

In order to access information from a variety of heterogeneous information sources, one has to be able to translate queries and data from one data model into another.

This functionality is provided by so-called (source) wrappers [4,8], which convert queries into one or more commands/queries understandable by the underlying source and transform the native results into a format understood by the application. As part of the TSIMMIS project [1,6] we have developed hard-coded wrappers for a variety of sources (e.g., Sybase DBMS, WWW pages, etc.) including legacy systems (Folio). However, anyone who has built a wrapper before can attest that a lot of effort goes into developing and writing such a wrapper. In situations where it is important or desirable to gain access to new sources quickly, this is a major drawback. Furthermore, we have also observed that only a relatively small part of the code deals with the specific access details of the source. The rest of the code is either common among wrappers or implements query and data transformation that could be expressed in a high-level, declarative fashion.

Based on these observations, we have developed a wrapper implementation toolkit [7] for quickly building wrappers. The toolkit contains a library for commonly used functions, such as for receiving queries from the application and packaging results. It also contains a facility for translating queries into source-specific commands, and for translating results into a model useful to the application.

The philosophy behind our "template-based" translation methodology is as follows. The wrapper implementor specifies a set of templates (rules) written in a high-level declarative language that describe the queries accepted by the wrapper as well as the objects that it returns.
If an application query matches a template, an implementor-provided action associated with the template is executed to provide the native query for the underlying source. When the source returns the result of the query, the wrapper transforms the answer, which is represented in the data model of the source, into a representation that is used by the application. Using this toolkit one can quickly design a simple wrapper with a few templates that cover some of the desired functionality, probably the one that is most urgently needed. However, templates can be added gradually as more functionality is required later on.

Another important use of wrappers is in extending the query capabilities of a source. For instance, some sources may not be capable of answering queries that have multiple predicates. In such cases, it is necessary to pose a native query to such a source using only predicates that the source is capable of handling. The rest of the predicates are automatically separated from the user query and form a filter query.

When the wrapper receives the results, a post-processing engine applies the filter query, ...
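The template-matching and filter-query mechanism described in this excerpt can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the TSIMMIS toolkit or its declarative template language: the `Template` class, `split_query`, `run_wrapper`, the `toy_source` stand-in, and the `BOOKS` data are all hypothetical. Queries are modeled as dictionaries of equality predicates, and the toy source understands only a single `author=<name>` lookup.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Template:
    attributes: frozenset          # predicate attributes the source can evaluate
    action: Callable[[dict], str]  # builds the native query from those predicates


def split_query(query: dict, template: Template):
    """Separate predicates the source can handle from those left for the filter."""
    native = {k: v for k, v in query.items() if k in template.attributes}
    leftover = {k: v for k, v in query.items() if k not in template.attributes}
    return native, leftover


def run_wrapper(query: dict, template: Template, source: Callable[[str], list]):
    """Send the supported part to the source, then filter the returned rows."""
    native, leftover = split_query(query, template)
    if not native:
        raise ValueError("query does not match the template")
    rows = source(template.action(native))
    # Post-processing step: apply the filter query to the source's answer.
    return [r for r in rows if all(r.get(k) == v for k, v in leftover.items())]


# Hypothetical data and a stand-in source that accepts only 'author=<name>'.
BOOKS = [
    {"author": "Smith", "year": 1995, "title": "A"},
    {"author": "Smith", "year": 1996, "title": "B"},
    {"author": "Jones", "year": 1995, "title": "C"},
]


def toy_source(native_query: str):
    field, _, value = native_query.partition("=")
    return [row for row in BOOKS if str(row[field]) == value]


by_author = Template(frozenset({"author"}), lambda p: f"author={p['author']}")
result = run_wrapper({"author": "Smith", "year": 1995}, by_author, toy_source)
```

Here the `year` predicate is one the toy source cannot evaluate, so it is split off and applied to the returned rows afterwards. Supporting a new kind of query would mean adding another `Template` rather than rewriting the wrapper.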