Management of large data repositories integrated into the Grid poses new challenges for Grid research. Several successful Data Grid projects already address the processing of files that store large volumes of scientific data, and other projects develop services for accessing remote relational and XML databases. However, so far no effort has been devoted to On-Line Analytical Processing (OLAP), an essential component of modern decision support systems. In this paper, we present our approach to the design and implementation of a Grid-enabled OLAP server, which is one functional building block of the GridMiner system, a novel infrastructure for knowledge discovery in Grid databases. We present the global architecture model of our solution, describe how the OLAP components are integrated into the GridMiner system, and present the software architecture of our first prototype. The OLAP components were implemented in Java on top of the Globus 3 software toolkit.
This work aims to provide a data management solution, based on the concept of dataspaces, for the large-scale and long-term management of scientific data. Our approach is to semantically enrich the existing relationships among primary and derived data items, and to preserve both the relationships and the data together within a dataspace so that they can be reused by their owners and by others. To enable reuse, data must be well preserved. Preservation of scientific data can best be established if the full life cycle of the data is addressed. This is challenged by the e-Science life cycle ontology, whose major goal is to trace the semantics of procedures in scientific experiments. We present a theoretical dataspace model for e-Science applications, its implementation within a dataspace support platform, and an experimental evaluation in two real-world application domains.
SUMMARY: Case study research excels at bringing us to an understanding of a complex issue or object and can extend experience or add strength to what is already known through previous research. The research summarized in this paper discusses two case studies in the field of portals for collaborative research communities: VectorBase and BGA-Space. VectorBase is, at its core, a scientific database that focuses on search and data mining and offers multiple integrated bioinformatics tools for analyzing and browsing genomic and related data. BGA-Space focuses on capturing semantics from scientists during the processing of scientific experiments and on preserving the full life cycle of scientific data to enable its reuse. The two case studies involve extensive research and the application of theories, concepts, and knowledge commonly discussed in the targeted field.