Keyword search in XML repositories is a powerful tool for interactive data exploration. Much recent work has focused on making XML search aware of the relationship information embedded in XML document structure, but no single approach is a clear winner across all data and query scenarios. Furthermore, because keyword search is inherently imprecise, its results cannot easily be analyzed and summarized to gain further insight into the data. We address these shortcomings with SEDA: a system for Search, Exploration, Discovery, and Analysis of XML Data. SEDA is based on a paradigm of search and user interaction that lets users start with simple keyword-style querying and then perform rich analysis of XML data by leveraging both the content and the structure of the data. SEDA is an interactive system that allows the user to refine her query iteratively to explore the XML data and discover interesting relationships. SEDA first employs a top-k algorithm to quickly compute the k most relevant answers, returning tuples of nodes ranked by relevance. SEDA provides several novel data structures and techniques for efficient top-k computation over graph-structured XML data. SEDA also computes all the contexts in which the query terms are found and all the connection paths that connect the query terms in the XML data. These two summaries enable the user to refine her query by disambiguating the contexts and connections relevant to it. With this user feedback, the system has enough information to compute all query results, not just the top-k. From the complete results, SEDA automatically deduces a star schema, which is then instantiated with the query results and augmented with the additional values required for a well-defined data cube. The tables computed at this step are fed into an OLAP engine for further analysis.
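To make the idea of a context summary concrete, the following is a minimal sketch and not SEDA's actual implementation. It assumes that a "context" can be approximated by the root-to-node tag path of an element whose text contains a query term; the function name `context_summary` and the sample document are hypothetical and chosen only for illustration.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def context_summary(xml_text, keywords):
    """Map each keyword to the distinct root-to-node tag paths whose
    element text contains it (an assumed, simplified notion of context)."""
    root = ET.fromstring(xml_text)
    summary = defaultdict(set)

    def walk(node, path):
        here = path + "/" + node.tag
        text = (node.text or "").lower()
        for kw in keywords:
            # Case-insensitive substring match on the element's own text.
            if kw.lower() in text:
                summary[kw].add(here)
        for child in node:
            walk(child, here)

    walk(root, "")
    return {kw: sorted(paths) for kw, paths in summary.items()}


# Hypothetical sample data: "Smith" appears in two different contexts.
SAMPLE = """
<library>
  <book><title>Data Cubes</title><author>Smith</author></book>
  <article><title>XML Search</title><editor>Smith</editor></article>
</library>
"""

summary = context_summary(SAMPLE, ["Smith", "XML"])
for keyword, paths in summary.items():
    print(keyword, paths)
```

In this toy document the term "Smith" occurs both as a book author and as an article editor, which is the kind of ambiguity a user could resolve by picking one context before the full result set is computed.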