2010
DOI: 10.1145/1806916.1806920

Exploring XML web collections with DescribeX

Abstract: As Web applications mature and evolve, the nature of the semistructured data that drives these applications also changes. An important trend is the need for increased flexibility in the structure of Web documents. Hence, applications cannot rely solely on schemas to provide the complex knowledge needed to visualize, use, query and manage documents. Even when XML Web documents are valid with regard to a schema, the actual structure of such documents may exhibit significant variations across collections for seve…

Citations: cited by 5 publications (7 citation statements)
References: 51 publications
“…While the theoretical complexity of such queries is O(M^k), with M the number of edges and k the size of the most complex query involved, efficient graph query processors, with the help of good indexing, achieve much better performance in practice. [18] considers the summarization of large Web document collections. While the main structure in this case consists of trees, they may also feature reference edges which turn the dataset into a global graph.…”
Section: Quotient Graph Summaries (mentioning)
confidence: 99%
“…Yu and Jagadish [25] consider the problem of summarizing huge XML schemas for presentation to a user, but do not include the content of documents in their summaries. DescribeX [4] allows to create summaries of huge collections of XML documents, focusing on both schema and content. Huang et al [10] compute snippets of an XML document to return as result of a query.…”
Section: Related Work (mentioning)
confidence: 99%
“…A standard example of this model is the RDF data format (http://www.w3.org/RDF) with its SPARQL query language (http://www.w3.org/TR/rdf-sparql-query), where a query can be viewed as a sub-graph pattern that is matched with the knowledge base to produce the results. (Formally, data in RDF consists of triples: subject-predicate-object, but this is mathematically equivalent to the multi-graph model described above.) Structured query languages for knowledge graphs such as SPARQL allow for semantic search and are very expressive, however they are quite complex for unexperienced users and they also assume some prior knowledge about the domain (e.g.…”
Section: Introduction (mentioning)
confidence: 99%
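
The statement above describes a query as a sub-graph pattern matched against a set of subject-predicate-object triples. The following is a minimal sketch of that idea in plain Python, not SPARQL itself and not the cited authors' system. The function name `match_pattern`, the `?`-prefixed variable convention, and the sample triples are all assumptions made for illustration.

```python
def match_pattern(triples, patterns, binding=None):
    """Yield variable bindings under which every pattern matches some triple.

    triples  : list of (subject, predicate, object) string tuples
    patterns : list of 3-tuples; terms starting with "?" are variables
    """
    binding = binding or {}
    if not patterns:
        # All patterns satisfied: report the accumulated bindings.
        yield dict(binding)
        return
    first, rest = patterns[0], patterns[1:]
    for triple in triples:
        candidate = dict(binding)
        matched = True
        for term, value in zip(first, triple):
            if term.startswith("?"):
                # Bind the variable, or check it against an earlier binding.
                if candidate.setdefault(term, value) != value:
                    matched = False
                    break
            elif term != value:
                # Constant term must equal the triple's value exactly.
                matched = False
                break
        if matched:
            yield from match_pattern(triples, rest, candidate)


# Hypothetical sample data, for illustration only.
triples = [
    ("alice", "knows", "bob"),
    ("bob", "knows", "carol"),
    ("alice", "worksAt", "acme"),
]
# Two-edge path pattern: ?x knows ?y, and ?y knows ?z.
patterns = [("?x", "knows", "?y"), ("?y", "knows", "?z")]
results = list(match_pattern(triples, patterns))
```

This sketch scans every triple for every pattern, so it is only suitable for small examples; real query processors rely on indexing to avoid that cost.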
“…Then, for every edge labeled p which goes from s to o in the graph, the summary has an edge labeled p from the summary node corresponding to the equivalence class of s, to the node corresponding to the equivalence class of o. Such summaries, also called quotient summaries, have many good properties, mainly due to the existence of a graph homomorphism from the original graph into its summary. Quotient summaries proposed for general graphs include [21,30,22,23,10,11].…”
Section: Introduction (mentioning)
confidence: 99%
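
The quotient-summary construction quoted above can be sketched in a few lines of Python. This is an illustrative sketch only: the function name `quotient_summary`, the sample graph, and the choice of equivalence relation (grouping nodes by a hand-assigned type) are assumptions, and the cited works use their own equivalence relations.

```python
def quotient_summary(edges, class_of):
    """Build a quotient summary of a labeled graph.

    edges    : iterable of (s, p, o) tuples, an edge labeled p from s to o
    class_of : dict mapping each node to its equivalence class
    Returns the set of summary edges (class of s, p, class of o).
    """
    return {(class_of[s], p, class_of[o]) for s, p, o in edges}


# Hypothetical sample graph and equivalence classes, for illustration only.
edges = [
    ("alice", "knows", "bob"),
    ("carol", "knows", "dave"),
    ("alice", "worksAt", "acme"),
]
class_of = {
    "alice": "Person",
    "bob": "Person",
    "carol": "Person",
    "dave": "Person",
    "acme": "Organization",
}
summary = quotient_summary(edges, class_of)

# Homomorphism check: every original edge maps onto some summary edge.
is_homomorphic = all(
    (class_of[s], p, class_of[o]) in summary for s, p, o in edges
)
```

Here the two `knows` edges collapse into a single summary edge between the `Person` class and itself, which is how a summary can be much smaller than the graph it describes.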