Proceedings of the 1997 ACM SIGMOD International Conference on Management of Data - SIGMOD '97 1997
DOI: 10.1145/253260.253402

The distributed information search component (Disco) and the World Wide Web

Abstract: The Distributed Information Search COmponent (DISCO) is a prototype heterogeneous distributed database that accesses underlying data sources. The DISCO prototype currently focuses on three central research problems in the context of these systems. First, since the capabilities of each data source are different, transforming queries into subqueries on data sources is difficult. We call this problem the weak data source problem. Second, since each data source performs operations in a generally unique way, the cost…

Cited by 38 publications (25 citation statements)
References: 7 publications
“…In its basic motivation, our work is inspired by previous work in the integration of heterogeneous data sources, such as data sources on the Web [Levy et al 1996b; Arens et al 1996; Garcia-Molina et al 1995; Atzeni et al 1997; Tomasic et al 1997; Bayardo et al 1997]. None of these previous systems, however, include a "fuzzy" matching procedure for names; instead they construct global domains using hand-crafted domain-specific normalization schemes, or domain-specific matching algorithms [Fang et al 1994].…”
Section: Related Work (mentioning)
confidence: 99%
“…Integration of distributed, heterogeneous databases, sometimes known as data integration, is an active area of research in the database community [Duschka and Genesereth 1997b; Levy et al 1996b; Arens et al 1996; Garcia-Molina et al 1995; Tomasic et al 1997; Bayardo et al 1997]. Largely inspired by the proliferation of database-like sources on the World Wide Web, previous researchers have addressed a diverse set of problems, ranging from access to "semi-structured" information sources [Suciu 1996; Abiteboul and Vianu 1997; Suciu 1997] to combining databases with differing schemata [Levy et al 1996a; Duschka and Genesereth 1997a].…”
Section: Introduction (mentioning)
confidence: 99%
“…The TSIMMIS project at Stanford addresses the problem of accessing non-standard data, notably semi-structured data, and proposes a flexible mediator-based approach [3,8]. At INRIA, the Distributed Information Search Component (DISCO) has been developed [21,22]. However, all of these prototypes focus on heterogeneous query optimization and flexible data source integration using their proprietary middleware system.…”
Section: Related Work (mentioning)
confidence: 99%
“…Regardless of the number of cost dimensions, a centralized optimizer cannot accurately estimate the costs of operations at many autonomous sites. Garlic [23,40] and other middleware systems [24,46] address this problem by involving site-specific wrappers in the optimization process, but they do not consider the cost of communicating with these wrappers. This cost is not significant in these systems because the wrappers typically reside in the same address space as the optimizer.…”
Section: Decoupling Of Cost Estimation (mentioning)
confidence: 99%
“…The query optimization work goes back as far as the early distributed database systems (R*, SDD-1, Distributed Ingres [22,14,7]), and most recently has been focused on linking data sources of various capabilities and cost models [23,30,46]. However, query optimization in the broad federated environment presents peculiarities that change the trade-offs in the optimization process quite significantly.…”
Section: Introduction (mentioning)
confidence: 99%