Aggregating Web Services with Active Invocation and Ensembles of String Distance Metrics

Johnston, Eddie; Kushmerick, Nicholas

doi:10.1007/978-3-540-30202-5_26

Cited by 2 publications

(2 citation statements)

References 9 publications


“…In unsupervised approach (Kang & Naughton, 2003; Pantel, Philpot, & Hovy, 2005), column matching is based on column-wise similarity scores such as those given by mutual information. Johnston and Kushmerick (2004) express the problem of aggregating data from Web services as a schema-matching problem and introduce the OATS system that uses ensembles of distance metrics to match instance data. In other words, OATS chooses an appropriate distance metric based on whether the field is numeric or string, and can learn the appropriate distance metrics from the data.…”

Section: Figure 8 Results Of Semantically Mapping Input and Output Pa... (mentioning)

confidence: 99%

“…Johnston and Kushmerick [12] express the problem of aggregating data from Web services as a schema matching problem and introduce the OATS system that uses ensembles of distance metrics to match instance data. In other words, OATS chooses an appropriate distance metric based on whether the field is numeric or string, and can learn the appropriate distance metrics from the data.…”

Section: Related Work (mentioning)

confidence: 99%


Semantic Labeling of Online Information Sources

Lerman, Plangprasopchok, Knoblock (2007), International Journal on Semantic Web and Information Systems


Software agents need to combine information from a variety of heterogeneous sources in the course of automating a task, such as planning a trip. To use a source, an agent must have a model of it, i.e., it must understand the semantics of the source's input and output data as well as its functionality. Currently, source modeling is done by the user, but as large numbers of sources come online, it is impractical to expect the user to model them manually. To address this problem, it has been proposed that service providers use common ontologies. However, it appears equally impractical to expect service providers to conform to a standard, as there is very little incentive for them to do so. Instead, we propose to automatically learn the semantics of information sources by labeling the input and output parameters used by the source with semantic types of the user's domain model. We describe two machine learning techniques for semantic labeling: one that uses the source's metadata, such as that contained in a Web Service Definition file, and one that uses the source's content to classify the semantic types it uses. We go beyond previous work by verifying the classifier's predictions by invoking the source with some sample data of the predicted type. We provide performance results for both classification methods and validate our approach on several live Web sources, both Web services and HTML-form-based sources. We also describe an application of the semantic mapping technology within the CALO project.
