2013 Joint IFSA World Congress and NAFIPS Annual Meeting (IFSA/NAFIPS) 2013
DOI: 10.1109/ifsa-nafips.2013.6608598

Coreference detection in XML metadata

Abstract: Preserving data quality is an important issue in data collection management. One of the crucial issues here is the detection of duplicate objects (called coreferent objects), which describe the same entity but in different ways. In this paper we present a method for detecting coreferent objects in metadata, in particular in XML schemas. Our approach consists in comparing the paths from a root element to a given element in the schema. Each path precisely defines the context and location of a specific…
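The path comparison mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract is truncated and does not give their similarity measure, so the example swaps in `difflib.SequenceMatcher` over the sequence of element names purely for illustration. The helper names `element_paths` and `path_similarity` and the two toy schemas are hypothetical.

```python
import xml.etree.ElementTree as ET
from difflib import SequenceMatcher

XS = "{http://www.w3.org/2001/XMLSchema}"

# Two toy schemas (assumed for illustration only).
SCHEMA_A = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="library"><xs:complexType><xs:sequence>
    <xs:element name="book"><xs:complexType><xs:sequence>
      <xs:element name="title" type="xs:string"/>
    </xs:sequence></xs:complexType></xs:element>
  </xs:sequence></xs:complexType></xs:element>
</xs:schema>"""

SCHEMA_B = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="library"><xs:complexType><xs:sequence>
    <xs:element name="item"><xs:complexType><xs:sequence>
      <xs:element name="title" type="xs:string"/>
    </xs:sequence></xs:complexType></xs:element>
  </xs:sequence></xs:complexType></xs:element>
</xs:schema>"""


def element_paths(xsd_text):
    """Collect root-to-element name paths for nested xs:element declarations."""
    root = ET.fromstring(xsd_text)
    paths = []

    def walk(node, prefix):
        for child in node:
            if child.tag == XS + "element" and "name" in child.attrib:
                path = prefix + (child.attrib["name"],)
                paths.append(path)
                walk(child, path)
            else:
                # complexType, sequence, etc. do not add a step to the path.
                walk(child, prefix)

    walk(root, ())
    return paths


def path_similarity(p, q):
    """Illustrative score in [0, 1] based on in-order matching of element names."""
    a = [step.lower() for step in p]
    b = [step.lower() for step in q]
    return SequenceMatcher(None, a, b).ratio()


if __name__ == "__main__":
    for p in element_paths(SCHEMA_A):
        for q in element_paths(SCHEMA_B):
            print("/".join(p), "vs", "/".join(q), round(path_similarity(p, q), 2))
```

A pair of paths with a high score would be a candidate for coreference under this toy measure; a real method would also need to compare the element names themselves rather than require exact matches.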

Cited by 5 publications (6 citation statements)
References 11 publications
“…It allows to increase the quality of the results of the proposed approach. These heuristics are novelties with respect to previous work in [51]. Moreover, additional related work has been added and in consequence the experimental section has been extended by reporting results of a comparative study with well-known approaches: Cupid, COMA 3.0, QMatch, HMAT and a method which is proposed by Lu et al Our goal was to provide a method that was general enough so that it can establish matching of two XML schemas (or the parts/paths of thereof) with limited information available (we assume that only schemas are given).…”
Section: Objective (mentioning)
confidence: 91%
“…The objective of this paper (an extension of [51]) is to propose an automatic, syntactical method for detecting coreferent elements in XML schemas based only on metadata and also, as a next step, a method for detecting coreference of XML schemas. More specifically, the detection of coreferent elements in XML schemas based only on comparison of elements names (tags) and their sequences (paths) is studied in the paper.…”
Section: Objective (mentioning)
confidence: 99%
“…al. [12] gives a method to detect coreferent object in XML metadata. Here coreferent object means duplicate objects of same real world object.…”
Section: Data (Fact and Figures) → Information (Internet) → Patterns (mentioning)
confidence: 99%
“…Two major steps are considered in the data integration process. The first step is known as the schema matching problem which attempts at reconciling structural heterogeneity of data by mapping schema elements across the data sources [1], [2], [3], [4], [5], [6], [7]. The second step resolves semantic heterogeneity of data by mapping data instances across the datasets and is known as the object mapping problem [8], [9], [10], [11], [12], [13], [14], [15], [16].…”
Section: Problem Statement (mentioning)
confidence: 99%
“…For instance, the concept establishment in Figure 1 is the most general concept for others values of attribute category of dataset S and has children corresponding to more specific structures (e.g., lodging, etc.). The second dataset contains objects extracted from the RouteYou dataset 2 , called the target T , also with a known partial order relation on the domain of the category attribute. Table II contains objects extracted from the target dataset, while a part of the order relation is presented in Figure 2.…”
Section: Problem Statement (mentioning)
confidence: 99%