2020
DOI: 10.3233/sw-190356

Learning expressive linkage rules from sparse data

Abstract: A central problem in the context of the Web of Data, as well as in data integration in general, is to identify entities in different data sources that describe the same real-world object. There exists a large body of research on entity resolution. Interestingly, most of the existing research focuses on entity resolution on dense data, meaning data that does not contain too many missing values. This paper sets a different focus and explores learning expressive linkage rules from as well as applying these rules t…

Citations: cited by 7 publications (9 citation statements)
References: 46 publications
“…Similar to product classification, typically product linking will make use of product names (e.g., Kannan et al (2011); Gopalakrishnan et al (2012); Vandic et al (2012); van Bezu et al (2015); Shah et al (2018); Tracz et al (2020); Li et al (2020)) and descriptions (e.g., Petrovski et al (2014); Ristoski et al (2018); Li et al (2020)). The difference however, is that the task also makes use of a diverse range of structured product attributes (e.g., van Bezu et al (2015); Shah et al (2018); Petrovski and Bizer (2020); Li et al (2020)), often defined as 'key-value' pairs such as those that can be extracted from product specifications (e.g., product ID, model, brand, manufacturer). Intuitively, offers that have the similar sets of key-value pairs are more likely to match.…”
Section: Product Linking (mentioning)
confidence: 99%
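The intuition in the statement above, that offers sharing similar sets of key-value pairs are more likely to match, can be illustrated with a small sketch. The function name `key_value_overlap`, the normalization, and the example offers are assumptions made for illustration; they are not taken from the cited papers.

```python
def key_value_overlap(offer_a: dict, offer_b: dict) -> float:
    """Jaccard overlap of the normalized (key, value) pairs of two offers.

    Pairs whose value is None are treated as missing and ignored
    (an assumption of this sketch).
    """
    def pairs(offer: dict) -> set:
        return {
            (key.strip().lower(), str(value).strip().lower())
            for key, value in offer.items()
            if value is not None
        }

    a, b = pairs(offer_a), pairs(offer_b)
    if not a and not b:
        return 0.0  # nothing to compare
    return len(a & b) / len(a | b)


# Hypothetical product offers extracted from specification tables.
offer_1 = {"brand": "Acme", "model": "X100", "color": "black"}
offer_2 = {"Brand": "acme", "model": "X100", "color": None}

score = key_value_overlap(offer_1, offer_2)
print(round(score, 3))
```

Here two of the three distinct pairs are shared, so the overlap is 2/3. Note that a plain Jaccard score penalizes the offer with the missing color value, which is one reason sparse data needs more careful handling.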
“…Algorithms. Since the prediction of linking/matching of product offers depends on a notion of 'similarity', some methods will have an 'intermediary' step that converts product metadata features to similarity features (Vandic et al (2012); Li et al (2020); Petrovski and Bizer (2020)). This is typically done by applying similarity metrics - usually based on string form, or word/character distribution - to the textual feature representations of two offers.…”
Section: Product Linking (mentioning)
confidence: 99%
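The intermediary step described above, turning the textual attributes of two offers into similarity features, might look roughly like the following. This is a minimal sketch: the feature names, the choice of `difflib.SequenceMatcher` for string form and token-set Jaccard for word distribution, and the use of `None` for missing values are all assumptions, not the method of any cited paper.

```python
from difflib import SequenceMatcher


def char_similarity(a: str, b: str) -> float:
    """String-form similarity based on matching character blocks."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def token_jaccard(a: str, b: str) -> float:
    """Word-distribution similarity: Jaccard overlap of token sets."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    union = tokens_a | tokens_b
    return len(tokens_a & tokens_b) / len(union) if union else 0.0


def similarity_features(offer_a: dict, offer_b: dict, attributes: list) -> dict:
    """Convert two offers into one similarity feature vector.

    A feature is None when either offer lacks the attribute, so a
    downstream model can tell "missing" apart from "dissimilar".
    """
    features = {}
    for attr in attributes:
        value_a, value_b = offer_a.get(attr), offer_b.get(attr)
        if value_a is None or value_b is None:
            features[f"{attr}_char"] = None
            features[f"{attr}_token"] = None
        else:
            features[f"{attr}_char"] = char_similarity(value_a, value_b)
            features[f"{attr}_token"] = token_jaccard(value_a, value_b)
    return features


# Hypothetical offers; the second one has no brand value.
offer_a = {"title": "acme x100 phone", "brand": "Acme"}
offer_b = {"title": "acme x100 smartphone"}

features = similarity_features(offer_a, offer_b, ["title", "brand"])
print(features)
```

The resulting dictionary can be fed to a classifier or a rule; how missing features are imputed or skipped is a separate design decision.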
“…The coverage of models learned with different combinations of attributes can significantly vary as not all attributes contribute equally to the solution of the matching task [21]. Discovering the set of attributes that encode the most-identifying information, is crucial for the extraction of more focused profiling meta-information.…”
Section: Relevant Attributes (mentioning)
confidence: 99%
“…Under this group fall the following tasks: phones, headphones, and tvs. The matching methods used for evaluating these tasks need to especially address the challenge of low data density [21]. Group 3: Small and Difficult.…”
Section: Profiling and Grouping The Matching Tasks (mentioning)
confidence: 99%
“…The data linking problem in data graphs has been the main focus of numerous studies (see [9,14] for survey), and applied in different research fields such as knowledge extraction [23,24], geospatial analysis [27], sentiment analysis [19,10], etc. Some of the existing approaches are based on expressive linking rules that can be learned from a set of existing reference links [18,16]. These rules consist of attribute-specific comparisons, aggregation functions along with different weights and thresholds.…”
Section: Related Work (mentioning)
confidence: 99%
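A linkage rule of the kind described above, with attribute-specific comparisons, an aggregation function, weights, and thresholds, can be sketched as follows. The class and function names are hypothetical, and skipping comparisons on missing values is one possible strategy assumed here for illustration; it should not be read as the exact approach of the cited work.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


def exact(a: str, b: str) -> float:
    """1.0 if the normalized strings are equal, else 0.0."""
    return 1.0 if a.strip().lower() == b.strip().lower() else 0.0


def token_jaccard(a: str, b: str) -> float:
    """Jaccard overlap of the token sets of two strings."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    union = tokens_a | tokens_b
    return len(tokens_a & tokens_b) / len(union) if union else 0.0


@dataclass
class Comparison:
    """One attribute-specific comparison with its own threshold and weight."""
    attribute: str
    measure: Callable[[str, str], float]
    threshold: float
    weight: float = 1.0

    def score(self, a: dict, b: dict) -> Optional[float]:
        value_a, value_b = a.get(self.attribute), b.get(self.attribute)
        if value_a is None or value_b is None:
            return None  # missing value: no evidence either way
        similarity = self.measure(value_a, value_b)
        return similarity if similarity >= self.threshold else 0.0


def weighted_average(rule: List[Comparison], a: dict, b: dict) -> Optional[float]:
    """Aggregate comparison scores, ignoring comparisons on missing values."""
    total, weight_sum = 0.0, 0.0
    for comparison in rule:
        score = comparison.score(a, b)
        if score is not None:
            total += comparison.weight * score
            weight_sum += comparison.weight
    return total / weight_sum if weight_sum else None


def is_match(rule: List[Comparison], a: dict, b: dict, link_threshold: float) -> bool:
    """Link the two entities if the aggregated score reaches the threshold."""
    aggregated = weighted_average(rule, a, b)
    return aggregated is not None and aggregated >= link_threshold


rule = [
    Comparison("brand", exact, threshold=1.0, weight=1.0),
    Comparison("title", token_jaccard, threshold=0.4, weight=2.0),
]
entity_a = {"brand": "Acme", "title": "acme x100 phone"}
entity_b = {"brand": None, "title": "acme x100 smartphone"}

aggregated = weighted_average(rule, entity_a, entity_b)
linked = is_match(rule, entity_a, entity_b, link_threshold=0.5)
print(aggregated, linked)
```

Because the brand is missing on one side, only the title comparison contributes to the aggregate. In a learned rule, the measures, weights, and thresholds would be fitted from the reference links rather than set by hand.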