Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2015
DOI: 10.1145/2783258.2783353

Entity Matching across Heterogeneous Sources

Abstract: Given an entity in a source domain, finding its matched entities in another (target) domain is an important task in many applications. Traditionally, the problem has usually been addressed by first extracting the major keywords corresponding to the source entity and then querying relevant entities from the target domain using those keywords. However, this method inevitably fails if the two domains have little or no overlap in content. An extreme case is that the source domain is in English and the target doma…
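The traditional pipeline the abstract refers to (extract the source entity's major keywords, then query the target domain with them) can be summarized in a minimal sketch. This is the baseline the paper argues against, not its proposed model; it assumes scikit-learn's TfidfVectorizer for keyword extraction, and `search_target` is a hypothetical stand-in for whatever search interface the target domain exposes.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

def top_keywords(source_text, corpus, k=5):
    """Return the k highest-TF-IDF terms of the source entity's text."""
    vectorizer = TfidfVectorizer(stop_words="english")
    tfidf = vectorizer.fit_transform(corpus + [source_text])
    terms = vectorizer.get_feature_names_out()
    scores = tfidf[len(corpus)].toarray().ravel()   # row for the source entity
    top = scores.argsort()[::-1][:k]
    return [terms[i] for i in top if scores[i] > 0]

def keyword_match(source_text, corpus, search_target, k=5):
    """Baseline: extract keywords, then query the target domain with them."""
    keywords = top_keywords(source_text, corpus, k)
    return search_target(" ".join(keywords))
```

The sketch also makes the abstract's failure mode concrete: if the target domain shares little or no vocabulary with the source (for example, a different language), the extracted keywords return nothing useful.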

Cited by 30 publications (13 citation statements) · References 24 publications

Citation statements:
“…The challenge is that different sources may use different languages or terminologies to describe the same topic. A probabilistic model was proposed to integrate topic extraction and matching into a unified model (Yang et al. 2015). As we do not have access to the full text of the reference datasets, this method is not applicable to our problem.…”
Section: Related Work
confidence: 99%
“…However, these approaches require heavy human effort and are thus costly and labor-intensive. Many works leverage extra resources, such as OWL properties [9], entity descriptions [37], or information about entities and relations [16,22,27]. Such methods are complex and usually limited by the availability of this extra information about a knowledge graph.…”
Section: Entity Alignment
confidence: 99%
“…This process is called "Duplicate Records Detection" or "Entity Matching"; because the duplicates introduce inconsistencies, it is also called "Inconsistency Detection". In this process, duplicates are detected [5,6,7,8,9,10] in preparation for removing the ambiguities in the generated answers and fusing the inconsistencies before passing the answers to the user. Detected duplicates are marked in the answer set and passed as input to the subsequent process, which resolves the inconsistencies.…”
Section: Inconsistency Detection Process
confidence: 99%
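As a concrete illustration of the marking step described in that last passage, the sketch below is not the cited systems' implementation; it simply groups answer records whose normalized string similarity exceeds a threshold, using Python's standard difflib as a stand-in for a real matcher, and tags each record with a duplicate-group id for the downstream inconsistency-resolution step.

```python
from difflib import SequenceMatcher

def is_duplicate(a, b, threshold=0.85):
    """Crude duplicate test: normalized string similarity above a threshold."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio() >= threshold

def mark_duplicates(answers, threshold=0.85):
    """Assign a duplicate-group id to every answer record.

    Records sharing a group id are the detected duplicates that the
    subsequent process fuses / resolves."""
    group_reps = []   # one representative string per group
    marked = []
    for text in answers:
        group_id = next(
            (i for i, rep in enumerate(group_reps) if is_duplicate(text, rep, threshold)),
            None,
        )
        if group_id is None:
            group_id = len(group_reps)
            group_reps.append(text)
        marked.append({"answer": text, "dup_group": group_id})
    return marked

# The two spellings of the first contact fall into the same group (0);
# the third record gets its own group (1).
print(mark_duplicates(["Jane Doe, +1 555-0100",
                       "jane doe, +1 5550100",
                       "John Roe, +1 555-0199"]))
```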