2020
DOI: 10.1007/s41019-020-00146-w

Blocking Techniques for Entity Linkage: A Semantics-Based Approach

Abstract: Nowadays, data integration must often manage noisy data, including attribute values written in natural language such as product descriptions or book reviews. In the data integration process, Entity Linkage has the role of identifying records that contain information referring to the same object. Modern Entity Linkage methods, in order to reduce the dimension of the problem, partition the initial search space into “blocks” of records that can be considered similar according to some metrics, comparing then…
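To make the idea of blocking in the abstract concrete, here is a minimal sketch of generic key-based blocking. It is not the semantics-based method of the paper: the blocking key (the first three characters of a lowercased title), the record fields, and the function names `block_key`, `build_blocks`, and `candidate_pairs` are all illustrative assumptions.

```python
from collections import defaultdict
from itertools import combinations


def block_key(record):
    # Hypothetical blocking key: first three characters of the lowercased title.
    # A semantics-based approach would derive the key from meaning instead.
    return record["title"].strip().lower()[:3]


def build_blocks(records):
    # Group record ids by blocking key.
    blocks = defaultdict(list)
    for rec in records:
        blocks[block_key(rec)].append(rec["id"])
    return dict(blocks)


def candidate_pairs(blocks):
    # Only records that share a block are compared with each other.
    pairs = set()
    for ids in blocks.values():
        for a, b in combinations(sorted(ids), 2):
            pairs.add((a, b))
    return pairs


# Made-up example records, for illustration only.
records = [
    {"id": 1, "title": "Harry Potter and the Philosopher's Stone"},
    {"id": 2, "title": "harry potter & the philosophers stone"},
    {"id": 3, "title": "The Hobbit"},
    {"id": 4, "title": "The Hobbit, or There and Back Again"},
]

blocks = build_blocks(records)
pairs = candidate_pairs(blocks)
```

With four records, an exhaustive comparison would examine six pairs; under this toy key only the pairs inside each block remain as candidates for the more expensive matching step.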

Cited by 24 publications (3 citation statements)
References 22 publications

“…(4) Diverse data processing and storage methods: based on their own business needs, each software manufacturer has great differences in data model, presentation content, and storage format. It also results in the characteristics of multisource data [15].…”
Section: E Concept and Characteristics Of Data Integration (mentioning)
confidence: 99%
“…In the process of constructing knowledge map, entity link is also very necessary. Azzalini et al. [9] use deep learning to capture the semantic properties of data. Utilizing the subject-predicate-object triples to build knowledge attracts more attention.…”
Section: Introduction (mentioning)
confidence: 99%
“…As organizations ingest and process larger amounts of data, the time and effort it takes to prepare and integrate data into useful products are also increasing, and many researchers are working to alleviate this bottleneck using several different approaches [1], [2], [3]. The root cause of the time delay is human supervision of the curation steps including data quality analysis, data cleansing and standardization, entity resolution (ER), and data integration [4]. The goal of ER is to link two references if, and only if, the references are equivalent [5], [6].…”
Section: Introduction (mentioning)
confidence: 99%