2017 13th International Computer Engineering Conference (ICENCO) 2017
DOI: 10.1109/icenco.2017.8289792

Record linkage approaches in big data: A state of art study

citations
Cited by 10 publications
(5 citation statements)
references
References 15 publications
“…Using the CA in scenarios (2, 3, 4, 5, 7) achieved better results than a single WA (scenario 1) in terms of efficiency and effectiveness. As fig. 4 illustrates, the usage of CA (scenarios 2, 3, 4, 5, 7) in the different scenarios achieved better performance time compared to the WA (first scenario).…”
mentioning
confidence: 95%
“…Blocking techniques depend on distributing the big dataset into several small blocks such that elements that reside in the same block are more likely to be matched. Many blocking techniques have been used in literature as clarified in [4].…”
Section: Introduction
mentioning
confidence: 99%
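
The blocking idea described in the statement above can be illustrated with a minimal sketch. It is not taken from the cited paper: the blocking key (first letter of the surname plus birth year), the field names, and the sample records are all assumptions chosen for illustration. Records are grouped by key, and only records that share a block are paired for comparison.

```python
from collections import defaultdict
from itertools import combinations


def blocking_key(record):
    # Hypothetical key: first letter of the surname plus the birth year.
    return (record["surname"][:1].upper(), record["birth_year"])


def build_blocks(records):
    # Distribute the records into blocks keyed by blocking_key.
    blocks = defaultdict(list)
    for record in records:
        blocks[blocking_key(record)].append(record)
    return blocks


def candidate_pairs(blocks):
    # Only records that reside in the same block are compared.
    for members in blocks.values():
        yield from combinations(members, 2)


# Made-up sample records, for illustration only.
records = [
    {"id": 1, "surname": "Smith", "birth_year": 1980},
    {"id": 2, "surname": "Smyth", "birth_year": 1980},
    {"id": 3, "surname": "Jones", "birth_year": 1975},
    {"id": 4, "surname": "Johnson", "birth_year": 1975},
    {"id": 5, "surname": "Brown", "birth_year": 1990},
]

blocks = build_blocks(records)
pairs = [(a["id"], b["id"]) for a, b in candidate_pairs(blocks)]
all_pairs = list(combinations(records, 2))

print(f"{len(pairs)} candidate pairs after blocking, {len(all_pairs)} without")
```

With this toy key, five records produce two candidate pairs instead of the ten that an exhaustive comparison would need. A coarse key like this one can also separate true matches (for example, a mistyped first letter), which is the trade-off any blocking scheme has to manage.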
“…Some research on algorithms that address the computational burden of the comparison and classification tasks in record linkage has been undertaken. Most work on distributed and parallel algorithms for record linkage is specific to the MapReduce paradigm [15], a programming model for processing large data sets in parallel on a cluster. Few sources detail the comparison and classification tasks themselves, with the focus on load balancing algorithms to address issues associated with data skew.…”
Section: Introduction
mentioning
confidence: 99%
“…The blocking techniques used in these studies are based on the same techniques used for traditional probabilistic and deterministic linkages [15]. There are many blocking techniques used in these conventional approaches to record linkages that reduce the comparison space significantly, even when running a linkage on a single machine [26].…”
Section: Introduction
mentioning
confidence: 99%