2019 IEEE International Conference on Big Data (Big Data)
DOI: 10.1109/bigdata47090.2019.9006348

BIGMAT: A Distributed Affinity-Preserving Random Walk Strategy for Instance Matching on Knowledge Graphs

Cited by 7 publications (12 citation statements)
References 20 publications
“…BIGMat is an IM approach implemented in Spark and based on an affinity‐preserving random walk technique. It represents the IM problem as a graph‐based node ranking and selection problem in a constructed candidates association graph.…”
Section: Discussion (mentioning)
confidence: 99%
“…For this purpose, a bipartite graph could first be built from the inferred identity links. Then, a post‐processing, including stable marriage problem (SMP), Hungarian algorithm, and symmetric best match strategy (SBM), could be adopted to obtain the final set of co‐referent pairs from this bipartite graph as in BIGMat …”
Section: Main Lessons and Open Challenges (mentioning)
confidence: 99%
“…MSBlockSlicer [58] pays attention to the problem of load imbalance and adopts a block slice strategy to balance the load of each worker in the distributed cluster. The BIGMAT framework [22] applies the affinity-preserving random walk algorithm to express IM as a graph-based node ranking and selection problem in the constructed candidate association graph and selects matching results through a distributed architecture. Our framework leverages the proposed blocking algorithm to divide the matching task into multiple logistic regression tasks that can be executed distributionally.…”
Section: Related Work (mentioning)
confidence: 99%
“…The distributed computing model, such as MapReduce [21], allows the instance matching process to be divided into multiple matching tasks that can be executed by multiple workers. Frameworks that adopt this include LINDA [14], BIGMAT [22], etc.…”
Section: Introduction (mentioning)
confidence: 99%