2019
DOI: 10.1007/978-3-030-28730-6_5

Heterogeneous Committee-Based Active Learning for Entity Resolution (HeALER)

Cited by 8 publications (9 citation statements) | References 19 publications
“…Meduri et al [32] provide an in-depth comparison of various matchers and example selectors but neither consider TPLMs nor address how to learn a blocker. While HEALER [10] attempts to improve upon Mozafari et al [33]'s committee-based approach to ER by including different kinds of matchers, it does not consider neural networks or TPLMs in its heterogeneous committee. Alongside AL for ER, another line of work attempts to solve ER by crowd-sourcing [53].…”
Section: Active Learning for Entity Resolution
confidence: 99%
“…Popular choices for matchers include support vector machines [45], random forests [32], and neural networks [23]. Popular choices for selectors include query-by-committee [16,49], which has seen wide usage in ER [10,33], and uncertainty sampling [23].…”
Section: Introduction
confidence: 99%
“…In each active learning iteration, one or more informative record pairs are selected and given to a human annotator for labeling. One way to measure informativeness is to compute the disagreement among the predictions of a classifier ensemble, an approach known as the Query-by-Committee strategy [2,19,21]. The most informative record pairs are those that cause the highest disagreement among the members of the committee.…”
Section: Introduction
confidence: 99%
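The Query-by-Committee selection described in the excerpt above can be sketched minimally with a vote-entropy disagreement score. This is an illustrative sketch, not HeALER's implementation: the function name, the example record-pair identifiers, and the four-member committee are all assumptions for demonstration.

```python
import math

def vote_entropy(votes):
    """Disagreement of a committee's label votes for one record pair.

    votes: list of 0/1 labels (non-match/match), one per committee member.
    Returns 0.0 for a unanimous committee; maximal when votes split evenly.
    """
    n = len(votes)
    entropy = 0.0
    for label in set(votes):
        p = votes.count(label) / n
        entropy -= p * math.log2(p)
    return entropy

# Rank candidate pairs by disagreement; the top pair goes to the annotator.
# Pair identifiers and votes below are made up for illustration.
committee_votes = {
    ("rec_a1", "rec_b7"): [1, 1, 1, 1],  # unanimous -> uninformative
    ("rec_a2", "rec_b3"): [1, 0, 1, 0],  # even split -> most informative
    ("rec_a5", "rec_b9"): [1, 1, 0, 1],
}
most_informative = max(committee_votes,
                       key=lambda p: vote_entropy(committee_votes[p]))
```

With binary labels the score ranges from 0 (all members agree) to 1 bit (an even split), so maximizing it selects exactly the pairs the excerpt calls "most informative".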
“…To circumvent the cold-start problem, existing active learning methods for entity resolution require the human annotator to label a small set of record pairs before active learning starts. This set is selected either randomly [9,18] or based on the distribution of pre-computed similarity scores [2,19]. However, solving the cold-start problem by manually annotating a subset of the data contradicts the main principle of active learning: minimizing the human labeling effort.…”
Section: Introduction
confidence: 99%
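The similarity-distribution-based seed selection mentioned in the excerpt above can be sketched as stratified sampling over pre-computed similarity scores. This is a hedged illustration of the general heuristic, not the procedure of any cited paper; the function name, bin count, and per-bin sample size are assumptions.

```python
import random

def seed_pairs_by_similarity(pairs_with_scores, per_bin=2, bins=5, seed=0):
    """Pick an initial labeling set by stratifying candidate record pairs
    over their pre-computed similarity scores, so both likely matches and
    likely non-matches appear in the seed set (parameters are illustrative).
    """
    rng = random.Random(seed)
    buckets = [[] for _ in range(bins)]
    for pair, score in pairs_with_scores:
        idx = min(int(score * bins), bins - 1)  # score assumed in [0, 1]
        buckets[idx].append(pair)
    selected = []
    for bucket in buckets:
        rng.shuffle(bucket)          # random choice within each bin
        selected.extend(bucket[:per_bin])
    return selected
```

Sampling across the full score range gives the first committee both positive-leaning and negative-leaning examples, which is what distinguishes this heuristic from purely random seed selection.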