“…Consequently, human-in-the-loop solutions [14,26,27,29,32,34,37,41,58], such as crowdsourcing, were developed to enable cheaper acquisition of labels from a large number of annotators in a short time. However, scaling annotation to many workers also introduces variation in the reliability and proficiency of these human annotators [3,12,39,44,46,56], which, in turn, impairs the quality of the generated labels. This variation may arise from differing levels of expertise [12,46,56], changes in mood [43], and various cognitive biases [3,44], among other factors.…”