2020
DOI: 10.1111/cgf.13973
Classifier‐Guided Visual Correction of Noisy Labels for Image Classification Tasks

Abstract: Training data plays an essential role in modern applications of machine learning. However, gathering labeled training data is time‐consuming. Therefore, labeling is often outsourced to less experienced users, or completely automated. This can introduce errors, which compromise valuable training data, and lead to suboptimal training results. We thus propose a novel approach that uses the power of pretrained classifiers to visually guide users to noisy labels, and let them interactively check error candidates, t…
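The core idea of classifier-guided label correction can be illustrated with a minimal sketch: use a pretrained classifier's predictions to surface samples whose assigned label it confidently contradicts, and present those as error candidates for interactive review. The function name, threshold, and scoring here are illustrative assumptions, not the paper's actual error measures (which are more elaborate).

```python
import numpy as np

def flag_label_error_candidates(probs, labels, threshold=0.9):
    """Return indices of samples that are candidates for label errors:
    the classifier's top prediction disagrees with the assigned label
    and the classifier is confident about its own prediction."""
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels)
    preds = probs.argmax(axis=1)        # classifier's predicted class
    confidence = probs.max(axis=1)      # confidence in that prediction
    return np.where((preds != labels) & (confidence >= threshold))[0]

# Toy example: three samples, three classes.
probs = [[0.95, 0.03, 0.02],   # confidently class 0, but labeled 1 -> flagged
         [0.10, 0.85, 0.05],   # prediction agrees with the label
         [0.40, 0.35, 0.25]]   # disagrees, but classifier is unsure
labels = [1, 1, 2]
print(flag_label_error_candidates(probs, labels))  # -> [0]
```

In an interactive tool, the flagged indices would feed a visualization (e.g. a matrix or scatter plot) where a human confirms or rejects each candidate before the label is corrected.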

Cited by 25 publications (15 citation statements) · References 32 publications
“…To handle label noise in such datasets, Xiang et al [17] tightly integrated a scalable trusted-item-based correction algorithm with an incremental t-SNE algorithm to support an iterative refinement procedure. Bäuerle et al [18] proposed three error detection measures, class interpretation error score, instance interpretation error score, and similarity error score, and leveraged them to correct label errors.…”
Section: Visualization for Annotation Quality Improvement
confidence: 99%
“…While there have been many visualization approaches targeted towards large datasets, such as active learning [28,30,35], interactive labeling [6,7], labeling process improvements [15,40], and data cleaning [11,59], they are all used during the labeling process. Fairness analysis, on the contrary, happens either after dataset collection has been completed (data fairness), or even after model training has been finished (model fairness).…”
Section: Related Work
confidence: 99%
“…Here users should be able to confirm or reject potentially problematic bias, thus cleaning up the data for actionable interventions. Drawing inspiration from other correction approaches in the machine learning domain [11,59], we propose to enable marking of problematic labels, and removal of unproblematic ones from the visualizations. Using these filtering approaches, which are applied from coarse (sensitive attribute) to fine (correlation values), and providing useful overview visualizations throughout, we follow Shneiderman's Mantra [51] of providing an overview first, then being able to zoom and filter.…”
Section: Design Guidelines
confidence: 99%
“…After correction, the classifier is refined using the corrected labels, and a new round of correction starts. Bäuerle et al [16] developed three classifier-guided measures to detect data errors. Data errors are then presented in a matrix and a scatter plot, allowing experts to reason about and resolve errors.…”
Section: Label-level Improvement
confidence: 99%