2024
DOI: 10.1162/coli_a_00516

Analyzing Dataset Annotation Quality Management in the Wild

Jan-Christoph Klie,
Richard Eckart de Castilho,
Iryna Gurevych

Abstract: Data quality is crucial for training accurate, unbiased, and trustworthy machine learning models as well as for their correct evaluation. Recent works, however, have shown that even popular datasets used to train and evaluate state-of-the-art models contain a non-negligible amount of erroneous annotations, biases, or artifacts. While practices and guidelines regarding dataset creation projects exist, to our knowledge, large-scale analysis has yet to be performed on how quality management is conducted when crea…

Cited by 3 publications
References 113 publications