2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
DOI: 10.1109/cvpr.2019.00980

REPAIR: Removing Representation Bias by Dataset Resampling

Abstract: Modern machine learning datasets can have biases for certain representations that are leveraged by algorithms to achieve high performance without learning to solve the underlying task. This problem is referred to as "representation bias". The question of how to reduce the representation biases of a dataset is investigated, and a new dataset REPresentAtion bIas Removal (REPAIR) procedure is proposed. REPAIR formulates bias minimization as an optimization problem, seeking a weight distribution that penalizes examples…
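The truncated abstract describes an alternating scheme: fit a classifier on the biased representation, then reweight the dataset so that examples this classifier finds easy are penalized. Below is a minimal sketch of that idea under stated assumptions (a fixed feature matrix `feats`, integer labels `0..K-1` with all classes present, and a logistic-regression bias classifier); it is an illustration, not the paper's exact objective, and the function name `repair_weights` is hypothetical.

```python
# Hypothetical minimal sketch of REPAIR-style reweighting (not the authors'
# exact optimization): alternately fit a linear "bias" classifier on a fixed
# biased representation, then shift weight away from examples it finds easy.
import numpy as np
from sklearn.linear_model import LogisticRegression

def repair_weights(feats, labels, n_rounds=10, lr=1.0):
    """feats: (n, d) biased representation; labels: (n,) class ids in 0..K-1.
    Returns per-example weights that penalize examples the bias classifier
    predicts easily."""
    n = len(labels)
    logits = np.zeros(n)  # weight parameters; w = sigmoid(logits)
    for _ in range(n_rounds):
        w = 1.0 / (1.0 + np.exp(-logits))
        clf = LogisticRegression(max_iter=1000)
        clf.fit(feats, labels, sample_weight=w)
        # per-example cross-entropy of the true class under the bias classifier
        p_true = clf.predict_proba(feats)[np.arange(n), labels]
        loss = -np.log(np.clip(p_true, 1e-12, None))
        # ascend on the weighted loss: easy examples (low loss) get pushed down
        logits += lr * (loss - loss.mean())
    return 1.0 / (1.0 + np.exp(-logits))
```

In REPAIR the resulting weight distribution drives a resampling of the dataset before the final model is trained; in this sketch the weights are simply returned.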

Cited by 210 publications (136 citation statements) · References 34 publications
“…the ability of simple linear classifiers to predict them correctly. While AFLite, among others (Li and Vasconcelos, 2019; Gururangan et al., 2018), advocate removing "easy" instances from the dataset, our work shows that easy-to-learn instances can be useful. Similar intuitions have guided other work such as curriculum learning (Bengio et al., 2009) and self-paced learning (Kumar et al., 2010; Lee and Grauman, 2011), where all examples are prioritized based on their "difficulty".…”
Section: Related Work
confidence: 85%
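The statement above gauges "easiness" by how well simple linear classifiers predict examples. A rough sketch of one way to score that predictability, in the spirit of AFLite-style filtering but not its exact procedure (`predictability_scores` is a hypothetical helper): repeatedly fit a linear model on random splits and count how often each held-out example is classified correctly.

```python
# Hypothetical sketch of linear-classifier "predictability" scoring:
# examples that held-out linear models almost always get right score ~1.0
# ("easy"); examples they rarely get right score ~0.0 ("hard").
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import ShuffleSplit

def predictability_scores(feats, labels, n_splits=20, train_frac=0.5, seed=0):
    n = len(labels)
    correct = np.zeros(n)
    counted = np.zeros(n)
    splitter = ShuffleSplit(n_splits=n_splits, train_size=train_frac,
                            random_state=seed)
    for train_idx, test_idx in splitter.split(feats):
        clf = LogisticRegression(max_iter=1000)
        clf.fit(feats[train_idx], labels[train_idx])
        preds = clf.predict(feats[test_idx])
        correct[test_idx] += (preds == labels[test_idx])
        counted[test_idx] += 1
    return correct / np.maximum(counted, 1)
```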
“…This renders the above methods of accounting for confounders unsuitable, as they either reduce the number of training samples (e.g., matching or stratification) or require deterministic features computed beforehand (e.g., standardization or regression). Possible alternatives could be unbiased [17][18][19][20][21] and invariant feature-learning approaches [22][23][24][25] that rely on end-to-end training to enforce invariance (independence) between the learned feature F and a bias factor (① in Fig. 2b).…”
confidence: 99%
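One common realization of the invariant feature-learning idea this quote references is adversarial training with a gradient-reversal layer, so that the encoder's features stay predictive of the task label while carrying no information about the bias factor. The toy PyTorch sketch below, with assumed dimensions and random data, is a generic illustration of that pattern, not the specific method of references [22]-[25].

```python
# Hypothetical sketch of invariant feature learning via gradient reversal:
# the bias head learns to predict the bias factor from features f, while the
# reversed gradient pushes the encoder to make f uninformative about it.
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)
    @staticmethod
    def backward(ctx, grad_out):
        return -ctx.lam * grad_out, None  # flip gradient into the encoder

encoder = nn.Sequential(nn.Linear(32, 64), nn.ReLU())
task_head = nn.Linear(64, 10)   # predicts the task label
bias_head = nn.Linear(64, 2)    # adversary predicting the bias factor
opt = torch.optim.Adam([*encoder.parameters(), *task_head.parameters(),
                        *bias_head.parameters()], lr=1e-3)
ce = nn.CrossEntropyLoss()

x = torch.randn(8, 32)           # toy inputs
y = torch.randint(0, 10, (8,))   # toy task labels
b = torch.randint(0, 2, (8,))    # toy bias-factor labels

f = encoder(x)
loss = ce(task_head(f), y) + ce(bias_head(GradReverse.apply(f, 1.0)), b)
opt.zero_grad(); loss.backward(); opt.step()
```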
“…There are also debiasing strategies that identify hard examples in the dataset and re-train the model to focus on those examples (Yaghoobzadeh et al., 2019; Li and Vasconcelos, 2019; Le Bras et al., 2020). This approach reflects a similar intuition that simplicity is connected to dataset bias, although our method is able to explicitly model the bias and does not assume that a pool of bias-free examples exists within the training data.…”
Section: Related Work
confidence: 99%
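A compact way to picture the "identify hard examples and re-train" strategy in this last quote is a two-stage scheme: train once, mark the examples the first model gets wrong, and re-train with those examples upweighted. The sketch below is a generic illustration of that pattern (the `upweight` factor and helper name are hypothetical), not the exact procedure of any of the cited papers.

```python
# Hypothetical two-stage hard-example re-training sketch: first-pass
# mistakes are treated as "hard" and upweighted in the second pass.
import numpy as np
from sklearn.linear_model import LogisticRegression

def retrain_on_hard(feats, labels, upweight=5.0):
    first = LogisticRegression(max_iter=1000).fit(feats, labels)
    hard = first.predict(feats) != labels        # first-pass mistakes
    w = np.where(hard, upweight, 1.0)            # emphasize hard examples
    return LogisticRegression(max_iter=1000).fit(feats, labels,
                                                 sample_weight=w)
```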