2022 ACM Conference on Fairness, Accountability, and Transparency
DOI: 10.1145/3531146.3533203

Adaptive Sampling Strategies to Construct Equitable Training Datasets

Cited by 14 publications (7 citation statements)
References 34 publications
“…This framework overall aims to capture the major considerations in operationalizing fairness that are quantifiable and enable benchmarking to some extent, as we believe that this helps practitioners decide how to make trade-offs between the pillars. We remark that while these are the classical categorizations in ML pipelines, there are still applications that use group data in ways that fall outside of these categories (for instance at the data collection step [10,39], we also propose one such method in the next section at the feature selection step). These methods should be considered, but our work focuses on bringing more structure and order to the majority of fairness intervention work in the highlighted categories [28].…”
Section: Model Performance (mentioning)
confidence: 99%
“…The lessons are complex, since the effects of including additional samples from a particular group on the model’s performance in that group depend on a large number of factors. Promising approaches for adaptively deciding which groups to sample from have been proposed [48, 49], attempting to automatically detect harder groups during dataset construction and then sampling preferentially from those. Such approaches will prove challenging to implement in medical practice, however.…”
Section: The Path Forward: Leveling Up (mentioning)
confidence: 99%
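
The idea in the statement above (detect harder groups during dataset construction, then sample preferentially from them) can be illustrated with a minimal sketch. This is not the cited paper's algorithm: the function name `allocate_batch`, the proportional-to-error rule, and the group labels are all assumptions chosen for illustration.

```python
def allocate_batch(group_errors, batch_size):
    """Split a sampling budget across groups in proportion to current error.

    group_errors: dict mapping a (hypothetical) group label to its latest
    validation error. Groups with higher error receive more new samples.
    """
    groups = list(group_errors)
    total = sum(group_errors.values())
    if total == 0:
        # No group is measurably harder: fall back to an even split.
        shares = {g: 1 / len(groups) for g in groups}
    else:
        shares = {g: group_errors[g] / total for g in groups}

    # Largest-remainder rounding so the counts sum exactly to batch_size.
    raw = {g: shares[g] * batch_size for g in groups}
    counts = {g: int(raw[g]) for g in groups}
    leftover = batch_size - sum(counts.values())
    by_remainder = sorted(groups, key=lambda g: raw[g] - counts[g], reverse=True)
    for g in by_remainder[:leftover]:
        counts[g] += 1
    return counts


# Hypothetical per-group validation errors after one training round.
errors = {"group_a": 0.10, "group_b": 0.30, "group_c": 0.10}
next_batch = allocate_batch(errors, batch_size=10)
```

In an adaptive loop, this allocation would be recomputed after each retraining round, so the budget shifts as the per-group errors change.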
“…[13] and Cai et al. [48] suggest analyzing the trajectory of performance improvements in different groups as more samples are added, to identify groups that benefit the most from additional samples. Similarly, if some groups benefit from group balancing, this may indicate the presence of estimator bias due to insufficient model expressivity.…”
Section: The Path Forward: Leveling Up (mentioning)
confidence: 99%
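
One simple way to read the "trajectory of performance improvements" described above is to compare the most recent error reduction per added sample across groups. The sketch below assumes each group has a recorded learning curve of `(sample_count, error)` pairs. The function names and the last-two-points slope are illustrative assumptions, not the method of the cited works.

```python
def marginal_gains(curves):
    """Error reduction per added sample over the last step of each curve.

    curves: dict mapping a (hypothetical) group label to a list of
    (sample_count, error) pairs, ordered by increasing sample_count.
    """
    gains = {}
    for group, points in curves.items():
        (n_prev, err_prev), (n_last, err_last) = points[-2], points[-1]
        gains[group] = (err_prev - err_last) / (n_last - n_prev)
    return gains


def rank_groups_by_benefit(curves):
    """Order groups from largest to smallest recent marginal gain."""
    gains = marginal_gains(curves)
    return sorted(gains, key=gains.get, reverse=True)


# Hypothetical learning curves: error measured at 100 and 200 samples.
curves = {
    "group_a": [(100, 0.30), (200, 0.29)],
    "group_b": [(100, 0.40), (200, 0.30)],
}
ranking = rank_groups_by_benefit(curves)
```

A group near the top of the ranking is still improving quickly with more data. A flat curve suggests additional samples from that group would help little under the current model.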
“…Sampling is widely considered for dealing with the concerns of class imbalance and scalable analysis in machine learning [15]. Sampling strategies may have significant impacts on the performance given the fact that not all samples are equally important [23, 24]. Previous studies in [15, 25, 26, 27, 28] considered utilizing sampling strategies (including stratified sampling) for mitigating the impact of the imbalance between malicious traffic (minority) vs. normal traffic (majority) in network intrusion/anomaly detection.…”
Section: Related Work (mentioning)
confidence: 99%