The ethical concept of fairness has recently been applied in machine learning (ML) settings to describe a wide range of constraints and objectives. When considering the relevance of ethical concepts to subset selection problems, the concepts of diversity and inclusion are additionally applicable in order to create outputs that account for social power and access differentials. We introduce metrics based on these concepts, which can be applied together, separately, and in tandem with additional fairness constraints. Results from human subject experiments lend support to the proposed criteria. Social choice methods can additionally be leveraged to aggregate and choose preferable sets, and we detail how these may be applied.
CCS CONCEPTS: Information systems → Information retrieval diversity; Evaluation of retrieval results.
KEYWORDS: machine learning fairness, subset selection, diversity and inclusion
Online social media platforms increasingly rely on Natural Language Processing (NLP) techniques to detect abusive content at scale in order to mitigate the harms it causes to their users. However, these techniques suffer from various sampling and association biases present in training data, often resulting in sub-par performance on content relevant to marginalized groups and potentially furthering disproportionate harms towards them. Studies of such biases have so far focused on only a handful of axes of disparity and on subgroups for which annotations or lexicons are available. Consequently, biases concerning non-Western contexts are largely ignored in the literature. In this paper, we introduce a weakly supervised method to robustly detect lexical biases in broader geocultural contexts. Through a case study on a publicly available toxicity detection model, we demonstrate that our method identifies salient groups of cross-geographic errors and, in a follow-up study, that these groupings reflect human judgments of offensive and inoffensive language in those geographic contexts. We also analyze a model trained on a dataset with ground truth labels to better understand these biases, and present preliminary mitigation experiments.
The Hikurangi Margin east of New Zealand’s North Island hosts an
extensive gas hydrate province with numerous gas hydrate accumulations
related to the faulted structure of the accretionary wedge. One such
hydrate feature occurs in a small perched upper-slope basin known as
Urutī Basin. We investigate this hydrate accumulation by combining a
long-offset seismic line (10-km-long receiver array) with a grid of
high-resolution seismic lines acquired with a 600-m-long hydrophone
streamer. The long-offset data enable quantitative velocity analysis
while the high-resolution data constrain the three-dimensional geometry
of the hydrate accumulation. The sediments in Urutī Basin dip landward
due to ongoing deformation of the accretionary wedge. These strata are
clearly imaged in seismic data where they cross a distinct
bottom-simulating reflection (BSR) that dips, counterintuitively, in the
opposite direction to the regional dip of the seafloor. BSR-derived heat
flow estimates reveal a distinct heat flow anomaly that coincides
spatially with the upper extent of a landward-verging thrust fault. We
present a conceptual model of this gas hydrate system that highlights
the role of fault-controlled fluid flow at depth, which merges into
strata-controlled fluid flow entering the hydrate stability zone. The result
is a layer-constrained accumulation of concentrated gas hydrate in the
dipping strata. Our study provides new insight into the interplay
between deep faulting, fluid flow and the shallow processes involved in
gas hydrate formation.