Documentation to facilitate communication between dataset creators and consumers.
The machine learning community currently has no standardized process for documenting datasets. To address this gap, we propose datasheets for datasets. In the electronics industry, every component, no matter how simple or complex, is accompanied by a datasheet that describes its operating characteristics, test results, recommended uses, and other information. By analogy, we propose that every dataset be accompanied by a datasheet that documents its motivation, composition, collection process, recommended uses, and so on. Datasheets for datasets will facilitate better communication between dataset creators and dataset consumers, and encourage the machine learning community to prioritize transparency and accountability.
Figure 1: FAIRVIS integrates multiple coordinated views for discovering intersectional bias. Above, our user investigates the intersectional subgroups of sex and race. A. The Feature Distribution View allows users to visualize each feature's distribution and generate subgroups. B. The Subgroup Overview lets users select various fairness metrics to see the global average per metric and compare subgroups to one another, e.g., pinned Caucasian Males versus hovered African-American Males. The plots for Recall and False Positive Rate show that for African-American Males, the model has relatively high recall but also the highest false positive rate out of all subgroups of sex and race. C. The Detailed Comparison View lets users compare the details of two groups and investigate their class balances. Since the difference in False Positive Rates between Caucasian Males and African-American Males is far larger than their difference in base rates, a user suspects this part of the model merits further inquiry. D. The Suggested and Similar Subgroup View shows suggested subgroups ranked by the worst performance in a given metric.
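The per-subgroup comparison described in the caption can be sketched in a few lines. The snippet below is only an illustration, not FAIRVIS code: the function name `subgroup_metrics`, the record layout `(sex, race, y_true, y_pred)`, and the toy records with placeholder group names are all assumptions made for the example. It computes recall, false positive rate, and base rate for each intersectional subgroup.

```python
from collections import defaultdict


def subgroup_metrics(records):
    """Compute recall, false positive rate, and base rate per (sex, race) subgroup.

    `records` is an iterable of (sex, race, y_true, y_pred) tuples with binary
    labels. This layout is a hypothetical one chosen for the sketch.
    """
    counts = defaultdict(lambda: {"tp": 0, "fp": 0, "tn": 0, "fn": 0})
    for sex, race, y_true, y_pred in records:
        c = counts[(sex, race)]
        if y_true == 1:
            c["tp" if y_pred == 1 else "fn"] += 1
        else:
            c["fp" if y_pred == 1 else "tn"] += 1

    metrics = {}
    for group, c in counts.items():
        positives = c["tp"] + c["fn"]
        negatives = c["fp"] + c["tn"]
        metrics[group] = {
            # Recall: share of true positives among actual positives.
            "recall": c["tp"] / positives if positives else None,
            # False positive rate: share of actual negatives predicted positive.
            "fpr": c["fp"] / negatives if negatives else None,
            # Base rate: share of actual positives in the subgroup.
            "base_rate": positives / (positives + negatives),
        }
    return metrics


# Toy, made-up records with placeholder group names (not real data).
toy_records = [
    ("M", "g1", 1, 1), ("M", "g1", 1, 0), ("M", "g1", 0, 0), ("M", "g1", 0, 0),
    ("M", "g2", 1, 1), ("M", "g2", 1, 1), ("M", "g2", 0, 1), ("M", "g2", 0, 0),
]
toy_metrics = subgroup_metrics(toy_records)
for group, values in sorted(toy_metrics.items()):
    print(group, values)
```

In this toy data both subgroups have the same base rate but different false positive rates, which is the kind of gap the Detailed Comparison View is described as surfacing.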
Strategic network formation arises in settings where agents receive some benefit from their connectedness to other agents, but also incur costs for forming these links. We consider a new network formation game that incorporates an adversarial attack, as well as immunization or protection against the attack. An agent's network benefit is the expected size of her connected component post-attack, and agents may also choose to immunize themselves from attack at some additional cost. Our framework can be viewed as a stylized model of settings where reachability rather than centrality is the primary interest (as in many technological networks such as the Internet), and vertices may be vulnerable to attacks (such as viruses), but may also reduce risk via potentially costly measures (such as anti-virus software).
The reachability network benefit model has been studied in the setting without attack or immunization [4], where it is known that the set of equilibrium networks is the empty graph as well as any tree. We show that the introduction of attack and immunization changes the game in dramatic ways; in particular, many new equilibrium topologies emerge, some more sparse and some more dense than trees. Our interests include the characterization of equilibrium graphs, and the social welfare costs of attack and immunization.
Our main theoretical contributions include a strong bound on the edge density at equilibrium. In particular, we show that under a very mild assumption on the adversary's attack model, every equilibrium network contains at most 2n − 4 edges for n ≥ 4, where n denotes the number of agents, and this upper bound is tight. This demonstrates that despite permitting topologies denser than trees, the amount of "over-building" introduced by attack and immunization is sharply limited. We also show that social welfare does not significantly erode: every non-trivial equilibrium in our model with respect to several adversarial attack models asymptotically has social welfare at least as high as that of any equilibrium in the original attack-free model.
We complement our sharp theoretical results with simulations demonstrating fast convergence of a bounded rationality dynamic, swapstable best response, which generalizes linkstable best response but is considerably more powerful in our model. The simulations further elucidate the wide variety of asymmetric equilibria possible and demonstrate topological consequences of the dynamics, including heavy-tailed degree distributions arising from immunization. Finally, we report on a behavioral experiment on our game with over 100 participants, where despite the complexity of the game, the resulting network was surprisingly close to equilibrium.
* The short version of this paper [12] appears in the proceedings of WINE-16.
Definition 2. The random attack adversary attacks a vulnerable vertex uniformly at random.
So every vulnerable vertex is targeted with respect to the random attack adversary and the adversary induces a distribution over targeted regions such that the probability ...
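To make the benefit function concrete, here is a minimal sketch of computing each agent's expected post-attack component size under the random attack adversary. It rests on assumptions that the excerpt above does not fully state: a "targeted region" is taken to be a connected component of the subgraph induced by vulnerable (non-immunized) vertices, an attack on any vertex of a region is assumed to destroy the whole region, and a destroyed agent gets benefit 0. The names `components` and `expected_benefits` and the adjacency-dict representation are hypothetical choices for this example, and edge and immunization costs are ignored.

```python
from collections import deque


def components(vertices, adj):
    """Connected components of the subgraph of `adj` induced by `vertices`."""
    vertices = set(vertices)
    seen, comps = set(), []
    for start in vertices:
        if start in seen:
            continue
        comp, queue = {start}, deque([start])
        seen.add(start)
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v in vertices and v not in seen:
                    seen.add(v)
                    comp.add(v)
                    queue.append(v)
        comps.append(comp)
    return comps


def expected_benefits(adj, immunized):
    """Expected connected-component size per agent after a random attack.

    Assumption (labeled in the lead-in): attacking a vulnerable vertex removes
    its entire vulnerable region, and removed agents receive benefit 0.
    """
    nodes = set(adj)
    vulnerable = nodes - set(immunized)
    benefit = {v: 0.0 for v in nodes}
    if not vulnerable:
        # Nobody can be attacked: benefit is the plain component size.
        for comp in components(nodes, adj):
            for v in comp:
                benefit[v] = float(len(comp))
        return benefit
    for region in components(vulnerable, adj):
        # A uniformly random vulnerable vertex lands in this region with
        # probability proportional to the region's size.
        prob = len(region) / len(vulnerable)
        survivors = nodes - region
        for comp in components(survivors, adj):
            for v in comp:
                benefit[v] += prob * len(comp)
    return benefit


# Star with an immunized center (0) and three vulnerable leaves.
star = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}
star_benefits = expected_benefits(star, immunized={0})
print(star_benefits)
```

In the star example each leaf is its own vulnerable region, so the immunized center always ends up in a component of size 3, while each leaf is destroyed with probability 1/3 and otherwise sits in a component of size 3.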