Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2021
DOI: 10.18653/v1/2021.naacl-main.223

Game-theoretic Vocabulary Selection via the Shapley Value and Banzhaf Index

Abstract: The input vocabulary and its learned representations are crucial to the performance of neural NLP models. Using the full vocabulary results in less explainable and more memory-intensive models, with the embedding layer often constituting the majority of model parameters. It is thus common to use a smaller vocabulary to lower memory requirements and construct more interpretable models. We propose a vocabulary selection method that views words as members of a team trying to maximize the model's performance. We …
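As a rough illustration of this framing (a hypothetical sketch, not the authors' released implementation), the snippet below treats a candidate vocabulary as a coalition of words whose payoff is the model's development-set score, and then keeps the words with the highest estimated game-theoretic value. `word_scores` and the budget are assumed placeholders.

```python
# Hypothetical sketch of the paper's framing (names are placeholders, not the
# authors' code): each word is a "player", a candidate vocabulary is a
# coalition, and the coalition's payoff is the model's dev-set score when the
# model is restricted to that vocabulary. Given per-word value scores from any
# cooperative-game estimator, vocabulary selection is a top-k ranking.
from typing import Dict, List


def select_vocabulary(word_scores: Dict[str, float], budget: int) -> List[str]:
    """Keep the `budget` words with the highest estimated contribution."""
    return sorted(word_scores, key=word_scores.get, reverse=True)[:budget]
```

A 10k-word vocabulary would then be `select_vocabulary(scores, 10_000)`, where `scores` comes from a Shapley or Banzhaf estimator such as the ones sketched after the citation statements below.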

Cited by 8 publications (5 citation statements)
References 45 publications

“…Moreover, SHAP has a fast implementation for tree-based models. Although Shapley value computation requires exponential time complexity, machine learning applications employ Shapley value approximation methods, such as Monte Carlo permutation sampling, which approximates Shapley value in linear time [41][42][43].…”
Section: Shapley Values (mentioning)
confidence: 99%
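The statement above refers to Monte Carlo permutation sampling; a minimal sketch of that approximation is given below, under the assumption that `payoff` is a black-box characteristic function mapping a coalition to a score. Each sampled permutation requires one payoff evaluation per player, so the cost per permutation grows linearly with the number of players rather than exponentially in it.

```python
# A minimal sketch of Monte Carlo permutation sampling for Shapley values, as
# described in the quoted statement; `payoff` is an assumed black-box
# characteristic function from a coalition (set of players) to a score.
import random
from typing import Callable, Dict, Sequence, Set


def monte_carlo_shapley(players: Sequence[str],
                        payoff: Callable[[Set[str]], float],
                        num_permutations: int = 200) -> Dict[str, float]:
    """Approximate each player's Shapley value by averaging its marginal
    contribution over randomly sampled orderings of the players."""
    estimates = {p: 0.0 for p in players}
    for _ in range(num_permutations):
        order = list(players)
        random.shuffle(order)
        coalition: Set[str] = set()
        prev_value = payoff(coalition)
        for p in order:
            coalition.add(p)
            value = payoff(coalition)
            estimates[p] += value - prev_value  # marginal contribution of p
            prev_value = value
    return {p: total / num_permutations for p, total in estimates.items()}
```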
“…The wide class of semivalues applies to many different real-world applications, such as voting [28], interpretable machine learning [26,27], reinforcement learning [24,14], text summarisation [32], etc. The equality of weights over coalition sizes is not a constraint for the application, but a choice of design that allows the semivalues to take into account the influence of a player on groups of different sizes while preserving anonymity and symmetry.…”
Section: A Semivalues As Weighted Average Marginal Contributions (mentioning)
confidence: 99%
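To make the "weighted average marginal contributions" view concrete, here is a toy sketch of one semivalue, the Banzhaf index (which also appears in this paper's title): it averages a player's marginal contribution over all coalitions of the other players with equal weight. The exact enumeration below is exponential and only meant for small illustrative games; `payoff` is again an assumed characteristic function, not part of the cited work.

```python
# Toy sketch of a semivalue: the Banzhaf index weights every coalition of the
# other players equally, so a player's value is the plain average of its
# marginal contributions over all 2^(n-1) such coalitions. Exponential cost;
# only suitable for small illustrative games.
from itertools import combinations
from typing import Callable, Dict, FrozenSet, Sequence


def banzhaf_index(players: Sequence[str],
                  payoff: Callable[[FrozenSet[str]], float]) -> Dict[str, float]:
    """Exact Banzhaf value: mean marginal contribution of each player over
    all coalitions of the remaining players."""
    values = {}
    for p in players:
        others = [q for q in players if q != p]
        total, count = 0.0, 0
        for size in range(len(others) + 1):
            for coalition in combinations(others, size):
                without_p = frozenset(coalition)
                total += payoff(without_p | {p}) - payoff(without_p)
                count += 1
        values[p] = total / count
    return values
```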
“…Lastly, in §3.6 we mentioned how we used Shapley values from cooperative game theory to optimally select features, in this case vocabularies, in language tasks [84].…”
Section: Gamification (mentioning)
confidence: 99%