Multi-armed bandit algorithms like Thompson Sampling can be used to conduct adaptive experiments, in which maximizing reward means that data are used to progressively assign more participants to more effective arms. Such assignment strategies increase the risk that statistical hypothesis tests will identify a difference between arms when there is not one, or fail to conclude there is a difference when there truly is one (Rafferty et al., 2019). We present simulations for 2-arm experiments that explore two algorithms combining the benefits of uniform randomization (UR) for statistical analysis with the benefits of reward maximization achieved by Thompson Sampling (TS). First, Top-Two Thompson Sampling (Russo, 2016) adds a fixed amount of UR allocation spread evenly over time. Second, a novel heuristic algorithm, called TS PostDiff (Posterior Probability of Difference), takes a Bayesian approach to mixing TS and UR: the probability that a participant is assigned using UR allocation is the posterior probability that the difference between the two arms is 'small' (below a certain threshold), allowing for more UR exploration when there is little or no reward to be gained. We find that the TS PostDiff method performs well across multiple effect sizes, and thus does not require tuning based on a guess for the true effect size.
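The mixing rule described above can be sketched in a few lines. This is a minimal illustration for two Bernoulli arms with Beta posteriors, not the authors' implementation: the threshold `c`, the uniform Beta(1, 1) priors, and the Monte Carlo estimate of the posterior probability of a small difference are all assumptions made for the sake of a self-contained example.

```python
import random

def ts_postdiff_choose(successes, failures, c=0.1, n_mc=1000, rng=random):
    """Choose an arm (0 or 1) via the TS PostDiff heuristic (illustrative sketch).

    successes/failures: per-arm counts giving Beta(1 + s, 1 + f) posteriors
    (uniform priors assumed). c is the 'small difference' threshold.
    """
    # Monte Carlo estimate of the posterior probability that |p0 - p1| < c.
    close = 0
    for _ in range(n_mc):
        p0 = rng.betavariate(1 + successes[0], 1 + failures[0])
        p1 = rng.betavariate(1 + successes[1], 1 + failures[1])
        if abs(p0 - p1) < c:
            close += 1
    prob_small_diff = close / n_mc

    # With that probability, fall back to uniform random allocation.
    if rng.random() < prob_small_diff:
        return rng.randrange(2)

    # Otherwise, standard Thompson Sampling: sample each posterior, pick the best.
    draws = [rng.betavariate(1 + successes[i], 1 + failures[i]) for i in range(2)]
    return 0 if draws[0] >= draws[1] else 1
```

When the posteriors overlap heavily, `prob_small_diff` is near 1 and allocation is close to uniform; once one arm clearly dominates, it falls toward 0 and the rule behaves like plain TS.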
This work explores three-player game training dynamics: under what conditions three-player games converge, and the equilibria they converge on. In contrast to prior work, we examine a three-player game architecture in which all players explicitly interact with one another. Prior work analyzes games in which two of the three agents interact with only one other player, constituting dual two-player games. We explore three-player game training dynamics using an extended version of a simplified bilinear smooth game, called a simplified trilinear smooth game. We find that trilinear games do not converge on the Nash equilibrium in most cases, instead converging on a fixed point that is optimal for two players but not for the third. Further, we explore how the order of updates influences convergence. In addition to alternating and simultaneous updates, we explore a new update order, maximizer-first, which is only possible in a three-player game. We find that three-player games can converge on a Nash equilibrium using maximizer-first updates. Finally, we experiment with differing momentum values for each player in a trilinear smooth game under all three update orders and show that maximizer-first updates achieve more optimal results for a larger set of player-specific momentum value triads than the other update orders.

* Equal contribution. Listed alphabetically.
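The three update orders compared above can be made concrete with a small sketch. The payoff f(x, y, z) = x·y·z, the assignment of x and y as minimizers and z as the maximizer, and the learning rate are all illustrative assumptions; the paper's exact game and player roles may differ. What the sketch shows is purely how the update order changes which gradients each player sees.

```python
def trilinear_step(x, y, z, lr=0.1, order="simultaneous"):
    """One gradient step on the illustrative trilinear game f(x, y, z) = x*y*z.

    x and y take gradient-descent steps (minimizers); z takes a gradient-ascent
    step (maximizer). Gradients: df/dx = y*z, df/dy = x*z, df/dz = x*y.
    """
    if order == "simultaneous":
        # All players update from the same current point.
        return x - lr * y * z, y - lr * x * z, z + lr * x * y
    if order == "alternating":
        # x, then y, then z, each seeing the earlier players' updates.
        x = x - lr * y * z
        y = y - lr * x * z
        z = z + lr * x * y
        return x, y, z
    if order == "maximizer-first":
        # The maximizer moves first; both minimizers respond to its new value.
        z = z + lr * x * y
        x = x - lr * y * z
        y = y - lr * x * z
        return x, y, z
    raise ValueError(f"unknown update order: {order!r}")
```

In this toy game the origin (0, 0, 0) is a fixed point of every update order, which makes it a convenient sanity check when iterating the step function from different starting points.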