Jad Rahme scite author profile

Why Generalization in RL is Difficult: Epistemic POMDPs and Implicit Partial Observability

Generalization is a central challenge for the deployment of reinforcement learning (RL) systems in the real world. In this paper, we show that the sequential structure of the RL problem necessitates new approaches to generalization beyond the well-studied techniques used in supervised learning. While supervised learning methods can generalize effectively without explicitly accounting for epistemic uncertainty, we show that, perhaps surprisingly, this is not the case in RL. We show that generalization to unseen test conditions from a limited number of training conditions induces implicit partial observability, effectively turning even fully observed MDPs into POMDPs. Informed by this observation, we recast the problem of generalization in RL as solving the induced partially observed Markov decision process, which we call the epistemic POMDP. We demonstrate the failure modes of algorithms that do not appropriately handle this partial observability, and suggest a simple ensemble-based technique for approximately solving the partially observed problem. Empirically, we demonstrate that our simple algorithm derived from the epistemic POMDP achieves significant gains in generalization over current methods on the Procgen benchmark suite.

A Permutation-Equivariant Neural Network Architecture For Auction Design

Rahme

Jelassi

Bruna

et al. 2020

Preprint

Designing an incentive compatible auction that maximizes expected revenue is a central problem in Auction Design. Theoretical approaches to the problem have hit some limits in the past decades and analytical solutions are known for only a few simple settings. Computational approaches to the problem through the use of LPs have their own set of limitations. Building on the success of deep learning, a new approach was recently proposed by Dütting et al. (2017) in which the auction is modeled by a feed-forward neural network and the design problem is framed as a learning problem. The neural architectures used in that work are general purpose and do not take advantage of any of the symmetries the problem could present, such as permutation equivariance. In this work, we consider auction design problems that have permutation-equivariant symmetry and construct a neural architecture that is capable of perfectly recovering the permutation-equivariant optimal mechanism, which we show is not possible with the previous architecture. We demonstrate that permutation-equivariant architectures are not only capable of recovering previous results, they also have better generalization properties.
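To make the symmetry named in this abstract concrete, here is a minimal Python sketch of a permutation-equivariant linear map on a bidders-by-items bid matrix. It is an illustration only and not the authors' architecture: the function names, the scalar weights, and the choice to mix each bid with its row, column, and global averages are all assumptions made for this example.

```python
import math
import random

def equivariant_layer(bids, w_self, w_row, w_col, w_all, bias):
    """Hypothetical layer: each output entry mixes its own bid with the
    row (bidder) mean, the column (item) mean, and the global mean."""
    n, m = len(bids), len(bids[0])
    row_mean = [sum(row) / m for row in bids]
    col_mean = [sum(bids[i][j] for i in range(n)) / n for j in range(m)]
    all_mean = sum(row_mean) / n
    return [[w_self * bids[i][j] + w_row * row_mean[i]
             + w_col * col_mean[j] + w_all * all_mean + bias
             for j in range(m)] for i in range(n)]

def permute(matrix, row_perm, col_perm):
    """Reorder the rows (bidders) and columns (items) of a matrix."""
    return [[matrix[i][j] for j in col_perm] for i in row_perm]

random.seed(0)
bids = [[random.random() for _ in range(4)] for _ in range(3)]  # 3 bidders, 4 items
bidder_perm = random.sample(range(3), 3)
item_perm = random.sample(range(4), 4)
weights = (0.7, -0.2, 0.5, 0.1, 0.05)  # arbitrary illustrative values

out = equivariant_layer(bids, *weights)
out_of_permuted = equivariant_layer(permute(bids, bidder_perm, item_perm), *weights)
permuted_out = permute(out, bidder_perm, item_perm)

# Equivariance: permuting the input and then applying the layer matches
# applying the layer and then permuting the output.
is_equivariant = all(
    math.isclose(a, b, abs_tol=1e-12)
    for row_a, row_b in zip(out_of_permuted, permuted_out)
    for a, b in zip(row_a, row_b)
)
print(is_equivariant)
```

Because the weights are shared across all bidders and items, relabeling bidders or items only relabels the output, which is the property a general-purpose feed-forward network does not guarantee.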

Auction learning as a two-player game

Rahme

Jelassi

Weinberg

2020

Preprint

Designing an incentive compatible auction that maximizes expected revenue is a central problem in Auction Design. While theoretical approaches to the problem have hit some limits, a recent research direction initiated by Duetting et al. (2019) consists in building neural network architectures to find optimal auctions. We propose two conceptual deviations from their approach which result in enhanced performance. First, we use recent results in theoretical auction design (Rubinstein and Weinberg, 2018) to introduce a time-independent Lagrangian. This not only circumvents the need for an expensive hyper-parameter search (as in prior work), but also provides a principled metric to compare the performance of two auctions (absent from prior work). Second, the optimization procedure in previous work uses an inner maximization loop to compute optimal misreports. We amortize this process through the introduction of an additional neural network. We demonstrate the effectiveness of our approach by learning competitive or strictly improved auctions compared to prior work. Both results together further imply a novel formulation of Auction Design as a two-player game with stationary utility functions.

A Permutation-Equivariant Neural Network Architecture For Auction Design

Rahme

Jelassi

Bruna

et al. 2021

AAAI

Designing an incentive compatible auction that maximizes expected revenue is a central problem in Auction Design. Theoretical approaches to the problem have hit some limits in the past decades and analytical solutions are known for only a few simple settings. Computational approaches to the problem through the use of LPs have their own set of limitations. Building on the success of deep learning, a new approach was recently proposed by Duetting et al. (2019) in which the auction is modeled by a feed-forward neural network and the design problem is framed as a learning problem. The neural architectures used in that work are general purpose and do not take advantage of any of the symmetries the problem could present, such as permutation equivariance. In this work, we consider auction design problems that have permutation-equivariant symmetry and construct a neural architecture that is capable of perfectly recovering the permutation-equivariant optimal mechanism, which we show is not possible with the previous architecture. We demonstrate that permutation-equivariant architectures are not only capable of recovering previous results, they also have better generalization properties.

Learning Algorithms for Intelligent Agents and Mechanisms

Rahme

2022

Preprint

The ability to learn from past experiences and adapt one's behavior accordingly within an environment or context to achieve a certain goal is a characteristic of a truly intelligent entity. Developing efficient, robust, and reliable learning algorithms towards that end is an active area of research and a major step towards achieving artificial general intelligence. In this thesis, we research learning algorithms for optimal decision making in two different contexts, Reinforcement Learning in Part I and Auction Design in Part II.

Reinforcement learning (RL) is an area of machine learning that is concerned with how an agent should act in an environment in order to maximize its cumulative reward over time. In Chapter 2, inspired by statistical physics, we develop a novel approach to RL that not only learns optimal policies with enhanced desirable properties but also sheds new light on maximum entropy RL. In Chapter 3, we tackle the generalization problem in RL using a Bayesian perspective. We show that imperfect knowledge of the environment's dynamics effectively turns a fully-observed Markov Decision Process (MDP) into a Partially Observed MDP (POMDP) that we call the Epistemic POMDP. Informed by this observation, we develop a new policy learning algorithm, LEEP, which has improved generalization properties.

An auction is a process for organizing the buying and selling of products and services, and it is of great practical importance. Designing an incentive compatible, individually rational auction that maximizes revenue is a challenging and intractable problem. Recently, a deep learning based approach was proposed to learn optimal auctions from data. While successful, this approach suffers from a few limitations, including sample inefficiency, lack of generalization to new auctions, and training difficulties. In Chapter 4, we construct a symmetry-preserving neural network architecture, EquivariantNet, suitable for anonymous auctions. EquivariantNet is not only more sample efficient but is also able to learn auction rules that generalize well to other settings. In Chapter 5, we propose a novel formulation of the auction learning problem as a two-player game. The resulting learning algorithm, ALGNet, is easier to train, more reliable and better suited for non-stationary settings.

First of all, I would like to thank my adviser, Ryan Adams, for his guidance and support throughout my doctoral studies at Princeton. I'm grateful for his availability, flexibility, and generosity, especially when it comes to sharing research ideas and insights on a wide range of topics.

Throughout my PhD, I was fortunate to have the support of many Princeton faculty and administrative staff. I'm grateful to Matt Weinberg for his guidance and mentorship. His knowledge and expertise in mechanism design were critical when it came to bringing the second part of this thesis to fruition. I would also like to thank Peter Ramadge, Szymon Rusinkiewicz, Karthik Narasimhan for completing my thesis committee. I extend my gratitude to the supportive PACM...

Jad Rahme

Why Generalization in RL is Difficult: Epistemic POMDPs and Implicit Partial Observability
