The ability to learn from past experiences and adapt one's behavior accordingly within an environment or context to achieve a certain goal is a characteristic of a truly intelligent entity. Developing efficient, robust, and reliable learning algorithms towards that end is an active area of research and a major step towards achieving artificial general intelligence. In this thesis, we research learning algorithms for optimal decision making in two different contexts, Reinforcement Learning in Part I and Auction Design in Part II.Reinforcement learning (RL) is an area of machine learning that is concerned with how an agent should act in an environment in order to maximize its cumulative reward over time. In Chapter 2, inspired by statistical physics, we develop a novel approach to RL that not only learns optimal policies with enhanced desirable properties but also sheds new light on maximum entropy RL. In Chapter 3, we tackle the generalization problem in RL using a Bayesian perspective. We show that imperfect knowledge of the environment's dynamics effectively turn a fully-observed Markov Decision Process (MDP) into a Partially Observed MDP (POMDP) that we call the Epistemic POMDP.Informed by this observation, we develop a new policy learning algorithm LEEP which has improved generalization properties.An auction is the process of organizing the buying and selling of products and services that is of great practical importance. Designing an incentive compatible, individually rational auction that maximizes revenue is a challenging and intractable problem. Recently, a deep learning based approach was proposed to learn optimal auctions from data. While successful, this approach suffers from a few limitations, including sample inefficiency, lack of generalization to new auctions, and training difficulties. In Chapter 4, we construct a symmetry preserving neural network architecture, EquivariantNet, suitable for anonymous auctions. EquivariantNet is not only more sample efficient but is also able to learn auction rules that generalize well to iii other settings. In Chapter 5, we propose a novel formulation of the auction learning problem as a two player game. The resulting learning algorithm, ALGNet, is easier to train, more reliable and better suited for non stationary settings. iv First of all, I would like to thank my adviser, Ryan Adams, for his guidance and support throughout my doctoral studies at Princeton. I'm grateful for his availability, flexibility, and generosity, especially when it comes to sharing research ideas and insights on a wide range of topics.Throughout my PhD, I was fortunate to have the support of many Princeton faculty and administrative staff. I'm grateful to Matt Weinberg for his guidance and mentorship. His knowledge and expertise in mechanism design were critical when it came to bringing the second part of this thesis to fruition. I would also like to thank Peter Ramadge, Szymon Rusinkiewicz, Karthik Narasimhan for completing my thesis committee. I extend my gratitude to the supportive PACM...