We consider the problem of modeling the trajectories of drivers in a road network from the perspective of inverse reinforcement learning. As rational agents, drivers seek to maximize a reward function that is unknown to an external observer as they form their trajectories. We apply the concept of random utility from microeconomic theory to model this unknown reward function as a function of observable features plus an error term that represents features known only to the driver. We develop a parameterized generative model for the trajectories based on a random utility Markov decision process formulation of drivers' decisions. We show that maximum entropy inverse reinforcement learning is a particular case of our proposed formulation when the unobserved reward error terms are assumed to follow a Gumbel distribution. We illustrate Bayesian inference on the model parameters through a case study with real trajectory data from a large city, obtained from sensors placed at sparsely distributed points on the street network.
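The Gumbel connection claimed in the abstract can be made concrete through the classical Gumbel-max identity from random utility theory: if an agent chooses the action that maximizes observable utility plus independent standard Gumbel noise, the resulting choice probabilities are exactly the softmax (multinomial logit) probabilities that appear in maximum entropy IRL policies. The sketch below checks this identity numerically; the utility values are made up for illustration and are not taken from the paper's model or data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-action utilities theta^T phi(s, a) at a single state
# (illustrative values only; the paper's features and weights differ).
utilities = np.array([1.0, 0.2, -0.5])

def random_utility_choice(u, rng, n_samples=200_000):
    """Empirical choice frequencies when the agent maximizes
    observable utility plus i.i.d. standard Gumbel error terms."""
    noise = rng.gumbel(size=(n_samples, len(u)))
    choices = np.argmax(u + noise, axis=1)
    return np.bincount(choices, minlength=len(u)) / n_samples

def softmax(u):
    """Closed-form logit probabilities, the form a maximum entropy
    IRL policy takes at a single decision point."""
    z = np.exp(u - u.max())
    return z / z.sum()

print(random_utility_choice(utilities, rng))  # empirical frequencies
print(softmax(utilities))                     # closed-form probabilities
```

Running the sketch, the empirical frequencies of the noisy argmax match the closed-form softmax probabilities up to sampling error, which is the sense in which the Gumbel error assumption recovers the maximum entropy IRL choice model.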