Despite the ubiquity of transportation data, methods to infer the state parameters of a network either ignore the sensitivity of route decisions, require route enumeration to parameterize descriptive models of route selection, or require complex bilevel models of route assignment behavior. These limitations prevent modelers from fully exploiting this data in monitoring transportation networks. Inverse optimization methods that capture network route choice behavior can address this gap, but they are designed to take observations of the same model to learn the parameters of that model, which is statistically inefficient (e.g., it requires estimating population route and link flows). New inverse optimization models and supporting algorithms are proposed to learn the parameters of heterogeneous travelers' route behavior and thereby infer shared network state parameters (e.g., link capacity dual prices). The inferred values are consistent with observations of each agent's optimization behavior. We prove that the method can obtain unique dual prices for a network shared by these agents in polynomial time. Four experiments are conducted. The first, on a 4-node network, verifies that the methodology obtains heterogeneous link cost parameters even when multinomial or mixed logit models could not be meaningfully estimated. The second is a parameter recovery test on the Nguyen-Dupuis network, which shows that unique latent link capacity dual prices can be inferred using the proposed method. The third test, on the same network, demonstrates how a monitoring system in an online learning environment can be designed using this method. The last test demonstrates this learning on real data obtained from a freeway network in Queens, New York, using only real-time Google Maps queries.
Traditionally, vehicles act only as servers in transporting passengers and goods. With increasing sensor equipment in vehicles, including automated vehicles, there is a need to test algorithms that consider the dual role of vehicles as both servers and sensors. The paper formulates a sequential route selection problem as a shortest path problem with on-time arrival reliability under a multi-armed bandit setting, a type of reinforcement learning model. A decision-maker has to make a finite set of decisions sequentially on departure time and path between a fixed origin-destination pair such that on-time reliability is maximized while travel time is minimized. The upper confidence bound algorithm is extended to handle this problem. Several tests are conducted. First, simulated data are used to verify the method. Then a real-data scenario is constructed of a hotel shuttle service from midtown Manhattan in New York City providing hourly access to John F. Kennedy International Airport. Results suggest that route selection with multi-armed bandit learning algorithms can be effective, but that neglecting passenger scheduling constraints can reduce on-time arrival reliability by as much as 4.8% and worsen combined reliability and travel time by as much as 66.1%.
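To make the bandit framing above concrete, the following is a minimal sketch of the standard UCB1 rule applied to repeated route choice. It is not the extended algorithm from the paper: it ignores departure time and the travel-time objective, and treats each route as an arm with a 0/1 reward for on-time arrival. The function names, the Gaussian travel-time model, and all numeric values are hypothetical and chosen only for illustration.

```python
import math
import random


def ucb1_route_selection(sample_travel_time, n_routes, deadline, n_trips, seed=0):
    """Plain UCB1 over routes; reward is 1 if the trip meets the deadline, else 0.

    sample_travel_time(route, rng) is a hypothetical callback that returns one
    observed travel time for the chosen route.
    """
    rng = random.Random(seed)
    counts = [0] * n_routes          # number of trips taken on each route
    on_time_sum = [0.0] * n_routes   # number of on-time arrivals on each route
    choices = []

    for t in range(1, n_trips + 1):
        if t <= n_routes:
            # Initialization: try every route once.
            route = t - 1
        else:
            # UCB1 index: empirical on-time rate plus an exploration bonus.
            route = max(
                range(n_routes),
                key=lambda r: on_time_sum[r] / counts[r]
                + math.sqrt(2.0 * math.log(t) / counts[r]),
            )
        travel_time = sample_travel_time(route, rng)
        reward = 1.0 if travel_time <= deadline else 0.0
        counts[route] += 1
        on_time_sum[route] += reward
        choices.append(route)

    return counts, choices


def toy_travel_time(route, rng):
    """Assumed toy model: Gaussian travel times (minutes) for three routes."""
    means = [50.0, 50.0, 65.0]
    stds = [2.0, 20.0, 5.0]
    return rng.gauss(means[route], stds[route])


if __name__ == "__main__":
    counts, _ = ucb1_route_selection(toy_travel_time, 3, deadline=60.0, n_trips=2000)
    print("trips per route:", counts)
```

In this toy setup, routes 0 and 1 have the same mean travel time, but route 1 is far more variable, so a learner rewarded on on-time arrival should come to favor route 0. This illustrates why reliability, and not mean travel time alone, drives the choice.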