The World Wide Web Conference 2019
DOI: 10.1145/3308558.3313433

Efficient Ridesharing Order Dispatching with Mean Field Multi-Agent Reinforcement Learning

Abstract: A fundamental question in any peer-to-peer ridesharing system is how to dispatch users' ride requests to the right drivers in real time, both effectively and efficiently. Traditional rule-based solutions usually work on a simplified problem setting, which requires a sophisticated hand-crafted weight design for either centralized authority control or decentralized multi-agent scheduling systems. Although recent approaches have used reinforcement learning to provide centralized combinatorial optimization algorit…

Cited by 200 publications (120 citation statements)
References 31 publications
“…For all learning methods, following [13], we run 20 episodes for training, store the trained model periodically, and conduct the evaluation on the stored model with 5 random seeds. We compare the performance of different models regarding two criteria, including ADI, computed as the total income in a day, and ORR, calculated by the number of orders taken divided by the number of orders generated.…”
Section: Results Analysis
confidence: 99%
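The two evaluation criteria quoted above can be sketched in a few lines. This is a minimal illustration, not the cited paper's code; the order-record fields (`price`, `served`) are assumptions introduced here.

```python
# ADI: accumulated driver income over a day (sum of prices of served orders).
# ORR: order response rate (orders taken / orders generated).

def evaluate(orders):
    """Compute ADI and ORR from a day's order records."""
    adi = sum(o["price"] for o in orders if o["served"])
    orr = sum(1 for o in orders if o["served"]) / len(orders)
    return adi, orr

orders = [
    {"price": 12.0, "served": True},
    {"price": 8.5,  "served": True},
    {"price": 20.0, "served": False},
    {"price": 5.0,  "served": True},
]
adi, orr = evaluate(orders)
print(adi, orr)  # 25.5 0.75
```

ORR rewards serving many orders even when each is cheap, while ADI rewards total revenue, so the two criteria can pull a dispatch policy in different directions.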
See 1 more Smart Citation
“…For all learning methods, following [13], we run 20 episodes for training, store the trained model periodically, and conduct the evaluation on the stored model with 5 random seeds. We compare the performance of different models regarding two criteria, including ADI, computed as the total income in a day, and ORR, calculated by the number of orders taken divided by the number of orders generated.…”
Section: Results Analysismentioning
confidence: 99%
“…It only randomly assigns available orders to idle vehicles at each timestep. • DQN: Li et al [13] conducted action-value function approximation based on a Q-network. The Q-network is parameterized by an MLP with four hidden layers; ReLU activations are applied between hidden layers, and the final output of the Q-network is linear.…”
Section: Compared Methods
confidence: 99%
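The DQN baseline described above can be sketched as a forward pass through an MLP with four ReLU hidden layers and a linear output head. The layer widths and the plain-Python implementation below are illustrative assumptions; the cited work presumably used a deep-learning framework.

```python
import random

def relu(v):
    return [max(0.0, x) for x in v]

def linear(v, W, b):
    # Affine map: one output per row of W.
    return [sum(w * x for w, x in zip(row, v)) + bi for row, bi in zip(W, b)]

def init_layer(n_in, n_out, rng):
    W = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return W, [0.0] * n_out

def q_network(state, layers):
    """Q-network: four ReLU hidden layers, then a linear output layer."""
    h = state
    for W, b in layers[:-1]:
        h = relu(linear(h, W, b))
    W, b = layers[-1]
    return linear(h, W, b)  # one Q-value per candidate action (order)

rng = random.Random(0)
sizes = [8, 64, 64, 64, 64, 4]  # assumed: state dim 8, four hidden widths, 4 actions
layers = [init_layer(sizes[i], sizes[i + 1], rng) for i in range(len(sizes) - 1)]
q_values = q_network([0.5] * 8, layers)
print(len(q_values))  # 4
```

Greedy dispatching then picks the action (order) with the largest Q-value for each idle vehicle.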
“…RL is intended to capture the interactions among a large volume of vehicles in an adaptive manner. However, due to the curse of dimensionality, in practice it is used in conjunction with an approximation technique, which often degrades the performance of this approach in large-scale fleet management [24,25]. RL methods also require a substantial amount of data to learn an efficient dispatch policy by capturing how to utilize various factors in a given transportation system [18,26-28].…”
Section: Introduction
confidence: 99%
“…For example, the allocation of ride requests can be modeled as a combinatorial optimization problem, and the acceptance of an allocated request can be modeled by a probability distribution [13]. Moreover, an independent and cooperative ride-request allocation algorithm was also proposed [24]. In [38], an RL-based algorithm that can allocate large-scale ride requests in real time was proposed.…”
Section: Introduction
confidence: 99%
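The combinatorial-optimization view quoted above amounts to a one-to-one assignment of idle drivers to open orders. The greedy nearest-pair matching below is only a sketch under simplified assumptions (1-D positions, distance as the sole cost); production dispatchers solve the assignment exactly (e.g. via the Hungarian algorithm) and model rider acceptance probabilistically.

```python
def dispatch(drivers, orders):
    """Greedily match each open order to the closest remaining idle driver."""
    # Enumerate all (distance, driver, order) pairs and process nearest first.
    pairs = sorted(
        (abs(d_pos - o_pos), d, o)
        for d, d_pos in drivers.items()
        for o, o_pos in orders.items()
    )
    used_d, used_o, match = set(), set(), {}
    for _, d, o in pairs:
        if d not in used_d and o not in used_o:
            match[o] = d
            used_d.add(d)
            used_o.add(o)
    return match

drivers = {"d1": 0.0, "d2": 5.0, "d3": 9.0}  # driver id -> position
orders = {"o1": 4.0, "o2": 10.0}             # order id -> pickup position
print(dispatch(drivers, orders))  # {'o1': 'd2', 'o2': 'd3'}
```

Greedy matching is fast but can be arbitrarily worse than the optimal assignment, which is one motivation for the learned dispatch policies discussed in this literature.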
“…However, for centralized approaches, a critical issue is the potential "single point of failure" [18], i.e., a failure of the centralized authority control will fail the whole system [16]. Two other related works that use multiple agents to learn order dispatching are based on mean-field MARL [13] and knowledge transfer [35].…”
Section: Introduction
confidence: 99%