2020 IEEE International Conference on Big Data (Big Data) 2020
DOI: 10.1109/bigdata50022.2020.9378191

Dynamic Dispatching for Large-Scale Heterogeneous Fleet via Multi-agent Deep Reinforcement Learning

Citations: cited by 21 publications (13 citation statements)
References: 12 publications
“…As this process is repeated, the agent gradually learns to seek out positive actions and avoid negative ones and eventually determines an optimal (maximum‐reward) path to its goal. [ 18–20 ] Unlike supervised learning algorithms, reinforcement learning does not need correct answers or targets to solve a given problem, it only needs information on whether an answer from the ML agent is in the correct direction. Reinforcement learning has useful applications in robotics, as well as optimization.…”
Section: AI Paradigms Techniques and Workflows (mentioning)
confidence: 99%
“…Recurrent Neural Networks have been used to optimize delivery and dispatch services by autonomous guided vehicles while avoiding conflicts with workers or other vehicles. [ 54 ] RL has also been used to optimize dispatch within a facility [ 20 ] (such as a factory floor or warehouse), and for job‐shop scheduling—where one product requiring several tasks which must be completed on separate machines while ensuring an optimal layout of equipment. [ 55 ] Other use case applications for RL include improving the ability of robots to identify and pick out an object from specific bins, [ 56 ] select the best paths to minimize unnecessary stops, and avoid obstacles and interference with human operators.…”
Section: AI/ML Applications in Manufacturing (mentioning)
confidence: 99%
“…Foerster, Assael, Freitas, and Whiteson proposed a single network with shared parameters to reduce the number of learned parameters and speed up the learning [12]. In a previous work, we applied a similar approach to address a dynamic dispatching problem in an industrial multi-agent environment [13]. Having a shared policy among agents solves the agent failure challenge and addresses the non-stationary problem to some extent.…”
Section: Multi-agent Deep RL Literature Review (mentioning)
confidence: 99%
“…By modeling each router using a DQN, each router is able to account for heterogeneous data about its environment, which allows for the optimization of more complicated cost functions, such as the simultaneous optimization of bag delivery time and energy consumption in a baggage handling system. Zhang et al (2020a) proposed a centralized multi-agent DQN approach for the open-pit mining operational planning (OPMOP) problem (an NP-hard problem that seeks to balance the tradeoffs between mine productivity and operational costs), which works based on learning the memories from heterogeneous agents. Open-pit mine dispatch decisions coordinate the route planning of trucks to shovels and dumps for the loading and delivery of ore.…”
Section: Pathfinding + Scheduling (mentioning)
confidence: 99%
“…Zhang et al (2020a) proposed a centralized multi-agent DQN approach for the open-pit mining operational planning (OPMOP) problem (an NP-hard problem that seeks to balance the tradeoffs between mine productivity and operational costs), which works based on learning the memories from heterogeneous agents. Open-pit mine dispatch decisions coordinate the route planning of trucks to shovels and dumps for the loading and delivery of ore.…”
Section: Applications (mentioning)
confidence: 99%