2021
DOI: 10.48550/arxiv.2102.11756
Preprint

Deep Policy Dynamic Programming for Vehicle Routing Problems

Abstract: Routing problems are a class of combinatorial problems with many practical applications. Recently, end-to-end deep learning methods have been proposed to learn approximate solution heuristics for such problems. In contrast, classical dynamic programming (DP) algorithms can find optimal solutions, but scale badly with the problem size. We propose Deep Policy Dynamic Programming (DPDP), which aims to combine the strengths of learned neural heuristics with those of DP algorithms. DPDP prioritizes and restricts th…
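
The abstract is cut off above, so the following is only a generic illustration of the idea it names (a dynamic program whose states are prioritized and pruned by a score), not the authors' implementation. It sketches a beam-restricted Held-Karp style DP for the TSP in Python. The function name `restricted_dp_tsp`, the `heat` matrix (a hypothetical stand-in for scores that a learned model might supply), and the `beam_size` parameter are all assumptions made for this example.

```python
import math


def restricted_dp_tsp(dist, heat, beam_size):
    """Beam-restricted Held-Karp style DP for a TSP tour that starts and ends at node 0.

    dist      -- n x n matrix of edge costs
    heat      -- n x n matrix of edge scores (hypothetical stand-in for a learned heatmap)
    beam_size -- maximum number of DP states kept after each step
    """
    n = len(dist)
    # DP state: (bitmask of visited nodes, current node) -> (cost, score, path)
    beam = {(1, 0): (0.0, 0.0, [0])}
    for _ in range(n - 1):
        expanded = {}
        for (mask, cur), (cost, score, path) in beam.items():
            for nxt in range(n):
                if mask & (1 << nxt):
                    continue  # node already visited
                key = (mask | (1 << nxt), nxt)
                cand = (cost + dist[cur][nxt], score + heat[cur][nxt], path + [nxt])
                # DP dominance: keep only the cheapest way to reach each state
                if key not in expanded or cand[0] < expanded[key][0]:
                    expanded[key] = cand
        # Restriction: keep only the beam_size states with the highest score
        ranked = sorted(expanded.items(), key=lambda kv: -kv[1][1])
        beam = dict(ranked[:beam_size])
    # Close the tour by returning to node 0 and pick the cheapest result
    best_cost, best_path = math.inf, None
    for (_, cur), (cost, _, path) in beam.items():
        total = cost + dist[cur][0]
        if total < best_cost:
            best_cost, best_path = total, path + [0]
    return best_cost, best_path
```

With a beam large enough to hold every state, this sketch reduces to exact Held-Karp; shrinking `beam_size` trades optimality for speed, and the quality of the result then depends on how well the scores rank promising states.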

Cited by 15 publications (29 citation statements)
References 38 publications
“…Joshi et al (2019) use a Graph Convolutional Network to construct TSP tours and show that by utilizing a parallelized beam search, auto-regressive construction approaches for the TSP can be outperformed. Kool et al (2021) extend the proposed model by Joshi et al (2019) for the CVRP while creating a hybrid approach that initiates partial solutions using a heatmap representation as a preprocessing step, before training a policy to create partial solutions and refining these through dynamic programming. Kaempfer & Wolf (2018) extend the learned heatmap approach to the number of tours to be constructed; their Permutation Invariant Pooling Network addresses the mTSP (a TSP involving multiple tours but no additional capacity constraints), where feasible solutions are obtained via a beam search and have been proven to outperform a meta-heuristic mTSP solver.…”
Section: Related Work (mentioning)
confidence: 99%
“…In this section, we provide generalization results for the test set provided in Kool et al (2021). We compare the LKH method and DPDP (Kool et al (2021)) to the generalization results that were achieved after training our model on uniformly-distributed data of graph size 50 only.…”
Section: D4 Evaluation On More Realistic Test Instances (mentioning)
confidence: 99%