A new multiagent reinforcement learning algorithm to solve the symmetric traveling salesman problem

Alipour, Mir Mohammad; Razavi, Seyed Naser

doi:10.3233/mgs-150232

Cited by 15 publications

(6 citation statements)

References 32 publications

“…It is important to point out that refueling problems are usually classified into four groups [27]: refueling with fixed route, refueling with variable route, TSP with uniform cost at each point and TSP with the fuel cost varying in the localities. In this sense, the last class can be applied to treat refueling in road networks in Brazil, where fuel price variations are found in each city according to data from the Brazilian National Petroleum Agency (ANP).…”

Section: Introduction

mentioning

confidence: 99%

“…Reinforcement learning (RL) is an artificial intelligence technique with relevant applications in robotics [8,15,28–30,37], path planning [20,39,47,59,75,76] and combinatorial optimization problems [4,7,13,14,21,44,53,54,64,79], such as the TSP [1,2,18,22,41,45,52,66,81]. In RL, an agent learns from rewards and penalties in interacting with an environment [68].…”

Section: Introduction

mentioning

confidence: 99%

Reinforcement learning for the traveling salesman problem with refueling

Ottoni

Nepomuceno

Oliveira

et al. 2021

Complex Intell. Syst.

The traveling salesman problem (TSP) is one of the best-known combinatorial optimization problems. Many methods derived from TSP have been applied to study autonomous vehicle route planning with fuel constraints. Nevertheless, less attention has been paid to reinforcement learning (RL) as a potential method to solve refueling problems. This paper employs RL to solve the traveling salesman problem with refueling (TSPWR). It proposes a model (actions, states, reinforcements) and the RL-TSPWR algorithm. The focus is on the analysis of RL parameters and on the influence of refueling on learning routes that optimize fuel cost. Two RL algorithms, Q-learning and SARSA, are compared. In addition, RL parameter estimation is performed by Response Surface Methodology, Analysis of Variance and the Tukey test. The proposed method achieves the best solution in 15 out of 16 case studies.
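The abstract above compares Q-learning and SARSA. As a rough illustration of how the two differ, the following minimal sketch shows the textbook tabular update rules. The function names, the dictionary-based Q-table, and the default values of `alpha` and `gamma` are assumptions made for this example; they are not taken from the RL-TSPWR paper.

```python
def q_learning_update(q, s, a, r, s_next, actions_next, alpha=0.1, gamma=0.9):
    """Off-policy update: bootstrap from the best action available in s_next."""
    # Unseen (state, action) pairs default to 0.0; an empty action list
    # (terminal state) contributes no future value.
    best_next = max((q.get((s_next, a2), 0.0) for a2 in actions_next), default=0.0)
    old = q.get((s, a), 0.0)
    q[(s, a)] = old + alpha * (r + gamma * best_next - old)
    return q[(s, a)]


def sarsa_update(q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.9):
    """On-policy update: bootstrap from the action actually chosen in s_next."""
    # a_next is None when s_next is terminal (hypothetical convention).
    next_value = q.get((s_next, a_next), 0.0) if a_next is not None else 0.0
    old = q.get((s, a), 0.0)
    q[(s, a)] = old + alpha * (r + gamma * next_value - old)
    return q[(s, a)]
```

In a TSP-style setting one might take the state to be the current city, the action to be the next city, and the reward to be the negative travel or fuel cost, but that mapping is only one possible modeling choice.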

“…RL has been applied in many fields, such as robotics, control, multiagent systems and optimization (Gambardella and Dorigo 2000; Kober et al 2013; Shao et al 2014; Bianchi et al 2015; Yliniemi and Tumer 2016; Da Silva et al 2019; Mnih et al 2015; Asiain et al 2019; Alipour et al 2018; Carvalho et al 2019; Li et al 2019; Low et al 2019; Bazzan 2019; Da Silva et al 2019). A growing interest in applying RL can be seen in combinatorial optimization (Gambardella and Dorigo 1995; Likas et al 1995; Miagkikh and Punch 1999; Mariano and Morales 2000; Sun et al 2001; Ma et al 2008; Liu and Zeng 2009; Lima Júnior et al 2010; Santos et al 2014; Alipour and Razavi 2015; Alipour et al 2018; Ottoni et al 2018; Woo et al 2018; Miki et al 2018; Chhabra and Warn 2019), such as the travelling salesman problem (TSP) (Gambardella and Dorigo 1995; Alipour et al 2018), the Job-Shop Problem (Zhang and Dietterich 1995; Cunha et al 2020), the K-Server Problem (Costa et al 2016) and the multidimensional knapsack problem (MKP) (Arin and Rabadi 2017; Ottoni et al 2017). Although it seems evident that a great number of works have been devoted to solving combinatorial optimization, less attention has been paid to the sequential ordering problem (SOP)…”

Section: Introduction

mentioning

confidence: 99%

Tuning of reinforcement learning parameters applied to SOP using the Scott–Knott method

et al. 2019

In this paper, we present a technique to tune the reinforcement learning (RL) parameters applied to the sequential ordering problem (SOP) using the Scott-Knott method. RL has been widely recognized as a powerful tool for combinatorial optimization problems, such as the travelling salesman and multidimensional knapsack problems. It seems, however, that less attention has been paid to solving the SOP. Here, we have developed an RL structure to solve the SOP that can partially fill that gap. Two traditional RL algorithms, Q-learning and SARSA, have been employed. Three learning specifications have been adopted to analyze the performance of the RL: algorithm type, reinforcement learning function, and parameter. A complete factorial experiment and the Scott-Knott method are used to find the best combination of factor levels when the source of variation is statistically different in the analysis of variance. The performance of the proposed RL has been tested using benchmarks from the TSPLIB library. In general, the selected parameters indicate that SARSA outperforms Q-learning.

“…There are many applications of the shortest Hamiltonian path (SHP) problem, e.g. travelling salesman problem [2,3], routing problem with time windows [9], vehicle routing problem [8,9], generalized travelling salesman problem [38,41], warehouse management [33], etc. In a variant of the vehicle routing problem, there are some clusters of the customers and the capacitated vehicle should traverse the clusters to serve demands [38].…”

mentioning

confidence: 99%

“…Bula [7] modeled a hazardous vehicle routing problem as some Hamiltonian circuit with the same start and end nodes in a given depot node. Stetsyuk [35] considered a complete graph and he formulated the problem as a mixed-integer problem with at most 2n^2 variables and (n + 1)…”

mentioning

confidence: 99%

Design of experiment for tuning parameters of an ant colony optimization method for the constrained shortest Hamiltonian path problem in the grid networks

Abdolhosseinzadeh

Alipour

2021

NACO

Self Cite

In a grid network, the nodes can be traversed either horizontally or vertically. The constrained shortest Hamiltonian path goes over the nodes between a source node and a destination node, and it is constrained to traverse some nodes at least once while others could be traversed several times. There are various applications of the problem, especially in routing problems. It is an NP-complete problem, and the well-known Bellman-Held-Karp algorithm can solve the shortest Hamiltonian circuit problem in O(2^n n^2) time; however, the shortest Hamiltonian path problem is more complicated. So, a metaheuristic algorithm based on ant colony optimization is applied to obtain the optimal solution. The proposed method applies the rooted shortest path tree structure, since in the optimal solution the paths between the restricted nodes are the shortest paths. The shortest path tree is then obtained in at most O(n^3) time at any iteration, the ants begin to improve the solution, and the optimal solution is constructed in a reasonable time. The algorithm is verified by some numerical examples, the ant colony parameters are tuned by the design of experiment method, and the optimal settings for different network sizes are determined.
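For reference, the Bellman-Held-Karp dynamic program mentioned in the abstract above can be sketched as follows. This is a generic textbook version for the shortest Hamiltonian circuit on a full distance matrix, not the constrained grid-network variant or the ant colony method the paper studies. The function name `held_karp` and the matrix input format are assumptions for this example.

```python
from itertools import combinations


def held_karp(dist):
    """Length of the shortest Hamiltonian circuit starting and ending at node 0.

    dist is an n x n matrix of pairwise costs. Runs in O(2^n * n^2) time.
    """
    n = len(dist)
    if n <= 1:
        return 0
    # dp[(mask, j)]: cheapest path that starts at node 0, visits exactly the
    # nodes whose bits are set in mask (node 0 excluded), and ends at node j.
    dp = {(1 << j, j): dist[0][j] for j in range(1, n)}
    for size in range(2, n):
        for subset in combinations(range(1, n), size):
            mask = sum(1 << j for j in subset)
            for j in subset:
                prev = mask & ~(1 << j)  # same set without the end node j
                dp[(mask, j)] = min(
                    dp[(prev, k)] + dist[k][j] for k in subset if k != j
                )
    full = (1 << n) - 2  # every node except node 0
    # Close the circuit by returning to node 0.
    return min(dp[(full, j)] + dist[j][0] for j in range(1, n))
```

The table holds one entry per (subset, end node) pair, which is where the 2^n factor comes from, so this is only practical for small instances; that cost is the usual motivation for metaheuristics on larger networks.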
