2019
DOI: 10.1038/s42256-019-0070-z

Solving the Rubik’s cube with deep reinforcement learning and search

Cited by 126 publications (137 citation statements)
References 32 publications
“…This has been applied to a case study of solving a Rubik's Cube, and shown to have advantages in terms of the frequency of finding a solution and the size of the models needed when compared to a random forest-based LGF; however, for more complex problems the number of generations needed is larger. Compared to the work by Agostinelli et al (2019), the results for smaller problems are comparable but quicker to compute, but the combination of policy and value learning in that paper allows reliable solution of more complex problems than the approach in this paper, which relies solely on value function approximation.…”
Section: Summary and Future Work
confidence: 94%
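To illustrate the distinction this statement draws, the sketch below shows search guided only by a value (cost-to-go) estimate, with no learned policy. A toy integer domain stands in for the cube, and an exact heuristic stands in for a trained model; the names and domain are illustrative assumptions, not code from either paper.

```python
# Hypothetical sketch: greedy best-first search guided solely by a value
# (cost-to-go) estimate, i.e. the "value function approximation only" style
# of guidance discussed above. Toy domain: integers, goal state 0.
import heapq

def neighbours(state: int):
    return (state - 1, state + 1)  # stand-ins for cube face turns

def value_estimate(state: int) -> float:
    return abs(state)  # placeholder for a learned cost-to-go network

def best_first_search(start: int, goal: int = 0, max_expansions: int = 10_000):
    frontier = [(value_estimate(start), start, [start])]
    seen = {start}
    for _ in range(max_expansions):
        if not frontier:
            break
        _, state, path = heapq.heappop(frontier)
        if state == goal:
            return path
        for nxt in neighbours(state):
            if nxt not in seen:
                seen.add(nxt)
                heapq.heappush(frontier, (value_estimate(nxt), nxt, path + [nxt]))
    return None  # no solution within the expansion budget

print(best_first_search(5))  # [5, 4, 3, 2, 1, 0]
```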
“…A similar approach has been taken in the recent papers by McAleer et al (2018) and Agostinelli et al (2019), though these are grounded in a reinforcement learning approach rather than a supervised learning approach. Compared with the work in this paper, their algorithm learns a mapping from points in the state space of the cube to a pair consisting of a value and a policy.…”
Section: Deep Learned Guidance Functions
confidence: 99%
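The mapping this statement describes, from a cube state to a pair consisting of a value and a policy, can be sketched as a network with a shared trunk and two output heads. Below is a minimal PyTorch illustration under assumed conventions (54 stickers, 6 colours, 12 quarter-turn moves); the layer sizes and class name are assumptions for illustration, not the architecture published by McAleer et al or Agostinelli et al.

```python
# Hypothetical sketch: a two-headed network mapping a one-hot cube state
# encoding to (value, policy). Sizes and names are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

N_STICKERS, N_COLOURS = 54, 6   # 3x3x3 cube: 54 stickers, 6 colours
N_ACTIONS = 12                  # quarter-turn metric: 6 faces x 2 directions

class CubeValuePolicyNet(nn.Module):
    def __init__(self, hidden: int = 256):
        super().__init__()
        self.trunk = nn.Sequential(
            nn.Linear(N_STICKERS * N_COLOURS, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.value_head = nn.Linear(hidden, 1)           # cost-to-go estimate
        self.policy_head = nn.Linear(hidden, N_ACTIONS)  # move logits

    def forward(self, state_onehot: torch.Tensor):
        h = self.trunk(state_onehot)
        return self.value_head(h).squeeze(-1), self.policy_head(h)

# A batch of two random one-hot cube encodings, just to show the shapes.
colours = torch.randint(0, N_COLOURS, (2, N_STICKERS))
states = F.one_hot(colours, N_COLOURS).float()
value, logits = CubeValuePolicyNet()(states.flatten(1))
print(value.shape, logits.shape)  # torch.Size([2]) torch.Size([2, 12])
```

In this style of approach the value head guides search toward the solved state while the policy head ranks candidate moves, which is what the quoted statement credits with the reliable solution of harder scrambles.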
“…While the goal of this paper is to consider how machine learning algorithms can be fairly compared to humans using video-game benchmarks, a simplified case is also worth considering: the Rubik's Cube, which itself is no stranger to machine learning optimisation (e.g., Korf, 1999; El-Sourani et al, 2010; Agostinelli et al, 2019). The Rubik's Cube, which arguably could be implemented as a video game itself, is a well-known game whose goal state is a 3 × 3 × 3 cube in which each face is a single colour, with six unique colours across the faces.…”
Section: Learning From the Cube
confidence: 99%