While the standard GMRES(m) iterative solver keeps the restart parameter m fixed throughout the solve, in practice varying m can improve convergence. Previous work has exploited this fact either by choosing the restart value at random for each outer iteration or by adapting it based on a measure of the progress made toward the solution in successive iterations. This work describes a novel application of reinforcement learning to the problem of adaptively choosing values of m and compares it to the two existing strategies on matrices drawn from a range of application areas.
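The random-restart baseline mentioned above can be sketched with SciPy's `gmres`: passing `maxiter=1` limits each call to a single outer cycle of m inner iterations, so the surrounding loop controls the restart schedule. This is a minimal illustration only; the test matrix, the sampling range [5, 30], and the cycle budget are assumptions for demonstration, not details from this work.

```python
import numpy as np
from scipy.sparse import diags
from scipy.sparse.linalg import gmres

rng = np.random.default_rng(0)

# Hypothetical test system: a nonsymmetric, diagonally dominant
# tridiagonal matrix (chosen only so the sketch converges quickly).
n = 200
A = diags([-1.0, 2.5, -1.2], offsets=[-1, 0, 1], shape=(n, n), format="csr")
b = np.ones(n)

x = np.zeros(n)
res = np.linalg.norm(b - A @ x)
for cycle in range(30):
    # Random-restart strategy: draw m anew for each outer iteration.
    m = int(rng.integers(5, 31))
    # maxiter=1 restricts gmres to one restart cycle of m inner
    # iterations; the previous iterate x is passed forward as x0.
    x, info = gmres(A, b, x0=x, restart=m, maxiter=1)
    res = np.linalg.norm(b - A @ x)
    if res < 1e-4 * np.linalg.norm(b):
        break

print(res)
```

A progress-based adaptive strategy would replace the random draw with a rule that grows or shrinks m depending on how much the residual fell in the previous cycle; the reinforcement-learning approach instead learns such a policy from data.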