Abstract. One of the main motivations for investigating hyper-heuristic methodologies is to provide a more general search framework than is currently available. Most current search techniques are tailored to specific search problems (and, in some cases, even to specific problem instances). There are many real-world scenarios where the development of such bespoke systems is entirely appropriate. However, there are other situations where it would be beneficial to have methodologies that are applicable to a wider range of problems. One of our motivating goals is to underpin the development of more flexible search methodologies that can be easily and automatically employed on a broader range of problems than is currently possible. Almost all the heuristics that have appeared in the literature have been designed and selected by humans. In this paper, we investigate a simulated annealing hyper-heuristic methodology which operates on a search space of heuristics and which employs a stochastic heuristic selection strategy and a short-term memory. The generality and performance of the proposed algorithm are demonstrated over a large number of benchmark data sets drawn from three very different and difficult (NP-hard) problems: nurse rostering, university course timetabling and one-dimensional bin packing. Experimental results show that the proposed hyper-heuristic is able to achieve significant performance improvements over a recently proposed tabu search hyper-heuristic without lowering the level of generality. We also show that our hyper-heuristic is capable of producing results that are competitive with bespoke meta-heuristic methods for these problems. In some cases, the simulated annealing hyper-heuristic has even obtained considerable improvements over some of the current best problem-specific meta-heuristic approaches.
The contribution of this paper is a method which can be readily (and automatically) applied to very different problems whilst still producing results on benchmark problems which are competitive with those of bespoke, human-designed algorithms for those problems.