The importance of programming education has led to dedicated educational programming environments, where users visually arrange block-based programming constructs that typically control graphical, interactive game-like programs. The Scratch programming environment is particularly popular, with more than 90 million registered users at the time of this writing. While the block-based nature of Scratch helps learners by preventing syntactical mistakes, there nevertheless remains a need to provide feedback and support in order to implement desired functionality. To support individual learning and classroom settings, this feedback and support should ideally be provided in an automated fashion, which requires tests to enable dynamic program analysis. In prior work we introduced Whisker, a framework that enables automated testing of Scratch programs. However, creating these automated tests for Scratch programs is challenging. In this paper, we therefore investigate how to automatically generate Whisker tests. Generating tests for Scratch raises important challenges: First, game-like programs are typically randomised, leading to flaky tests. Second, Scratch programs usually consist of animations and interactions with long delays, inhibiting the application of classical test generation approaches. Thus, the new application domain raises the question of which test generation technique is best suited to produce high coverage tests capable of detecting faulty behaviour. We investigate these questions using an extension of the Whisker test framework for automated test generation. Evaluation on common programming exercises, a random sample of 1000 Scratch user programs, and the 1000 most popular Scratch programs demonstrates that our approach enables Whisker to reliably accelerate test executions, and even though many Scratch programs are small and easy to cover, there are many unique challenges for which advanced search-based test generation using many-objective algorithms is needed in order to achieve high coverage.