Summary
With the development of technology, research on music-based dance generation models has been increasing. Some studies have applied algorithms to dance generation, but these algorithms suffer from problems such as the inability to precisely match music to dance. To address these problems, this study proposes a deep learning model for toddler dance generation based on musical rhythm and beat, which extracts music features and dance features separately. The model generates smooth dance motions through a generator module, improves the consistency between the generated dance and the music through a discriminator, and enhances the representativeness of the audio features through an autoencoder module. Finally, the model is validated by comparing loss functions and other metrics. The results show that the proposed model achieves the smallest loss value of 17.58, and that it produces a better match between dance and music across different pieces of music, with matching scores of 8.9, 8.0, and 7.4. Compared with the baseline models, the proposed model achieves better results in generating dances for young children. The study addresses the problem that previous algorithms could not precisely match music to dance, and has implications for fields such as cross-modal generation and games.
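To make the described generator / discriminator / autoencoder arrangement concrete, the following is a minimal PyTorch-style sketch. All module names, layer sizes, and the assumed audio and pose feature dimensions are illustrative assumptions for exposition, not the authors' actual architecture or training configuration.

```python
# Hedged sketch of the three modules named in the abstract: an audio autoencoder,
# a dance generator, and a music-dance matching discriminator. Feature sizes and
# layer choices below are assumptions, not values from the paper.
import torch
import torch.nn as nn

AUDIO_DIM, POSE_DIM, LATENT_DIM = 438, 219, 256  # assumed feature dimensions

class AudioAutoencoder(nn.Module):
    """Compresses per-frame audio (rhythm/beat) features into a compact code."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(AUDIO_DIM, LATENT_DIM), nn.ReLU())
        self.decoder = nn.Linear(LATENT_DIM, AUDIO_DIM)

    def forward(self, audio):                  # audio: (batch, frames, AUDIO_DIM)
        code = self.encoder(audio)
        return code, self.decoder(code)        # reconstruction drives the AE loss

class DanceGenerator(nn.Module):
    """Maps the audio code sequence to a smooth pose sequence with a GRU."""
    def __init__(self):
        super().__init__()
        self.rnn = nn.GRU(LATENT_DIM, LATENT_DIM, batch_first=True)
        self.out = nn.Linear(LATENT_DIM, POSE_DIM)

    def forward(self, code):                   # code: (batch, frames, LATENT_DIM)
        hidden, _ = self.rnn(code)
        return self.out(hidden)                # poses: (batch, frames, POSE_DIM)

class MatchDiscriminator(nn.Module):
    """Scores how well a pose sequence matches its audio code sequence."""
    def __init__(self):
        super().__init__()
        self.rnn = nn.GRU(LATENT_DIM + POSE_DIM, LATENT_DIM, batch_first=True)
        self.score = nn.Linear(LATENT_DIM, 1)

    def forward(self, code, poses):
        hidden, _ = self.rnn(torch.cat([code, poses], dim=-1))
        return torch.sigmoid(self.score(hidden[:, -1]))  # per-sequence match score
```

In such a setup, the generator would be trained against the discriminator's adversarial signal while the autoencoder's reconstruction term keeps the audio code representative, mirroring the roles described in the abstract; the specific feature extractors and loss weights are not specified here.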