This study explored (1) whether growth mixture modeling (GMM) could identify different trajectories of learning efficiency during a working memory (WM) training programme for young children diagnosed with Attention Deficit Hyperactivity Disorder (ADHD), compared with a typically developing (TD) control group, and (2) whether learning trajectories and outcomes differed between simple and complex training tasks. Children completed simple visuospatial short-term memory (VSSTM) and complex visuospatial WM (VSWM) tasks for 15 min a day, 5 days a week, for 8 weeks. Parent-reported executive functioning and children's WM, attention control, educational achievement, and IQ were measured prior to training (T1), immediately following training (T2), and 3 months after training (T3). GMM analysis showed that WM training was represented by a single learning curve, with no difference between the trajectories of the ADHD and TD groups. Across groups, the learning trajectory for the VSSTM tasks was represented by one learning curve, whereas the VSWM tasks were represented by three learning curves. Learning for the VSSTM tasks, and for most children in the VSWM tasks, was characterized by an inverted-U shape, indicating that training was effective for up to 15 sessions, after which performance stabilized and then declined, highlighting an optimal training timeframe. For the VSWM tasks, the two remaining groups showed either a U-shaped or a high inverted-U-shaped trajectory, with the latter group achieving the highest T1-T2 change score (i.e., these children showed a lower starting point and the greatest gain in learning and post-training performance). There were no broader benefits of training at post-test or follow-up. Further research should explore who would benefit most from intensive cognitive training, as well as the potential benefits for mental health and well-being.