The potential benefits and mechanistic effects of working memory training (WMT) in children are the subject of much research and debate. We show that, after five weeks of school‐based, adaptive WMT, 6–9‐year‐old primary school children had greater activity in prefrontal and striatal brain regions, higher task accuracy, and reduced intra‐individual variability in response times compared to controls. Using a sequential sampling decision model, we demonstrate that this reduction in intra‐individual variability can be explained by changes in evidence accumulation rates and thresholds. Critically, intra‐individual variability is useful for quantifying the immediate impact of cognitive training interventions: it is a better predictor than task accuracy of academic skills and well‐being 6–12 months after the end of training. Taken together, our results suggest that attention control is the initial mechanism that leads to the long‐run benefits of adaptive WMT. Selective and sustained attention abilities may serve as a scaffold for subsequent changes in higher cognitive processes, academic skills, and general well‐being. Furthermore, these results highlight that the selection of outcome measures and the timing of assessments play a crucial role in detecting training efficacy. Thus, evaluating intra‐individual variability during or directly after training could allow for the early tailoring of training interventions, in terms of duration or content, to maximise their impact.
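
To make the link between evidence accumulation and response‐time variability concrete, the following is a minimal sketch of a basic drift diffusion process, one common member of the sequential sampling family. It is an illustration only and is not the model fitted in the study: the function names (`simulate_trial`, `summarise`) and all parameter values (drift rates, threshold, noise, non‐decision time, step size) are assumptions chosen for demonstration. The sketch compares a lower and a higher evidence accumulation rate at a fixed threshold and reports the mean and standard deviation of simulated response times alongside accuracy.

```python
import math
import random
import statistics


def simulate_trial(drift, threshold, rng, dt=0.002, noise=1.0, non_decision=0.3):
    """Simulate one two-choice trial; return (response_time_seconds, correct).

    All default values are illustrative assumptions, not fitted estimates.
    """
    evidence = 0.0  # unbiased start, midway between the two bounds
    elapsed = 0.0
    step_sd = noise * math.sqrt(dt)
    # Accumulate noisy evidence until the upper or lower bound is reached.
    while abs(evidence) < threshold / 2:
        evidence += drift * dt + rng.gauss(0.0, step_sd)
        elapsed += dt
    # The upper bound is treated as the correct response.
    return elapsed + non_decision, evidence > 0


def summarise(drift, threshold, n_trials=400, seed=0):
    """Simulate many trials and summarise response times and accuracy."""
    rng = random.Random(seed)
    trials = [simulate_trial(drift, threshold, rng) for _ in range(n_trials)]
    rts = [rt for rt, _ in trials]
    return {
        "mean_rt": statistics.mean(rts),
        "sd_rt": statistics.stdev(rts),  # intra-individual variability proxy
        "accuracy": sum(correct for _, correct in trials) / n_trials,
    }


# Hypothetical "lower" and "higher" accumulation rates at the same threshold.
low = summarise(drift=1.0, threshold=2.0)
high = summarise(drift=2.0, threshold=2.0)
print("lower drift: ", low)
print("higher drift:", high)
```

In this toy setting, a higher accumulation rate yields response times that are both faster and less variable, together with higher accuracy. Changing the `threshold` argument instead shows how response caution trades speed and variability against accuracy, which is why rates and thresholds need to be estimated jointly rather than inferred from accuracy or mean response time alone.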