Tools for automatically grading programming assignments, also known as Online Judges, have been widely used to support computer science (CS) courses. Nevertheless, few studies have used these tools to acquire and analyse interaction data to better understand students’ performance and behaviours, often because such data are unavailable or insufficiently granular. To address this problem, we propose an Online Judge called CodeBench, which allows for fine‐grained collection of student interaction data at the level of, for example, keystrokes, number of submissions, and grades. We deployed CodeBench for 3 years (2016–18) and collected data from 2058 students in 16 introductory computer science (CS1) courses, on which we carried out fine‐grained learning analytics aimed at the early detection of effective and ineffective behaviours in learning CS concepts. Our results reveal clear behavioural classes of CS1 students, significantly differentiated both semantically and statistically, enabling us to better explain how student behaviours during programming have influenced learning outcomes. Finally, we identify behaviours that can guide novice students to improve their learning performance and that can be used to design interventions. We believe this work is a step towards enhancing Online Judges and helping teachers and students improve their CS1 teaching and learning practices.