Recent developments in mobile gaze-tracking devices have made it possible to view and interpret what a student sees and to further unravel the associated problem-solving processes. It has also become possible to pinpoint joint attention occurrences that are fundamental for learning. In this study, we examined joint attention in collaborative mathematical problem solving. We studied the thought processes of four 15–16-year-old students in their regular classroom, using mobile gaze tracking, video and audio recordings, and smartpens. The four students worked as a group to find the shortest path connecting the vertices of a square. Combining information on the students' gaze targets with a qualitative interpretation of the context, we identified occurrences of joint attention, of which 49 were joint visual attention and 28 were attention to different representations of the same mathematical idea. We call the latter joint representational attention. We discovered that ‘verifying’ (43%) and ‘watching and listening’ (35%) were the most common problem-solving phases during joint attention. The most frequently occurring problem-solving phases right after joint attention were also ‘verifying’ (47%) and ‘watching and listening’ (34%). Outside of joint attention, we detected phase cycles commonly found in individual problem-solving processes (‘planning and exploring’, ‘implementing’, and ‘verifying’). We also detected phase shifts between ‘verifying’, ‘watching and listening’, and ‘understanding’ a problem, often occurring during joint attention. Therefore, these phases can be seen as a signal of successful interaction and the promotion of collaboration.