Music-reading research has not yet fully grasped the variety and roles of the cognitive mechanisms that underlie the visual processing of music notation; instead, studies have often explored one factor at a time. Based on prior research, we identified three possible cognitive mechanisms involved in visual processing during music reading: symbol comprehension, visual anticipation, and symbol performance demands. We also summarized the eye-movement indicators of each mechanism. We then asked which of the three mechanisms are needed to explain how note symbols are visually processed during temporally controlled rhythm reading. In our eye-tracking study, twenty-nine participants performed simple rhythm-tapping tasks in which the relative complexity of consecutive rhythm symbols was systematically varied. Eye-time span (i.e., “looking ahead”) and first-pass fixation time at target symbols were analyzed with linear mixed-effects modeling. Our empirical data supported symbol comprehension and visual anticipation, whereas the evidence for symbol performance demands was more ambiguous. Future studies could build on these findings by exploring the interplay of these and other possible mechanisms; more generally, we argue that music-reading research should begin to emphasize the systematic creation and testing of cognitive models of eye movements in music reading.