Pupillometry is a promising method for assessing mental workload and could be helpful in the optimization of systems that involve human-computer interaction. The present study focuses on replicating the studies by Ahern (1978) and Klingner (2010), which found that for three levels of difficulty of mental multiplications, the more difficult multiplications yielded larger dilations of the pupil. Using a remote eye tracker, our research expands upon these two previous studies by statistically testing for each 1.5 s interval of the calculation period (1) the mean absolute pupil diameter (MPD), (2) the mean pupil diameter change (MPDC) with respect to the pupil diameter during the pre-stimulus accommodation period, and (3) the mean pupil diameter change rate (MPDCR). An additional novelty of our research is that we compared the pupil diameter measures with a self-report measure of workload, the NASA Task Load Index (NASA-TLX), and with the mean blink rate (MBR). The results showed that the findings of Ahern and Klingner were replicated, and that the MPD and MPDC discriminated just as well between the lowest and highest difficulty levels as did the NASA-TLX. The MBR, on the other hand, did not differentiate between the difficulty levels. Moderate to strong correlations were found between the MPDC and the proportion of incorrect responses, indicating that the MPDC was higher for participants with a poorer performance. For practical applications, validity could be improved by combining pupillometry with other physiological techniques.