Target-selection tasks are frequently conducted to evaluate novel input devices and pointing-facilitation techniques. Recently, Sharif et al. showed that a unified metric of pointing performance, called throughput TP, was not stable across two sessions performed by the same participant group, which indicates poor test-retest reliability. Because TP is not always an appropriate metric, depending on the research topic, we extend their finding to two other metrics: movement time MT and error rate ER. We demonstrate that even participants whose TPs remained stable across two sessions could exhibit unstable MTs and ERs. Thus, if time allows, researchers should design their experiments to run multiple sessions to obtain the central tendency of user performance, which increases the validity of their user studies.
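For reference, TP is commonly computed under the ISO 9241-9 effective-throughput formulation; the notation below (effective amplitude $A_e$ and effective width $W_e$) follows that standard and is not taken from this abstract:

$$
TP = \frac{ID_e}{MT}, \qquad ID_e = \log_2\!\left(\frac{A_e}{W_e} + 1\right), \qquad W_e = 4.133 \times SD_x,
$$

where $A_e$ is the mean movement amplitude actually traveled and $SD_x$ is the standard deviation of the selection endpoints along the task axis.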
CCS CONCEPTS
• Human-centered computing → HCI theory, concepts and models; Pointing.