A growing number of potential quantitative biomarkers could allow early assessment of treatment response or disease progression. However, measurements of such biomarkers are subject to random variability. Hence, a difference between longitudinal measurements of a biomarker does not necessarily represent real change; it might be caused by random measurement variability alone. Before a quantitative biomarker is used in longitudinal studies, it is therefore essential to assess its measurement repeatability. Measurement repeatability obtained from test–retest studies can be quantified by the repeatability coefficient, which is then used in the subsequent longitudinal study to determine whether a measured difference represents real change or lies within the range of expected random measurement variability. The quality of the point estimate of the repeatability coefficient therefore directly governs the assessment quality of the longitudinal study. The accuracy of the repeatability coefficient estimate depends on the number of cases in the test–retest study, but despite this pivotal role, no comprehensive framework for sample size calculation of test–retest studies exists. To address this issue, we have established such a framework, which allows flexible sample size calculation for test–retest studies based on newly introduced criteria concerning assessment quality in the longitudinal study. The framework also permits retrospective assessment of prior test–retest studies.
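
To make the role of the repeatability coefficient concrete, the following minimal sketch estimates it from paired test–retest measurements and applies it to a longitudinal difference. It assumes the commonly used definition RC = 1.96 · √2 · wSD, where the within-subject variance wSD² is estimated as the sum of squared test–retest differences divided by 2n; the text above does not state which estimator is used, so this choice, the function names, and the example values are illustrative assumptions only.

```python
import math


def repeatability_coefficient(test, retest, z=1.96):
    """Estimate the repeatability coefficient from paired test-retest data.

    Assumed definition (not taken from the text above):
    RC = z * sqrt(2) * wSD, with wSD^2 = sum(d_i^2) / (2 n),
    where d_i is the test-retest difference for case i.
    """
    if len(test) != len(retest) or len(test) == 0:
        raise ValueError("test and retest must be non-empty and of equal length")
    n = len(test)
    # Within-subject variance estimated from the paired differences.
    within_var = sum((a - b) ** 2 for a, b in zip(test, retest)) / (2 * n)
    return z * math.sqrt(2 * within_var)


def is_real_change(baseline, follow_up, rc):
    """Flag a longitudinal difference that exceeds the repeatability coefficient."""
    return abs(follow_up - baseline) > rc


# Hypothetical test-retest measurements for four cases (illustrative values only).
test_values = [10.0, 12.0, 9.0, 11.0]
retest_values = [11.0, 11.0, 10.0, 10.0]

rc = repeatability_coefficient(test_values, retest_values)
print(rc)                              # 1.96 for these example values
print(is_real_change(10.0, 12.5, rc))  # True: difference of 2.5 exceeds RC
print(is_real_change(10.0, 11.5, rc))  # False: difference of 1.5 is within RC
```

Because the RC in this sketch is itself a point estimate from only four cases, its value would vary considerably between repeated test–retest studies, which is the reason the number of cases matters for the downstream longitudinal assessment.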