The “Dairy Diary” is a user‐friendly web‐based dairy intake screener whose reliability and validity are unknown. We aimed to evaluate the screener in terms of test–retest reliability and comparative validity. In a diagnostic accuracy study, a purposefully recruited sample of 79 undergraduate dietetics/nutrition students (age: 21.6 ± 3.8 years) from three South African universities completed 3 non‐consecutive days of weighed food records (reference standard) within a seven‐day period (comparative validity), followed by two administrations of the screener (index test), 2 weeks apart (reliability). For the four dairy product serving scores (PSSs) and the summative dairy serving scores (DSSs) of the screener and the food records, we computed t‐tests, correlations, Bland–Altman analyses, kappa statistics, McNemar's tests, and diagnostic accuracy measures. For reliability, mean PSSs and DSSs did not differ significantly (p > .05) between the screener administrations. The mean PSSs were strongly correlated: milk (r = .69; p < .001), maas (fermented milk) (r = .72; p < .001), yoghurt (r = .71; p < .001), cheese (r = .74; p < .001). For DSSs, kappa agreement was moderate (κ = 0.45; p < .001), and McNemar's test gave no evidence of asymmetry among the non‐agreeing responses (p = .334). For validity, the PSSs of the screener and food records were moderately correlated [milk (r = .30; p = .013), yoghurt (r = .38; p < .001), cheese (r = .38; p < .001)], with κ = 0.31 (p = .006) for DSSs. Bland–Altman analyses showed acceptable agreement for DSSs (bias: −0.49; 95% CI: −0.7 to −0.3). Categorized DSSs had high sensitivity (81.4%) and positive predictive value (93.4%), yet low specificity (55.6%) and negative predictive value (27.8%). The area under the receiver operating characteristic curve (0.7) was acceptable. The “Dairy Diary” shows test–retest reliability and moderate comparative validity for screening the dairy intake of nutrition‐literate consumers.
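The four diagnostic accuracy measures reported for the categorized DSSs all derive from a 2 × 2 cross‐classification of the screener (index test) against the food records (reference standard). The Python sketch below shows that arithmetic. The cell counts are an assumption: the abstract does not report them, and the values used here are simply one set of counts that sums to 79 and reproduces the reported percentages. The function name `diagnostic_accuracy` and which category counts as "positive" are likewise hypothetical.

```python
def diagnostic_accuracy(tp, fn, fp, tn):
    """Return sensitivity, specificity, PPV and NPV from 2 x 2 cell counts.

    tp: index test positive, reference positive
    fn: index test negative, reference positive
    fp: index test positive, reference negative
    tn: index test negative, reference negative
    """
    return {
        "sensitivity": tp / (tp + fn),  # share of reference positives detected
        "specificity": tn / (tn + fp),  # share of reference negatives detected
        "ppv": tp / (tp + fp),          # share of test positives that are correct
        "npv": tn / (tn + fn),          # share of test negatives that are correct
    }


# Illustrative counts only (assumed, not taken from the study): they sum to
# 79 participants and are consistent with the percentages in the abstract.
metrics = diagnostic_accuracy(tp=57, fn=13, fp=4, tn=5)

for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

Under these assumed counts, only 9 of the 79 participants fall in the reference‐negative group, so the specificity and negative predictive value would rest on very few observations and should be read with corresponding caution.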