Objective
Assess the impact of false positives (FPs), false negatives (FNs), fixation losses (FLs) and test duration (TD) on visual field (VF) reliability at different stages of glaucoma severity.
Participants
10,262 VFs from 1,538 eyes of 909 subjects with suspect or manifest glaucoma and ≥5 VF examinations.
Design
Retrospective.
Methods
Predicted mean deviation (MD) was calculated with multilevel modeling of longitudinal data (>5 non-initial VFs). The difference between predicted and observed MD (ΔMD) was used as a measure of reliability. The impact of FPs, FNs, FLs and TD on ΔMD was assessed using multilevel modeling.
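The ΔMD calculation can be illustrated with a minimal sketch. It makes two assumptions that are not taken from the study: a per-eye ordinary least-squares line stands in for the multilevel model, and ΔMD is signed as observed minus predicted MD (consistent with FPs, which inflate MD, being reported with positive ΔMD). The function names and the example series are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares slope and intercept for paired values."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def delta_md(years, observed_md):
    """Observed minus predicted MD (dB) for each visit of one eye.

    Assumption: a straight-line fit per eye replaces the multilevel model.
    """
    slope, intercept = fit_line(years, observed_md)
    return [obs - (intercept + slope * t) for t, obs in zip(years, observed_md)]


# Hypothetical eye: decline of 0.5 dB/year, with the visit at year 3 reading too high.
years = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
md = [-2.0, -2.5, -3.0, -1.5, -4.0, -4.5]
residuals = delta_md(years, md)
```

In this toy series the largest positive ΔMD falls on the year-3 visit, which is the kind of deviation the reliability indices are then asked to explain.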
Main outcome measure
ΔMD associated with a 10% increment in FPs, FNs, or FLs, or with a 1-minute increment in TD.
Results
FLs had little impact on ΔMD (<0.2 dB per 10% abnormal catch trials), and no level of FLs produced ≥1 dB of ΔMD at any disease stage. FPs yielded greater-than-expected MD: up to 20% abnormal catch trials, each 10% increment was associated with a ΔMD of 0.42, 0.73, and 0.66 dB in mild (MD>−6 dB), moderate (−6≥MD>−12 dB), and severe (−12≥MD≥−20 dB) disease, respectively; beyond 20% abnormal catch trials, the corresponding values were 1.57, 2.06, and 3.53 dB.
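The FP slopes can be read as a piecewise-linear relationship with a breakpoint at 20% abnormal catch trials. The sketch below assumes, beyond what is stated here, that ΔMD is zero at 0% FPs and that the two segments join continuously at 20%; the function and variable names are hypothetical.

```python
# Reported ΔMD (dB) per 10% increment in FP catch trials: (up to 20%, beyond 20%).
FP_SLOPES = {
    "mild": (0.42, 1.57),
    "moderate": (0.73, 2.06),
    "severe": (0.66, 3.53),
}


def expected_fp_delta_md(fp_percent, severity):
    """Expected ΔMD (dB) for a given FP percentage.

    Assumptions: zero ΔMD at 0% FPs and a continuous join at the 20% breakpoint.
    """
    low_slope, high_slope = FP_SLOPES[severity]
    below = min(fp_percent, 20.0)          # portion charged at the lower slope
    above = max(fp_percent - 20.0, 0.0)    # portion charged at the steeper slope
    return (low_slope * below + high_slope * above) / 10.0


def fp_percent_for_1db(severity):
    """Smallest whole-number FP percentage whose expected ΔMD is at least 1 dB."""
    return next(p for p in range(101) if expected_fp_delta_md(p, severity) >= 1.0)
```

Under these assumptions, `fp_percent_for_1db` gives 22, 14, and 16 for mild, moderate, and severe disease, which agrees with the FP percentages reported for |ΔMD|≥1 dB. The FN slopes are not sketched the same way because only a bound (>−0.14 dB) is given for the range up to 20%.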
FNs generally produced observed MDs below expected MDs. FNs had minimal impact up to 20% abnormal catch trials (ΔMD per 10% increment >−0.14 dB at all levels of severity). Beyond 20% abnormal catch trials, each 10% increment in abnormal FN catch trials was associated with a ΔMD of −1.27, −0.53, and −0.51 dB in mild, moderate, and severe disease, respectively. |ΔMD|≥1 dB occurred with 22% FPs and 26% FNs in mild, 14% FPs and 34% FNs in moderate, and 16% FPs and 51% FNs in severe disease. A 1-minute increment in TD produced a ΔMD between −0.35 and −0.40 dB, depending on disease severity.
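The 1 dB thresholds lend themselves to a simple lookup. The sketch below is illustrative only and is not the classification standard proposed in the study: it flags a test when its FP or FN percentage reaches the level at which |ΔMD|≥1 dB was reported for that severity. The names are hypothetical, and severity is passed in rather than derived from MD.

```python
# FP and FN percentages at which |ΔMD| reached 1 dB, by disease severity (as reported).
ONE_DB_THRESHOLDS = {
    "mild": {"fp": 22, "fn": 26},
    "moderate": {"fp": 14, "fn": 34},
    "severe": {"fp": 16, "fn": 51},
}


def exceeds_1db(severity, fp_percent, fn_percent):
    """Report which indices reach the severity-specific 1 dB threshold."""
    limits = ONE_DB_THRESHOLDS[severity]
    return {
        "fp": fp_percent >= limits["fp"],
        "fn": fn_percent >= limits["fn"],
    }


# Hypothetical test in an eye with moderate disease: 15% FPs, 20% FNs.
flags = exceeds_1db("moderate", 15, 20)
```

Here the FP rate is flagged and the FN rate is not, which shows why a single fixed cut-off across all severities would classify the same test differently.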
Conclusion
FLs have little impact on test reliability in established glaucoma patients. FPs and, to a lesser extent, FNs and TD significantly affect reliability. The impact of FPs and FNs varies with disease severity and over the range of abnormal catch trials. Based on these findings, evidence-based, severity-specific standards for classifying VF reliability are presented for clinical or research applications.