Purpose of review
This article provides an overview of current approaches to algorithm validation, which are variable and largely self-determined, and discusses solutions to address their inadequacies.
Recent findings
In the last decade alone, numerous machine learning applications have been proposed for ophthalmic diagnosis or disease monitoring. Remarkably, fewer than 15 of these have received regulatory approval for implementation in clinical practice. Although there exists a vast pool of structured and relatively clean datasets from which to develop and test algorithms in the computational ‘laboratory’, real-world validation remains key to safe, equitable, and clinically reliable implementation. Bottlenecks in the validation process stem from a striking paucity of regulatory guidance on safety and performance thresholds, a lack of oversight of critical postdeployment monitoring and context-specific recalibration, and the inherent complexities of heterogeneous disease states and clinical environments. Implementation of secure, third-party, unbiased, pre- and postdeployment validation offers the potential to address existing shortfalls in the validation process.
Summary
Given the critical role of validation in the algorithm pipeline, there is an urgent need for developers, machine learning researchers, and end-user clinicians to devise a consensus approach that allows for the rapid introduction of safe, equitable, and clinically valid machine learning implementations.