STUDY QUESTION
What is the present performance of artificial intelligence (AI) decision support during embryo selection compared to the standard embryo selection by embryologists?
SUMMARY ANSWER
AI consistently outperformed the clinical teams in all the studies focused on embryo morphology and clinical outcome prediction during embryo selection assessment.
WHAT IS KNOWN ALREADY
The ART success rate is ∼30%, with a worrying trend towards increasing female age, which correlates with considerably worse results. There have therefore been ongoing efforts to address this low success rate through the development of new technologies. With the advent of AI, machine learning could be applied to bring greater objectivity to areas limited by human subjectivity, such as embryo selection. Given the potential of AI to improve IVF success rates, it is crucial to compare the performance of AI and embryologists during embryo selection.
STUDY DESIGN, SIZE, DURATION
The search was conducted across PubMed, EMBASE, Ovid Medline, and IEEE Xplore, covering 1 June 2005 up to and including 7 January 2022. Included articles were restricted to those written in English. The search terms used across all databases were: (‘Artificial intelligence’ OR ‘Machine Learning’ OR ‘Deep learning’ OR ‘Neural network’) AND (‘IVF’ OR ‘in vitro fertili*’ OR ‘assisted reproductive techn*’ OR ‘embryo’), where the character ‘*’ instructs the search engine to include any completion of the search term.
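As an illustration only, the Boolean search string above can be assembled and applied to a title or abstract as in the following sketch. The helper names (`build_query`, `term_to_regex`, `matches`) are hypothetical, and treating ‘*’ as "any run of word characters" is an assumption; each database implements truncation and phrase matching in its own way, so this does not reproduce the behaviour of any particular search engine.

```python
import re

# The two term groups from the search strategy, joined by AND in the query.
AI_TERMS = ["Artificial intelligence", "Machine Learning",
            "Deep learning", "Neural network"]
ART_TERMS = ["IVF", "in vitro fertili*",
             "assisted reproductive techn*", "embryo"]


def build_query(group_a, group_b):
    """Assemble the (A OR ...) AND (B OR ...) search string."""
    def group(terms):
        return "(" + " OR ".join(f"'{t}'" for t in terms) + ")"
    return f"{group(group_a)} AND {group(group_b)}"


def term_to_regex(term):
    """Turn one search term into a case-insensitive regex.

    Assumption: a trailing '*' means 'any completion of the term',
    modelled here as zero or more word characters.
    """
    if term.endswith("*"):
        body = re.escape(term[:-1]) + r"\w*"
    else:
        body = re.escape(term)
    return re.compile(r"\b" + body + r"\b", re.IGNORECASE)


def matches(text, group_a, group_b):
    """True if the text contains at least one term from each group."""
    def hit(terms):
        return any(term_to_regex(t).search(text) for t in terms)
    return hit(group_a) and hit(group_b)


if __name__ == "__main__":
    print(build_query(AI_TERMS, ART_TERMS))
    print(matches("Deep learning for embryo selection", AI_TERMS, ART_TERMS))
```

The second call prints `True` because the example title contains one term from each group; a title that mentions only machine learning terms would not match.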
PARTICIPANTS/MATERIALS, SETTING, METHODS
A literature search was conducted for studies relating to AI applications in IVF. Primary outcomes of interest were the accuracy, sensitivity, and specificity of embryo morphology grade assessments and of predictions of the likelihood of clinical outcomes, such as clinical pregnancy after IVF treatment. Risk of bias was assessed using the modified Downs and Black checklist.
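For readers less familiar with the three primary outcome measures, the sketch below shows how accuracy, sensitivity, and specificity are conventionally derived from the counts of a binary confusion matrix (for example, predicted versus observed clinical pregnancy). The class name, function names, and the counts in the usage example are hypothetical and are not taken from any included study.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    """Counts for a binary prediction task (hypothetical container)."""
    tp: int  # predicted positive, truly positive
    fp: int  # predicted positive, truly negative
    tn: int  # predicted negative, truly negative
    fn: int  # predicted negative, truly positive


def accuracy(c: ConfusionCounts) -> float:
    """Proportion of all predictions that were correct."""
    total = c.tp + c.fp + c.tn + c.fn
    if total == 0:
        raise ValueError("no predictions to evaluate")
    return (c.tp + c.tn) / total


def sensitivity(c: ConfusionCounts) -> float:
    """Proportion of true positives that were detected (recall)."""
    positives = c.tp + c.fn
    if positives == 0:
        raise ValueError("no positive cases in the ground truth")
    return c.tp / positives


def specificity(c: ConfusionCounts) -> float:
    """Proportion of true negatives that were correctly rejected."""
    negatives = c.tn + c.fp
    if negatives == 0:
        raise ValueError("no negative cases in the ground truth")
    return c.tn / negatives


if __name__ == "__main__":
    # Illustrative counts only, not data from the review.
    counts = ConfusionCounts(tp=40, fp=10, tn=30, fn=20)
    print(f"accuracy    = {accuracy(counts):.3f}")
    print(f"sensitivity = {sensitivity(counts):.3f}")
    print(f"specificity = {specificity(counts):.3f}")
```

With these illustrative counts, accuracy is 70/100, sensitivity is 40/60, and specificity is 30/40, which shows that a single accuracy figure can mask quite different error rates in the positive and negative classes.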
MAIN RESULTS AND THE ROLE OF CHANCE
Twenty articles were included in this review. There was no specific embryo assessment day across the studies: embryos from Day 1 until Day 5/6 of development were investigated. The types of input used for training the AI algorithms were images and time-lapse (10/20), clinical information (6/20), and both images and clinical information (4/20). Each AI model demonstrated promise when compared to an embryologist’s visual assessment. On average, the models predicted the likelihood of successful clinical pregnancy with greater accuracy than clinical embryologists, signifying greater reliability when compared to human prediction. The AI models performed at a median accuracy of 75.5% (range 59–94%) in predicting embryo morphology grade. The correct prediction (ground truth) was defined from embryo images as assessed by embryologists following the respective local guidelines. Using blind test datasets, the embryologists’ prediction accuracy was 65.4% (range 47–75%) against the same ground truth provided by the original respective local assessment. Similarly, AI models had a median accuracy of 77.8% (range 68–90%) in predicting clinical pregnancy from patient clinical treatment information, compared to 64% (range 58–76%) for embryologists. When images/time-lapse and clinical information inputs were combined, the median accuracy of the AI models was higher at 81.5% (range 67–98%), while clinical embryologists had a median accuracy of 51% (range 43–59%).
LIMITATIONS, REASONS FOR CAUTION
The findings of this review are based on studies that have not been prospectively evaluated in a clinical setting. Additionally, a fair comparison of all the studies was deemed unfeasible owing to heterogeneity across the studies in the development of the AI models, the databases employed, and the study design and quality.
WIDER IMPLICATIONS OF THE FINDINGS
AI holds considerable promise for the IVF field and embryo selection. However, developers need to shift the clinical outcome they target from successful implantation towards ongoing pregnancy or live birth. Additionally, existing models rely on locally generated databases, and many lack external validation.
STUDY FUNDING/COMPETING INTERESTS
This study was funded by Monash Data Future Institute. The authors have no conflicts of interest to declare.
REGISTRATION NUMBER
CRD42021256333