Study question
Can artificial intelligence (AI) algorithms reach expert-level accuracy in blastocyst morphology assessment according to the Gardner criteria?

Summary answer
The best-performing AI algorithm (Swin) outperformed the mean human accuracy, measured against an embryologist majority vote, for all Gardner morphological criteria.

What is known already
Routinely, morphological grading of blastocysts is performed visually according to the Gardner criteria, which propose expansion (EXP), quality of the inner cell mass (ICM), and quality of the trophectoderm (TE) as key parameters for predicting treatment outcome. Being a visual assessment, blastocyst scoring is prone to inter- and intra-observer variability, which may lead to inconsistencies in selecting blastocysts for transfer. As recently suggested, AI-based algorithms may help to improve the predictability of treatment outcome. In those studies, parameters such as blastocyst quality or stage were annotated by experts on static or time-lapse-derived blastocyst images to train AI algorithms (e.g. Xception or YOLO) and to compare them with human annotators.

Study design, size, duration
This retrospective study involves 2,270 images from 837 patients, collected over a period of four years in a university IVF clinic.

Participants/materials, setting, methods
All images were annotated by one senior embryologist and divided into a training set and a balanced test set. Subsequently, eight embryologists labeled 300 test-set images such that every single image was seen by at least four embryologists. Annotators diverging from the ensemble vote by more than one standard deviation were excluded (n = 2) to set the ground-truth labels (see the code sketch after the Limitations paragraph). Finally, three AI architectures (Xception, Swin, DeiT) were trained and evaluated against this ground truth.

Main results and the role of chance
Out of nine annotators, the labelling accuracy of two embryologists diverged from the consensus vote by more than one standard deviation for at least one of the three Gardner criteria. The consensus vote was built from the remaining seven annotators (mean accuracy: EXP 0.81, ICM 0.70, TE 0.67). The Swin architecture outperformed the mean expert accuracy for all three criteria (EXP 0.82, ICM 0.76, TE 0.68), while the DeiT and Xception architectures outperformed the mean expert accuracy only for ICM (DeiT 0.72, Xception 0.73) and performed equal or worse for EXP and TE (DeiT EXP 0.77, TE 0.73; Xception EXP 0.77, TE 0.66). Compared with a recent study conducted on time-lapse imaging data using AI algorithms, all our models outperform its ICM accuracy and achieve comparable TE accuracy. To minimize the role of chance in calculating the models' prediction accuracies, the SWA-Gaussian (SWAG) algorithm was used. SWAG is a method to represent and calibrate uncertainty in Bayesian deep learning: it models a Gaussian distribution for each network weight and applies it as a posterior over all network weights to perform Bayesian model averaging (a sketch is given after the abstract).

Limitations, reasons for caution
To reflect a real IVF-lab scenario, embryologists of different origins and levels of experience were involved, and no scoring training was offered to the participants. These factors could have negatively affected the degree of consensus, although we excluded the two annotators diverging from the mean labeling accuracy.
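To make the ground-truth construction described above concrete, here is a minimal sketch of the majority vote and the one-standard-deviation exclusion rule. It assumes a flat annotation table with columns image_id, annotator, EXP, ICM, and TE; the file name and column names are illustrative, and the per-criterion exclusion shown here simplifies the abstract's "at least one criterion" rule.

```python
import pandas as pd

# Hypothetical annotation table: one row per (image, annotator) pair with
# categorical Gardner labels. File and column names are assumptions.
annotations = pd.read_csv("gardner_annotations.csv")  # image_id, annotator, EXP, ICM, TE

def consensus_with_exclusion(df, criterion, n_std=1.0):
    """Majority vote per image, then drop annotators whose agreement with
    that vote deviates from the mean accuracy by more than n_std standard
    deviations, and rebuild the vote from the remaining annotators."""
    majority = df.groupby("image_id")[criterion].agg(lambda s: s.mode().iloc[0])
    merged = df.join(majority.rename("vote"), on="image_id")
    acc = (merged[criterion] == merged["vote"]).groupby(merged["annotator"]).mean()
    kept = acc[(acc - acc.mean()).abs() <= n_std * acc.std()].index
    ground_truth = (df[df["annotator"].isin(kept)]
                    .groupby("image_id")[criterion]
                    .agg(lambda s: s.mode().iloc[0]))
    return ground_truth, acc

for criterion in ["EXP", "ICM", "TE"]:
    gt, annotator_acc = consensus_with_exclusion(annotations, criterion)
    print(criterion, "mean annotator accuracy vs. vote:", round(annotator_acc.mean(), 2))
```

Note that taking the first mode breaks ties arbitrarily; in practice a tie-breaking policy (e.g., deferring to the senior embryologist) would be needed.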
Wider implications of the findings
In the past, AI algorithms have proved to reliably differentiate between good- and bad-prognosis blastocysts, but not necessarily between blastocysts of similar quality. Further AI-supported differentiation on the basis of expansion and cell lineages will facilitate the ranking of blastocysts and bring automated scoring closer to clinical application.

Trial registration number
Not applicable.
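For readers unfamiliar with SWA-Gaussian (Maddox et al., NeurIPS 2019), the idea is to fit a Gaussian over the network weights from the trajectory of SGD iterates and to average predictions over weight samples drawn from that Gaussian. Below is a minimal diagonal-covariance sketch in PyTorch; the full method additionally keeps a low-rank covariance term and re-estimates batch-norm statistics for each sample, and none of the names below are the authors' actual code.

```python
import torch

def collect_swag_moments(model, loader, optimizer, loss_fn, snapshots=10):
    """Continue SGD after convergence and keep running first/second moments
    of the weights, one snapshot per epoch (diagonal SWAG only)."""
    mean = [torch.zeros_like(p) for p in model.parameters()]
    sq_mean = [torch.zeros_like(p) for p in model.parameters()]
    for n in range(1, snapshots + 1):
        for x, y in loader:                      # one additional SGD epoch
            optimizer.zero_grad()
            loss_fn(model(x), y).backward()
            optimizer.step()
        for m, s, p in zip(mean, sq_mean, model.parameters()):
            m.mul_((n - 1) / n).add_(p.detach() / n)       # running mean
            s.mul_((n - 1) / n).add_(p.detach() ** 2 / n)  # running 2nd moment
    return mean, sq_mean

@torch.no_grad()
def swag_predict(model, mean, sq_mean, x, num_samples=30):
    """Bayesian model averaging: sample weights from N(mean, diag(var))
    and average the softmax outputs over the samples."""
    probs = 0.0
    for _ in range(num_samples):
        for p, m, s in zip(model.parameters(), mean, sq_mean):
            var = (s - m ** 2).clamp(min=1e-30)
            p.data = m + var.sqrt() * torch.randn_like(m)  # one weight draw
        probs = probs + torch.softmax(model(x), dim=-1) / num_samples
    return probs
```

Averaging over sampled weight sets rather than relying on a single point estimate is what makes the reported accuracies less dependent on one lucky or unlucky training run.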
In the past few years, object detection has attracted a lot of attention in the context of human–robot collaboration and Industry 5.0, owing to substantial quality improvements in deep learning technologies. In many applications, object detection models must be able to adapt quickly to a changing environment, i.e., to learn new objects. A crucial but challenging prerequisite for this is the automatic generation of new training data, which currently still limits the broad application of object detection methods in industrial manufacturing. In this work, we discuss how to adapt state-of-the-art object detection methods for the task of automatic bounding box annotation in a use case where the background is homogeneous and the object’s label is provided by a human. We compare an adapted version of Faster R-CNN and the Scaled-YOLOv4-p5 architecture and show that both can be trained to distinguish unknown objects from a complex but homogeneous background using only a small amount of training data. In contrast to most other state-of-the-art methods for bounding box labeling, our proposed method requires neither human verification, nor a predefined set of classes, nor a very large manually annotated dataset. Our method outperforms the state-of-the-art transformer-based object discovery method LOST on our simple fruits dataset by large margins.
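As a rough illustration of the class-agnostic annotation idea described in this abstract, the sketch below uses torchvision's Faster R-CNN with a single generic "object" class as a stand-in for the adapted architectures: the detector only localizes unknown objects against the background, and the class label supplied by the human is attached to each confident box. All identifiers here are assumptions for illustration, not the paper's implementation.

```python
import torch
import torchvision

# Class-agnostic detector: background vs. a single generic "object" class.
# In practice this model would first be fine-tuned on the small annotated set.
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(
    weights=None, num_classes=2)

def annotate(model, image, human_label, score_thr=0.8):
    """Run the class-agnostic detector on one image tensor (C, H, W) and
    attach the human-provided class label to every confident box, producing
    a new training annotation without manual box drawing."""
    model.eval()
    with torch.no_grad():
        pred = model([image])[0]                 # dict with boxes/labels/scores
    keep = pred["scores"] > score_thr
    return [{"bbox": box.tolist(), "label": human_label}
            for box in pred["boxes"][keep]]
```

Under this scheme, a new object would be taught by photographing it on the homogeneous background, calling annotate with the human-given class name, and adding the resulting boxes to the detector's training set.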