Importance: Best-corrected visual acuity (BCVA) is a measure used to manage diabetic macular edema (DME), sometimes suggesting development of DME or consideration of initiating, repeating, withholding, or resuming treatment with anti–vascular endothelial growth factor. Using artificial intelligence (AI) to estimate BCVA from fundus images could help clinicians manage DME by reducing the personnel needed for refraction, the time presently required for assessing BCVA, or even the number of office visits if imaged remotely.
Objective: To evaluate the potential application of AI techniques for estimating BCVA from fundus photographs with and without ancillary information.
Design, Setting, and Participants: Deidentified color fundus images taken after dilation were used post hoc to train AI systems to perform regression from image to BCVA and to evaluate resultant estimation errors. Participants were patients enrolled in the VISTA randomized clinical trial through 148 weeks wherein the study eye was treated with aflibercept or laser. The data from study participants included macular images, clinical information, and BCVA scores by trained examiners following protocol refraction and VA measurement on Early Treatment Diabetic Retinopathy Study (ETDRS) charts.
Main Outcomes: Primary outcome was regression evaluated by mean absolute error (MAE); the secondary outcome included percentage of predictions within 10 letters, computed over the entire cohort as well as over subsets categorized by baseline BCVA, determined from baseline through the 148-week visit.
Results: Analysis included 7185 macular color fundus images of the study and fellow eyes from 459 participants. Overall, the mean (SD) age was 62.2 (9.8) years, and 250 (54.5%) were male. The baseline BCVA score for the study eyes ranged from 73 to 24 letters (approximate Snellen equivalent 20/40 to 20/320). Using ResNet50 architecture, the MAE for the testing set (n = 641 images) was 9.66 (95% CI, 9.05-10.28); 33% of the values (95% CI, 30%-37%) were within 0 to 5 letters and 28% (95% CI, 25%-32%) within 6 to 10 letters. For BCVA of 100 letters or less but more than 80 letters (20/10 to 20/25, n = 161) and 80 letters or less but more than 55 letters (20/32 to 20/80, n = 309), the MAE was 8.84 letters (95% CI, 7.88-9.81) and 7.91 letters (95% CI, 7.28-8.53), respectively.
Conclusions and Relevance: This investigation suggests AI can estimate BCVA directly from fundus photographs in patients with DME, without refraction or subjective visual acuity measurements, often within 1 to 2 lines on an ETDRS chart, supporting this AI concept if additional improvements in estimates can be achieved.
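The abstract reports a ResNet50 model trained to regress a BCVA letter score from a fundus image and evaluated by MAE, but does not give implementation details. The sketch below is a minimal illustration of that setup under the assumption of a PyTorch/torchvision implementation; the data pipeline for the VISTA images is omitted and all names here are illustrative, not the authors' code.

```python
# Minimal sketch (assumption: PyTorch/torchvision) of ResNet50 regression
# from a fundus image to a BCVA letter score, evaluated by mean absolute error.
import torch
import torch.nn as nn
from torchvision import models

class BCVARegressor(nn.Module):
    def __init__(self):
        super().__init__()
        self.backbone = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2)
        # Replace the 1000-class classification head with a single regression output (letters).
        self.backbone.fc = nn.Linear(self.backbone.fc.in_features, 1)

    def forward(self, x):
        return self.backbone(x).squeeze(-1)

@torch.no_grad()
def evaluate_mae(model, loader, device="cpu"):
    """Mean absolute error in ETDRS letters over a test DataLoader of (image, bcva) batches."""
    model.eval().to(device)
    total_err, n = 0.0, 0
    for images, bcva in loader:  # bcva: letter scores, e.g. 24-100
        preds = model(images.to(device))
        total_err += (preds - bcva.to(device)).abs().sum().item()
        n += bcva.numel()
    return total_err / n

# Training would minimize an L1 (or Huber) loss so the objective matches the
# reported MAE metric; loading and preprocessing of the fundus images is omitted.
```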
Effective communication and control of a team of humans and robots is critical for a number of DoD operations and scenarios. In an ideal case, humans would communicate with the robot teammates using nonverbal cues (i.e., gestures) that work reliably in a variety of austere environments and from different vantage points. A major challenge is that traditional gesture recognition algorithms using deep learning methods require large amounts of data to achieve robust performance across a variety of conditions. Our approach focuses on reducing the need for “hard-to-acquire” real data by using synthetically generated gestures in combination with synthetic-to-real domain adaptation techniques. We also apply the algorithms to improve the robustness and accuracy of gesture recognition under shifts in viewpoint (i.e., air to ground). Our approach leverages the soon-to-be-released dataset called Robot Control Gestures (RoCoG-v2), consisting of corresponding real and synthetic videos from ground and aerial viewpoints. We first demonstrate real-time performance of the algorithm running on low-SWaP edge hardware. Next, we demonstrate the ability to accurately classify gestures from different viewpoints with varying backgrounds representative of DoD environments. Finally, we show the ability to use the inferred gestures to control a team of Boston Dynamics Spot robots. This is accomplished using inferred gestures to control the formation of the robot team as well as to coordinate the robots’ behavior. Our expectation is that the domain adaptation techniques will significantly reduce the need for real-world data and improve gesture recognition robustness and accuracy using synthetic data.
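The abstract does not name the specific synthetic-to-real domain adaptation algorithm used. As an illustration only, the sketch below shows one common adversarial approach (DANN-style alignment with a gradient reversal layer), written in PyTorch as an assumption, combining labeled synthetic gesture clips with unlabeled real clips; the paper's actual method may differ.

```python
# Illustrative sketch (assumption: PyTorch) of DANN-style synthetic-to-real
# adaptation; this is one standard technique, not necessarily the one used here.
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    """Identity on the forward pass; gradient negated and scaled on the backward pass."""
    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lambd * grad_output, None

class GestureDANN(nn.Module):
    def __init__(self, feature_extractor, feat_dim=512, num_gestures=7):
        super().__init__()
        self.features = feature_extractor          # e.g. a video CNN backbone
        self.gesture_head = nn.Linear(feat_dim, num_gestures)
        self.domain_head = nn.Linear(feat_dim, 2)  # synthetic vs. real

    def forward(self, clips, lambd=1.0):
        f = self.features(clips)
        return self.gesture_head(f), self.domain_head(GradReverse.apply(f, lambd))

# Per batch: cross-entropy on gesture labels for synthetic clips (labels known)
# plus cross-entropy on the domain head for both synthetic and real clips, so
# the backbone is pushed toward features the domain classifier cannot separate.
```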
Human action recognition is a challenging problem, particularly when there is high variability in factors such as subject appearance, backgrounds and viewpoint. While deep neural networks (DNNs) have been shown to perform well on action recognition tasks, they typically require large amounts of high-quality labeled data to achieve robust performance across a variety of conditions. Synthetic data has shown promise as a way to avoid the substantial costs and potential ethical concerns associated with collecting and labeling enormous amounts of data in the real world. However, synthetic data may differ from real data in important ways. This phenomenon, known as domain shift, can limit the utility of synthetic data in robotics applications. To mitigate the effects of domain shift, substantial effort is being dedicated to the development of domain adaptation (DA) techniques. Yet, much remains to be understood about how best to develop these techniques. In this paper, we introduce a new dataset called Robot Control Gestures (RoCoG-v2). The dataset is composed of both real and synthetic videos from seven gesture classes, and is intended to support the study of synthetic-to-real domain shift for video-based action recognition. Our work expands upon existing datasets by focusing the action classes on gestures for human-robot teaming, as well as by enabling investigation of domain shift in both ground and aerial views. We present baseline results using state-of-the-art action recognition and domain adaptation algorithms and offer initial insight on tackling the synthetic-to-real and ground-to-air domain shifts. Instructions on accessing the dataset can be found at https://github.com/reddyav1/RoCoG-v2.
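The abstract describes a benchmark setting rather than a specific model, so the following is only a sketch, under the assumption of PyTorch/torchvision, of the evaluation protocol implied above: train a video action-recognition model on the synthetic split and measure accuracy on the real splits. Dataset loading follows the repository's own instructions and is not reproduced here.

```python
# Sketch of the synthetic-to-real protocol discussed above (assumption:
# PyTorch/torchvision). Dataset loading is deliberately omitted; only the
# model construction and evaluation metric are shown.
import torch
import torch.nn as nn
from torchvision.models.video import r3d_18

NUM_GESTURES = 7  # RoCoG-v2 contains seven gesture classes

def build_model() -> nn.Module:
    model = r3d_18(weights=None)                       # 3D-CNN video backbone
    model.fc = nn.Linear(model.fc.in_features, NUM_GESTURES)
    return model

@torch.no_grad()
def accuracy(model: nn.Module, loader, device: str = "cpu") -> float:
    """Top-1 gesture accuracy over a DataLoader of (clip, label) batches."""
    model.eval().to(device)
    correct = total = 0
    for clips, labels in loader:                       # clips: (B, C, T, H, W)
        preds = model(clips.to(device)).argmax(dim=1).cpu()
        correct += (preds == labels).sum().item()
        total += labels.numel()
    return correct / total

# Protocol: fit build_model() on the synthetic split, then report accuracy()
# separately on real ground-view and real aerial-view splits to quantify the
# synthetic-to-real and ground-to-air domain gaps.
```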