Copy number variants (CNVs) are a major cause of several genetic disorders, making their detection an essential component of genetic analysis pipelines. Current methods for detecting CNVs from exome-sequencing data are limited by high false-positive rates and low concordance because of inherent biases of individual algorithms. To overcome these issues, calls generated by two or more algorithms are often intersected using Venn diagram approaches to identify “high-confidence” CNVs. However, this approach is inadequate, because it misses potentially true calls that do not have consensus from multiple callers. Here, we present CN-Learn, a machine-learning framework that integrates calls from multiple CNV detection algorithms and learns to accurately identify true CNVs using caller-specific and genomic features from a small subset of validated CNVs. Using CNVs predicted by four exome-based CNV callers (CANOES, CODEX, XHMM, and CLAMMS) from 503 samples, we demonstrate that CN-Learn identifies true CNVs at higher precision (∼90%) and recall (∼85%) rates while maintaining robust performance even when trained with minimal data (∼30 samples). CN-Learn recovers twice as many CNVs compared to individual callers or Venn diagram–based approaches, with features such as exome capture probe count, caller concordance, and GC content providing the most discriminatory power. In fact, ∼58% of all true CNVs recovered by CN-Learn were either singletons or calls that lacked support from at least one caller. Our study underscores the limitations of current approaches for CNV identification and provides an effective method that yields high-quality CNVs for application in clinical diagnostics.
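The contrast between Venn-style intersection and keeping every call with its caller concordance as a feature can be sketched as follows. This is a minimal illustration, not CN-Learn's implementation: the `Call` record, the 50% reciprocal-overlap threshold, and the toy coordinates are all assumptions made for the example, since the abstract does not say how concordance is computed.

```python
from typing import NamedTuple


class Call(NamedTuple):
    """Hypothetical record for one CNV call (field names are assumptions)."""
    caller: str
    chrom: str
    start: int
    end: int
    cnv_type: str  # "DEL" or "DUP"


def reciprocal_overlap(a: Call, b: Call) -> float:
    """Smaller of the two overlap fractions; 0.0 if the calls cannot match."""
    if a.chrom != b.chrom or a.cnv_type != b.cnv_type:
        return 0.0
    shared = min(a.end, b.end) - max(a.start, b.start)
    if shared <= 0:
        return 0.0
    return min(shared / (a.end - a.start), shared / (b.end - b.start))


def concordance(call: Call, all_calls: list, min_overlap: float = 0.5) -> int:
    """Number of distinct callers supporting a call, counting its own caller."""
    supporters = {call.caller}
    for other in all_calls:
        if other.caller != call.caller and reciprocal_overlap(call, other) >= min_overlap:
            supporters.add(other.caller)
    return len(supporters)


# Toy calls: three callers agree on one deletion, one caller reports a lone duplication.
calls = [
    Call("CANOES", "chr1", 1000, 5000, "DEL"),
    Call("XHMM", "chr1", 1200, 5200, "DEL"),
    Call("CODEX", "chr1", 900, 4800, "DEL"),
    Call("CLAMMS", "chr2", 300, 900, "DUP"),
]

support = [concordance(c, calls) for c in calls]

# Venn-style filter: keep only calls backed by at least two callers.
venn_kept = [c for c, s in zip(calls, support) if s >= 2]

# Feature-style alternative: keep every call and pass its concordance on
# to a classifier as one feature among several.
features = [{"call": c, "concordance": s} for c, s in zip(calls, support)]
```

In this toy input the Venn-style filter discards the singleton duplication outright, whereas the feature-style list retains it so that a trained classifier could still accept it on the strength of other features such as probe count or GC content.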
Purpose: Retinal toxicity resulting from hydroxychloroquine use manifests as photoreceptor loss and disruption of the ellipsoid zone (EZ) reflectivity band, detectable on spectral-domain (SD) OCT imaging. This study investigated whether an automatic deep learning-based algorithm can detect and quantitate EZ loss on SD OCT images with an accuracy comparable with that of human annotations.
Design: Retrospective analysis of data acquired in a prospective, single-center, case-control study.
Participants: Eighty-five patients (168 eyes) who were long-term hydroxychloroquine users (average exposure time, 14 ± 7.2 years).
Methods: A mask region-based convolutional neural network (M-RCNN) was implemented and trained on individual OCT B-scans. Scan-by-scan detections were aggregated to produce an en face map of EZ loss per 3-dimensional SD OCT volume image. To improve the accuracy and robustness of the EZ loss map, a dual network architecture was proposed that learns to detect EZ loss in parallel using horizontal (horizontal mask region-based convolutional neural network [M-RCNN_H]) and vertical (vertical mask region-based convolutional neural network [M-RCNN_V]) B-scans independently. To quantify accuracy, 10-fold cross-validation was performed.
Main Outcome Measures: Precision, recall, intersection over union (IOU), F1-score metrics, and measured total EZ loss area were compared against human grader annotations and with the determination of toxicity based on the recommended screening guidelines.
Results: The combined projection network demonstrated the best overall performance: precision, 0.90 ± 0.09; recall, 0.88 ± 0.08; and F1 score, 0.89 ± 0.07. The combined model outperformed the M-RCNN_H-only model (precision, 0.79 ± 0.17; recall, 0.96 ± 0.04; IOU, 0.78 ± 0.15; and F1 score, 0.86 ± 0.12) and the M-RCNN_V-only model (precision, 0.71 ± 0.21; recall, 0.94 ± 0.06; IOU, 0.69 ± 0.21; and F1 score, 0.79 ± 0.16). The accuracy was comparable with the variability of human experts: precision, 0.85 ± 0.09; recall, 0.98 ± 0.01; IOU, 0.82 ± 0.12; and F1 score, 0.91 ± 0.06. Automatically generated en face EZ loss maps provide quantitative SD OCT metrics for accurate toxicity determination when combined with other functional testing.
Conclusions: The algorithm can provide a fast, objective, automatic method for measuring areas with EZ loss and can serve as a quantitative assistance tool to screen patients for the presence and extent of toxicity.
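The outcome measures named in this abstract (precision, recall, IOU, and F1 score) can be computed from a predicted and a reference en face mask as sketched below. This is an illustrative calculation under stated assumptions, not the study's evaluation code: the function name `mask_metrics`, the nested-list mask representation, and the toy masks are all hypothetical.

```python
def mask_metrics(pred, truth):
    """Pixel-wise precision, recall, IoU, and F1 for two binary masks.

    `pred` and `truth` are equally sized nested lists of 0/1 values
    (an assumed representation chosen to keep the sketch dependency-free).
    """
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1          # predicted loss, annotated loss
            elif p and not t:
                fp += 1          # predicted loss, no annotated loss
            elif t and not p:
                fn += 1          # missed annotated loss

    # Guard each ratio against an empty denominator.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    iou = tp / (tp + fp + fn) if (tp + fp + fn) else 0.0
    f1 = 2 * tp / (2 * tp + fp + fn) if (tp + fp + fn) else 0.0
    return {"precision": precision, "recall": recall, "iou": iou, "f1": f1}


# Toy 2x3 masks: two pixels agree, one is a false positive, one is missed.
predicted = [[1, 1, 0],
             [0, 1, 0]]
annotated = [[1, 0, 0],
             [0, 1, 1]]

metrics = mask_metrics(predicted, annotated)
```

Because IOU divides by the union of both masks while F1 weights the overlap twice, IOU is never larger than F1 for the same pair of masks, which matches the ordering of the values reported above.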