2022
DOI: 10.1101/2022.11.17.516880
Preprint

ECOLE: Learning to call copy number variants on whole exome sequencing data

Abstract: Copy number variants (CNVs) have been shown to contribute to the etiology of several genetic disorders. Accurate detection of CNVs in whole exome sequencing (WES) data has been a long-sought goal for clinical use. This has not been possible despite recent improvements in performance, because algorithms mostly suffer from low precision and even lower recall on expert-curated gold standard call sets. Here, we present a deep learning-based somatic and germline CNV caller for WES data, named ECOLE. Based on a variant of…

Cited by 1 publication
(1 citation statement)
References 39 publications
“…At the same time, enhancing the pipeline’s ability to detect CNVs (recall 60.53%, precision 72.72%, F1 score 66.07%), particularly by improving recall, is a pivotal area for future development. Strategies to achieve this may include refining existing detection algorithms, integrating additional CNV-specific quality metrics, and leveraging advanced computational techniques to better interpret complex genomic regions (44). Of note, CNV calling tends to be more challenging for two reasons: (i) these variants are more difficult to detect accurately using short-read sequencing data, which makes structural variant calling more error-prone than small variant calling, and (ii) the precise breakpoints of CNVs are not always well defined, which makes comparison between call sets more complex (45).…”
Section: Discussion (mentioning)
confidence: 99%
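
The F1 score quoted in the citation statement above can be checked against the quoted precision and recall, since F1 is their harmonic mean. The following is a minimal sketch of that arithmetic; the function name `f1_score` is a hypothetical helper introduced here for illustration and is not taken from the cited pipeline or from ECOLE.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, both given as fractions in [0, 1]."""
    if precision + recall == 0:
        # Avoid division by zero when no true positives were found.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Figures quoted in the citation statement, converted from percentages.
precision = 72.72 / 100
recall = 60.53 / 100

f1 = f1_score(precision, recall)
print(f"F1 = {f1 * 100:.2f}%")  # prints "F1 = 66.07%"
```

The computed value agrees with the reported 66.07%. Because the harmonic mean is pulled toward the smaller of its two inputs, raising recall (the lower figure here) moves F1 more than an equal gain in precision would.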