2022
DOI: 10.1093/nargab/lqac025

Machine-learning of complex evolutionary signals improves classification of SNVs

Abstract: Conservation is a strong predictor for the pathogenicity of single-nucleotide variants (SNVs). However, some positions that present complex conservation patterns across vertebrates stray from this paradigm. Here, we analyzed the association between complex conservation patterns and the pathogenicity of SNVs in the 115 disease-genes that had sufficient variant data. We show that conservation is not a one-rule-fits-all solution since its accuracy highly depends on the analyzed set of species and genes. For examp…

Cited by 4 publications (5 citation statements)
References 75 publications
“…Conservation patterns are strong predictors of pathogenicity for single nucleotide variants. EvoDiagnostics 3 detects these patterns using a machine-learning algorithm and accurately classifies CD36 single nucleotide variants (area under the receiver operating characteristic curve=0.93). Gly217Arg and Asp184Asn were found to have conservation patterns similar to known pathogenic variants.…”
mentioning
confidence: 99%
“…Sequence-based techniques [8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18] are used to construct the most common tools for predicting the pathogenicity of genetic variations. For instance, REVEL [10] employs a random forest method based on ensemble methods with 13 pathogenicity predictors.…”
Section: Computational Prediction Of Pathogenic Variants Of Cancer Su...
mentioning
confidence: 99%
“…CADD [12] is another ensemble method that integrates several scoring algorithms using a linear kernel support vector machine. VEST4 [8], EvoDiagnostics [18], and MetaSVM [9] are well-known random forest (RF) and support vector machine (SVM) prediction tools. Random forests and SVM may handle linear and non-linear data.…”
Section: Computational Prediction Of Pathogenic Variants Of Cancer Su...
mentioning
confidence: 99%
“…Exploiting evolutionary data to detect functional regions in proteins and in nucleic acids is very commonly used (Capra et al 2009; del Sol Mesa et al 2003; Gallet et al 2000; Landgraf et al 2001; Lichtarge et al 1996a; Lichtarge et al 1996b; Lichtarge et al 1997; Valdar 2002). Evolutionary rates are often used in genomics analyses to predict the pathogenicity of single‐nucleotide variants identified in patient samples (Labes et al 2022 and references therein). They can also be used in protein engineering efforts (Pavelka et al 2009).…”
Section: Introduction
mentioning
confidence: 99%