2017
DOI: 10.1371/journal.pone.0179314

Accurate prediction of functional effects for variants by combining gradient tree boosting with optimal neighborhood properties

Abstract: Single amino acid variations (SAVs) potentially alter biological functions, including causing diseases or natural differences between individuals. Identifying the relationship between a SAV and a certain disease provides the starting point for understanding the underlying mechanisms of specific associations, and can help further the prevention and diagnosis of inherited disease. We propose PredSAV, a computational method that can effectively predict how likely SAVs are to be associated with disease by incorporating g…


Citations: cited by 45 publications (23 citation statements)
References: 84 publications
“…We first select three features from the Top-50 features as the initial feature combinations, which is similar to the process in HEP [31]. Then we add correlation features by using sequential forward selection (SFS) method [38]. In the SFS method, features from the Top-500 features are sequentially added to the initial feature combinations until the ranking criterion R_c no longer increased.…”
Section: Methods (mentioning)
confidence: 99%
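
The selection loop described in this statement can be sketched roughly as follows. This is a minimal illustration, not the citing authors' implementation: the function name, the toy criterion, and the feature labels are all hypothetical, and the sketch assumes the common greedy form of SFS in which the best-scoring candidate is added each round. The statement itself only says that features are added sequentially until the ranking criterion stops increasing.

```python
from typing import Callable, Iterable, List, Sequence


def sequential_forward_selection(
    initial: Sequence[str],
    candidates: Iterable[str],
    criterion: Callable[[List[str]], float],
) -> List[str]:
    """Greedily grow `initial` with features drawn from `candidates`.

    Each round adds the candidate that gives the highest criterion value;
    the loop stops as soon as no candidate increases the criterion.
    """
    selected = list(initial)
    remaining = [f for f in candidates if f not in selected]
    best_score = criterion(selected)

    while remaining:
        # Score every one-feature extension of the current set.
        scored = [(criterion(selected + [f]), f) for f in remaining]
        top_score, top_feature = max(scored, key=lambda pair: pair[0])
        if top_score <= best_score:
            break  # the criterion no longer increases
        selected.append(top_feature)
        remaining.remove(top_feature)
        best_score = top_score
    return selected


# Hypothetical toy criterion: reward informative features, penalize set size.
USEFUL = {"f1": 0.30, "f2": 0.25, "f3": 0.20, "f7": 0.15, "f9": 0.10}


def toy_criterion(features: List[str]) -> float:
    return sum(USEFUL.get(f, 0.0) for f in features) - 0.05 * len(features)


# Three initial features (standing in for the Top-50 picks) plus a candidate pool.
chosen = sequential_forward_selection(
    ["f1", "f2", "f3"], ["f4", "f7", "f8", "f9"], toy_criterion
)
print(chosen)
```

In practice the criterion would be a cross-validated performance score of a model trained on the candidate feature set, which makes each round considerably more expensive than in this toy setting.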
“…K-fold cross-validation (Chou and Zhang, 1995; Kohavi, 1995; Zhang et al, 2012a,b, 2015; Liu et al, 2015a; Chen X. et al, 2016; Li et al, 2016; Luo et al, 2016; Chen et al, 2017b, 2018a,b; Pan et al, 2017a; Xu et al, 2017; He et al, 2018) is one of the widely used approach to examine the ability of prediction model, and other approaches: independent dataset test and jackknife test (Chou and Shen, 2008) are also used in many applications. To reduce the computational cost, 10-fold cross validation was used to examine each model for its effectiveness in identifying ncDNA sequences.…”
Section: Methods (mentioning)
confidence: 99%
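
For readers unfamiliar with the procedure, the 10-fold split mentioned in this statement can be illustrated with a short sketch. The helper name `k_fold_indices`, the sample count, and the fixed seed are assumptions made for the example; the citing paper does not describe how its folds were built.

```python
import random
from typing import Iterator, List, Tuple


def k_fold_indices(
    n_samples: int, k: int = 10, seed: int = 0
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_indices, test_indices) for each of the k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index, so fold sizes differ by at most 1.
    folds = [indices[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# Hypothetical dataset of 103 samples split into 10 folds.
splits = list(k_fold_indices(n_samples=103, k=10))
fold_sizes = [len(test) for _, test in splits]
print(fold_sizes)
```

Each sample lands in exactly one test fold, so every sample is used for evaluation once and for training k - 1 times. The jackknife test named in the statement corresponds to the extreme case where k equals the number of samples, which is why 10-fold validation is the cheaper option.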
“…According to PSSM, 20 features were extracted using the following formula [27, 28, 29]: where indicates the average score of the amino acid residue at each position of the sequence S as mutated by the amino acid residue j during evolution.…”
Section: Methods (mentioning)
confidence: 99%
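
The formula itself did not survive in the quoted statement. Judging from the description (an average score per substituting residue j over all positions of the sequence S), a plausible reading is the column mean of the PSSM, P̄_j = (1/L) Σ_i P_{i,j} for j = 1, …, 20, where L is the sequence length. That reading is an assumption, not a reconstruction of the citing paper's equation, and the function name and toy matrix below are hypothetical.

```python
from typing import List


def pssm_column_means(pssm: List[List[float]]) -> List[float]:
    """Average each of the 20 PSSM columns over all sequence positions.

    `pssm` has one row per sequence position and 20 scores per row,
    one for each possible substituting amino acid.
    """
    if not pssm:
        raise ValueError("PSSM must contain at least one row")
    if any(len(row) != 20 for row in pssm):
        raise ValueError("each PSSM row must have 20 scores")
    length = len(pssm)
    return [sum(row[j] for row in pssm) / length for j in range(20)]


# Toy 3-residue PSSM with made-up scores (not real profile output).
toy_pssm = [[float(i + j) for j in range(20)] for i in range(3)]
features = pssm_column_means(toy_pssm)
print(len(features))
```

The result is a fixed-length vector of 20 values regardless of sequence length, which is what allows sequences of different lengths to be fed to the same classifier.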