2019
DOI: 10.1371/journal.pone.0225574

Machine learning approach to single nucleotide polymorphism-based asthma prediction

Abstract: Machine learning (ML) is poised as a transformational approach uniquely positioned to discover the hidden biological interactions for better prediction and diagnosis of complex diseases. In this work, we integrated ML-based models for feature selection and classification to quantify the risk of individual susceptibility to asthma using single nucleotide polymorphism (SNP). Random forest (RF) and recursive feature elimination (RFE) algorithm were implemented to identify the SNPs with high implication to asthma.…

citations
Cited by 44 publications
(21 citation statements)
references
References 23 publications
“…Support vector machine is among the robust ML algorithms used for classification and regression (Ghandi et al, 2014; Gaudillo et al, 2019). It searches for the optimal hyperplane with maximized margins using support vectors for classification (Ben-Hur et al, 2008).…”
Section: Methods (mentioning)
confidence: 99%
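
To make the "optimal hyperplane with maximized margins" in the statement above concrete, here is a minimal sketch in plain Python. It does not train an SVM. It assumes a hand-picked hyperplane (`w`, `b`) for a made-up two-feature toy dataset and only shows how the margin width and the support vectors follow from it. All names and data are illustrative and do not come from the cited papers.

```python
import math

def decision_value(w, b, x):
    """Signed score w.x + b; its sign is the predicted class."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def margin_width(w):
    """Distance between the two margin boundaries w.x + b = +1 and -1."""
    return 2.0 / math.sqrt(sum(wi * wi for wi in w))

def support_vectors(w, b, points, tol=1e-9):
    """Points lying exactly on a margin boundary, i.e. y * (w.x + b) == 1."""
    return [x for x, y in points
            if abs(y * decision_value(w, b, x) - 1.0) <= tol]

# Toy data (assumption): class -1 on the left, class +1 on the right.
points = [
    ((1.0, 1.0), -1), ((1.0, 3.0), -1), ((0.0, 2.0), -1),
    ((3.0, 1.0), 1), ((3.0, 3.0), 1), ((4.0, 2.0), 1),
]

# Hand-picked separating hyperplane x1 = 2 (assumption, not a fitted model).
w, b = (1.0, 0.0), -2.0

width = margin_width(w)
svs = support_vectors(w, b, points)
```

Only the four points on the margin boundaries determine the hyperplane here; moving the two outer points further away would leave it unchanged.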
“…Machine learning (ML) is an innovative and powerful approach used in solving complex problems in various fields and disciplines due to its capability to handle and analyze high-dimensional datasets [22,23,24]. Several studies have already demonstrated the usability of ML in genomic datasets [25,26,27]; however, to our knowledge, there is only a handful of existing literature discussing its application to SNP-set formation [28,29,30,31]. These studies employed cluster analysis to form SNP-sets in a data-driven manner.…”
Section: Introduction (mentioning)
confidence: 99%
“…For a more varied selection of SNPs to analyze, dimensionality reduction techniques based on random forest (RF) could be used to reduce dataset dimensions before conducting cluster analysis. RF has been widely incorporated in SNP research [25,34,35,36] due to its significant properties: (1) a nonparametric nature that allows the establishment of predictive models without the need for preliminary statistical assumptions, and (2) the capability to provide an importance score, i.e. variable importance measure (VIM) for each SNP, which increases the probability of detecting highly relevant biomarkers.…”
Section: Introduction (mentioning)
confidence: 99%
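
The per-SNP importance score mentioned in the statement above can be illustrated with a simplified sketch. A random forest typically averages impurity decreases over many trees grown on bootstrap samples; the code below computes only the best single-split Gini decrease per SNP on the full sample, which is a stand-in for that measure and not the forest itself. The genotype coding (0/1/2 minor-allele counts), the SNP names, and the data are assumptions made for illustration.

```python
def gini(labels):
    """Gini impurity of a list of binary labels (0 = control, 1 = case)."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 2.0 * p * (1.0 - p)

def best_gini_decrease(genotypes, labels):
    """Largest impurity decrease over the two possible genotype thresholds."""
    parent = gini(labels)
    n = len(labels)
    best = 0.0
    for threshold in (0, 1):  # split: genotype <= threshold vs. > threshold
        left = [y for g, y in zip(genotypes, labels) if g <= threshold]
        right = [y for g, y in zip(genotypes, labels) if g > threshold]
        child = (len(left) * gini(left) + len(right) * gini(right)) / n
        best = max(best, parent - child)
    return best

# Hypothetical toy data: 4 controls followed by 4 cases.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
snps = {
    "snp_a": [0, 0, 0, 1, 1, 2, 2, 2],  # tracks case status closely
    "snp_b": [0, 1, 2, 0, 1, 2, 0, 1],  # nearly unrelated to case status
}

importance = {name: best_gini_decrease(g, labels) for name, g in snps.items()}
ranked = sorted(importance, key=importance.get, reverse=True)
```

Ranking SNPs by such a score is what allows a later step to keep only the highest-scoring markers.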
“…The advent of inexpensive whole genome sequencing methods in recent years has led to the creation of supervised machine learning approaches for predicting putative genetic variants from sequence data. Machine learning methods can effectively investigate the entire genome and provide insight into the subset of variants most likely to influence a particular phenotype, which is particularly useful for disorders or traits with complex, non-Mendelian inheritance patterns [1][2][3][4][5]. Since the high dimensionality of variant feature sets paired with a comparatively low number of training samples tends to result in model overfitting, feature selection methods are often used to narrow the genomic search space and improve model generalizability.…”
Section: Introduction (mentioning)
confidence: 99%
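
One feature selection method of the kind described above is recursive feature elimination (RFE), which the cited paper's abstract names. The sketch below is a generic, stdlib-only illustration of the idea: fit a model, drop the feature with the smallest absolute weight, and refit until the desired number of features remains. The logistic-regression learner, the hyperparameters, the SNP names, and the toy genotypes are all assumptions; this is not the cited paper's pipeline, which pairs RFE with random forest.

```python
import math

def fit_logistic(rows, labels, epochs=500, lr=0.1):
    """Batch gradient-descent logistic regression; returns (weights, bias)."""
    n, n_features = len(rows), len(rows[0])
    w, b = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for x, y in zip(rows, labels):
            z = sum(wi * xi for wi, xi in zip(w, x)) + b
            err = 1.0 / (1.0 + math.exp(-z)) - y  # predicted prob. minus label
            for j, xj in enumerate(x):
                grad_w[j] += err * xj
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def rfe(rows, labels, names, n_keep):
    """Repeatedly refit and drop the feature with the smallest |weight|."""
    kept = list(range(len(names)))
    while len(kept) > n_keep:
        sub = [[row[j] for j in kept] for row in rows]
        w, _ = fit_logistic(sub, labels)
        weakest = min(range(len(kept)), key=lambda i: abs(w[i]))
        del kept[weakest]
    return [names[j] for j in kept]

# Hypothetical toy genotypes (0/1/2), columns: snp_a, snp_b, snp_c.
names = ["snp_a", "snp_b", "snp_c"]
rows = [
    [0, 0, 2], [0, 1, 0], [0, 2, 1], [1, 0, 1],  # controls
    [1, 1, 1], [2, 2, 0], [2, 0, 2], [2, 1, 1],  # cases
]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

selected = rfe(rows, labels, names, n_keep=1)
```

Refitting after every removal is what distinguishes RFE from a one-shot ranking: a feature's weight can change once a correlated feature has been dropped.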