2018
DOI: 10.3389/fgene.2018.00297

Using Supervised Learning Methods for Gene Selection in RNA-Seq Case-Control Studies

Abstract: Whole transcriptome studies typically yield large amounts of data, with expression values for all genes or transcripts of the genome. The search for genes of interest in a particular study setting can thus be a daunting task, usually relying on automated computational methods. Moreover, most biological questions imply that such a search should be performed in a multivariate setting, to take into account the inter-genes relationships. Differential expression analysis commonly yields large lists of genes deemed …

Citations: cited by 41 publications (29 citation statements)
References: 36 publications
“…For this, we used a random forest approach, utilizing the transcriptome data from the mouse hippocampi we recently produced (see Methods) [57]. Machine-learning methods are progressively being applied to rank ensembles of genes defined by their expression values measured with RNA-seq [74]. Using importance measures generated by the random forest algorithm, we identified groups of 40 apoptosis-related genes and 10 proliferation-related genes that together differentiate between the adult Angelman syndrome model mice and the wild-type (WT) littermates.…”
Section: Mouse Brain RNA-seq Data Also Reveals Alterations In Apoptot… (citation type: mentioning)
confidence: 99%
“…We tested this hypothesis by applying machine learning algorithms on two groups of heifers that were bred in 2015 (year one) and 2016 (year two). Parallel random forest emerged as the algorithm with over 90% efficiency of classification nearly all trials executed, which confirms the potential of accurate classification of samples using RNA-seq data under the case-control framework [60,61]. The results show that while not one single gene emerges as a potential biomarker, the accumulated information of transcript abundance from multiple genes can be powerful for the identification of fertility potential in cattle.…”
Section: Discussion (citation type: mentioning)
confidence: 53%
“…We tested this hypothesis by applying machine learning algorithms on two groups of heifers that were bred in 2015 (year one) and 2016 (year two). Parallel random forest emerged as the algorithm with over 90% efficiency of classification nearly all trials executed, which confirms the potential of accurate classification of samples using RNA-seq data under the case-control framework [69,70]. The results show that while not one single gene emerges as a potential biomarker, the accumulated information of transcript abundance from multiple genes can be powerful for the identification of fertility potential in cattle.…”
Section: Discussion (citation type: mentioning)
confidence: 54%