2019
DOI: 10.1073/pnas.1814551116

Evolutionarily informed deep learning methods for predicting relative transcript abundance from DNA sequence

Abstract: Deep learning methodologies have revolutionized prediction in many fields and show potential to do the same in molecular biology and genetics. However, applying these methods in their current forms ignores evolutionary dependencies within biological systems and can result in false positives and spurious conclusions. We developed two approaches that account for evolutionary relatedness in machine learning models: (i) gene-family-guided splitting and (ii) ortholog contrasts. The first approach accounts for evolu…
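The abstract is truncated before it describes gene-family-guided splitting in detail, so the following is only a minimal sketch of the general idea as it is commonly understood: every gene in a family is placed on the same side of the train/test split, so a model cannot score well merely by recognizing family members it has already seen. The function name `family_guided_split`, the `test_fraction` parameter, and the toy gene and family labels are all hypothetical and are not taken from the paper.

```python
import random
from collections import defaultdict


def family_guided_split(gene_to_family, test_fraction=0.2, seed=0):
    """Split genes into train/test so that no gene family spans both sets.

    Hypothetical helper for illustration; not the authors' implementation.
    """
    # Group gene identifiers by their family label.
    families = defaultdict(list)
    for gene, family in gene_to_family.items():
        families[family].append(gene)

    # Shuffle families (not individual genes) reproducibly.
    family_ids = sorted(families)
    random.Random(seed).shuffle(family_ids)

    # Fill the test set family by family until it reaches the target size;
    # all remaining families go to the training set.
    n_test_target = test_fraction * len(gene_to_family)
    train, test = [], []
    for fam in family_ids:
        if len(test) < n_test_target:
            test.extend(families[fam])
        else:
            train.extend(families[fam])
    return train, test


# Toy data: gene IDs mapped to made-up family labels.
genes = {
    "g1": "famA", "g2": "famA",
    "g3": "famB",
    "g4": "famC", "g5": "famC",
    "g6": "famD",
}
train_genes, test_genes = family_guided_split(genes, test_fraction=0.3)
```

Because whole families are assigned at once, the realized test fraction only approximates `test_fraction`; with a few large families the deviation can be substantial.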

Cited by 151 publications (140 citation statements)
References 44 publications
“…At the same time, models trained using functional annotations currently assigned to only one or several homologous gene families may learn signatures of those gene families rather than the annotated function itself. In these cases, the accuracies calculated may be higher than the true values (Washburn et al, 2019). Going forward, there is a clear need for curated sets of experimentally supported functional annotations for maize equivalent to those previously generated for species such as yeast and arabidopsis (Aslett and Wood, 2006;Lamesch et al, 2011).…”
Section: Discussion (mentioning)
confidence: 99%
“…Novel DL architectures are continuously developed (Angermueller et al, 2016; Ching et al, 2018; Lecun et al, 2015; Min et al, 2017; Tran et al, 2017), which includes deep neural networks (DNN), convolutional neural networks (CNNs), recurrent neural networks (RNNs) and auto‐encoders (Pérez‐Enciso and Zingaretti, 2019). There are multiple examples for applications of these newly developed architectures in plant biology (Gao et al, 2018; Ghosal et al, 2018; Wang et al, 2009; Washburn et al, 2019). Deep learning (DL) has met popularity in numerous applications dealing with raster‐based data (e.g.…”
Section: Panomics Platform and Systems Modelling For Germplasm Improv… (mentioning)
confidence: 99%
“…based on genomic features [24]. Clearly, if not controlled for, the association between average gene expression and the odds of a gene being identified as differentially expressed would lead to a misleading estimate of prediction accuracy.…”
Section: Supervised Classification Algorithms Can Accurately Predict (mentioning)
confidence: 99%