2018
DOI: 10.1038/s41592-018-0138-4

Deep generative models of genetic variation capture the effects of mutations

Abstract: The functions of proteins and RNAs are defined by the collective interactions of many residues, and yet most statistical models of biological sequences consider sites nearly independently. Recent approaches have demonstrated benefits of including interactions to capture pairwise covariation, but leave higher-order dependencies out of reach. Here we show how it is possible to capture higher-order, context-dependent constraints in biological sequences via latent variable models with nonlinear dependencies. We fo…

Cited by 518 publications (835 citation statements)
References 78 publications
“…The autoregressive model is consistently able to predict the effects of mutations across a wide array of proteins and experimental assays (Figure 2a). When compared to other generative models trained on alignments of the same sequences, the autoregressive model is able to consistently match or outperform a model with only site-independent terms (30/40 datasets) and the EVmutation model with pairwise dependencies (30/40 datasets; Hopf et al, 2017); and it competitively matches the state-of-the-art results of DeepSequence (19/40 datasets; Riesselman et al, 2018).…”
Section: The Generative Model Predicts Experimental Mutation Effects
Citation type: mentioning
confidence: 84%
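
The statement above compares several generative models trained on the same alignments, the simplest being "a model with only site-independent terms". As a rough illustration of that baseline only, the sketch below estimates per-column amino acid frequencies from a toy alignment and scores a substitution as a log-ratio of mutant to wild-type probability. The log-ratio scoring rule, the pseudocount, the toy sequences, and the names `site_frequencies` and `mutation_score` are all assumptions made for illustration; none of them is taken from the quoted papers.

```python
import math
from collections import Counter

# Standard 20-letter amino acid alphabet (gaps and other symbols are ignored).
ALPHABET = "ACDEFGHIKLMNPQRSTVWY"


def site_frequencies(alignment, pseudocount=1.0):
    """Estimate per-column amino acid frequencies from aligned sequences.

    Assumption: a flat pseudocount is added to every residue so that
    unseen residues keep a non-zero probability.
    """
    length = len(alignment[0])
    freqs = []
    for i in range(length):
        counts = Counter(seq[i] for seq in alignment if seq[i] in ALPHABET)
        total = sum(counts.values()) + pseudocount * len(ALPHABET)
        freqs.append({a: (counts[a] + pseudocount) / total for a in ALPHABET})
    return freqs


def mutation_score(freqs, position, wild_type, mutant):
    """Log-ratio of mutant to wild-type probability at one site.

    Assumption: more negative values are read as more deleterious.
    """
    return math.log(freqs[position][mutant]) - math.log(freqs[position][wild_type])


# Hypothetical toy alignment: four sequences, four columns.
toy_alignment = ["MKVL", "MKIL", "MRVL", "MKVL"]
freqs = site_frequencies(toy_alignment)

# V -> I is observed in the alignment at column 2; V -> W is not.
score_common = mutation_score(freqs, 2, "V", "I")
score_rare = mutation_score(freqs, 2, "V", "W")
print(score_common, score_rare)
```

Because each column is scored on its own, a model like this cannot represent the pairwise or higher-order dependencies that the other models named in the statement are designed to capture.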
“…Mutation effects, sequence families, and previous effect predictions for validation were curated from published work (Riesselman et al, 2018). The naïve llama immune repertoire was acquired from McCoy et al (2014).…”
Section: Data Collection
Citation type: mentioning
confidence: 99%
“…This imposes a great burden on experimental approaches aiming to design novel protein sequences, such as random mutagenesis [4] and recombination of naturally occurring homologous proteins [8,9], as up to 70% of random amino acid substitutions typically result in a decline of protein activity and 50% are deleterious to protein function [4,10-16]. On the other hand, Artificial intelligence (AI) is not limited by the amount of sequence variations it can process [17-19] and, instead of depending on a blind search process, is based on an inference-based one - it infers protein properties [18,20] and function [19,21] directly from training examples.…”
Section: Manuscript
Citation type: mentioning
confidence: 99%