2022
DOI: 10.1038/s41598-021-04260-1

Extracting phylogenetic dimensions of coevolution reveals hidden functional signals

Abstract: Despite the structural and functional information contained in the statistical coupling between pairs of residues in a protein, coevolution associated with function is often obscured by artifactual signals such as genetic drift, which shapes a protein’s phylogenetic history and gives rise to concurrent variation between protein sequences that is not driven by selection for function. Here, we introduce a background model for phylogenetic contributions of statistical coupling that separates the coevolution signa…

Cited by 15 publications (16 citation statements)
References 77 publications

“…Separating coevolutionary signals encoding functional and structural constraints from phylogenetic correlations arising from historical contingency constitutes a key problem in analyzing the sequence-to-function mapping in proteins [15,18]. Phylogenetic correlations are known to obscure the identification of structural contacts by traditional coevolution methods, in particular by inferred Potts models [20,21,41–44], motivating various corrections [17,21,22,24,45–48]. From a theoretical point of view, disentangling these two types of signals is a fundamentally hard problem [49].…”
Section: Discussion (mentioning)
confidence: 99%
“…Thus, disentangling signals was much harder than in the minimal model, as the couplings from phylogeny make the model richer even in the absence of phylogeny in the data generation step. While this is a difficult problem, it could be partially addressed by applying phylogeny corrections to the inferred couplings [51,57]. This could also shed light on whether some of the useful signal from non-contact pairs is coming from collective functional constraints, similar to sectors in single proteins [29,57,85], an interesting possibility that was not explored here.…”
Section: Discussion (mentioning)
confidence: 99%
“…While this is a difficult problem, it could be partially addressed by applying phylogeny corrections to the inferred couplings [51,57]. This could also shed light on whether some of the useful signal from non-contact pairs is coming from collective functional constraints, similar to sectors in single proteins [29,57,85], an interesting possibility that was not explored here. Investigating the impact of imperfections in protein sequence alignment on partner inference would also be highly relevant, as well as including the possibility of one-to-many pairings and crosstalk.…”
Section: Discussion (mentioning)
confidence: 99%
“…Therefore, correlations from contacts and from phylogeny are both useful to predict protein-protein interaction partners among paralogs. This stands in contrast with the identification of structural contacts by DCA [5,6,30,49–51], where phylogenetic correlations obscure structural ones, motivating the use of phylogeny corrections [52,53], such as the Average Product Correction [54,55], reweighting close sequences [6,7,55,56], and Nested Coevolution [57].…”
Section: Introduction (mentioning)
confidence: 99%