2008
DOI: 10.1016/j.jmb.2007.12.076

Discrimination between Distant Homologs and Structural Analogs: Lessons from Manually Constructed, Reliable Data Sets

Abstract: A natural way to study protein sequence, structure, and function is to put them in the context of evolution. Homologs inherit similarities from their common ancestor, while analogs converge to similar structures due to a limited number of energetically favorable ways to pack secondary structural elements. Using novel strategies, we previously assembled two reliable databases of homologs and analogs. In this study, we compare these two data sets and develop a support vector machine (SVM)-based classifier to dis…

citations
Cited by 24 publications
(21 citation statements)
references
References 68 publications
“…In agreement with trends described in a recent paper [36], the number of corresponding residues in the structural alignments between DUF structures assigned as putative analogs with their potential homologs tend to be smaller and the corresponding Cα RMSD values of these pairs tend to be higher than the values for the other two categories of proteins assigned as putative homologs and recognizable homologs (see Figure 3C). However, the profiles of structural similarities in these three groups are very similar, suggesting that most proteins from the group of putative analogs may be, in fact, distant, but not readily recognizable, homologs of previously characterized protein families.…”
Section: Results | Citation type: supporting
confidence: 91%
“…Our results suggest that the limit of sequence identity for successful WS-MR search is low enough to allow our method to extend to models that would otherwise be missed by methods that are based on sequence alignment for template selection (29). Both remote homologues and structural analogs (30) can be detected by WS-MR, with specific examples where models with an identity of 11.6% and an RMSD under 3 Å can be correctly placed and distinguished from negative results. We also show that low completeness with structure coverage of as little as 12% can be sufficient for good WS-MR template models, however in these cases high sequence identity and structural similarity for the covered area are required.…”
Section: Discussion | Citation type: mentioning
confidence: 88%
“…MALISAM contains 130 protein pairs and the two proteins in any pair are structural analogs with different SCOP [37] folds. There is strong evidence indicating that proteins in a MALIDUP pair are not homologs [38]. Therefore, MALIDUP are the most challenging benchmark among these three.…”
Section: Results | Citation type: mentioning
confidence: 99%