Understanding the evolution of a protein, including both close and distant relationships, often reveals insight into its structure and function. Fast and easy access to such up-to-date information facilitates research. We have developed a hierarchical evolutionary classification of all proteins with experimentally determined spatial structures, and presented it as an interactive and updatable online database. ECOD (Evolutionary Classification of protein Domains) is distinct from other structural classifications in that it groups domains primarily by evolutionary relationships (homology), rather than topology (or “fold”). This distinction highlights cases of homology between domains of differing topology to aid in understanding of protein structure evolution. ECOD uniquely emphasizes distantly related homologs that are difficult to detect, and thus catalogs the largest number of evolutionary links among structural domain classifications. Placing distant homologs together underscores the ancestral similarities of these proteins and draws attention to the most important regions of sequence and structure, as well as conserved functional sites. ECOD also recognizes closer sequence-based relationships between protein domains. Currently, approximately 100,000 protein structures are classified in ECOD into 9,000 sequence families clustered into close to 2,000 evolutionary groups. The classification is assisted by an automated pipeline that quickly and consistently classifies weekly releases of PDB structures and allows for continual updates. This synchronization with PDB uniquely distinguishes ECOD among all protein classifications. Finally, we present several case studies of homologous proteins not recorded in other classifications, illustrating the potential of how ECOD can be used to further biological and evolutionary studies.
SUMMARY The WAVE regulatory complex (WRC) controls actin cytoskeletal dynamics throughout the cell by stimulating the actin nucleating activity of the Arp2/3 complex at distinct membrane sites. However, the factors that recruit the WRC to specific locations remain poorly understood. Here we have identified a large family of potential WRC ligands, consisting of ~120 diverse membrane proteins including protocadherins, ROBOs, netrin receptors, Neuroligins, GPCRs and channels. Structural, biochemical and cellular studies reveal that a novel sequence motif that defines these ligands binds to a highly conserved interaction surface of the WRC formed by the Sra and Abi subunits. Mutating this binding surface in flies resulted in defects in actin cytoskeletal organization and egg morphology during oogenesis, leading to female sterility. Our findings directly link diverse membrane proteins to the WRC and actin cytoskeleton, and have broad physiological and pathological ramifications in metazoans.
The Critical Assessment of Protein Structure Prediction round 9 (CASP9) aimed to evaluate predictions for 129 experimentally determined protein structures. To assess tertiary structure predictions, these target structures were divided into domain-based evaluation units that were then classified into two assessment categories: template based modeling (TBM) and template free modeling (FM). CASP9 targets were split into domains of structurally compact evolutionary modules. For the targets with more than one defined domain, the decision to split structures into domains for evaluation was based on server performance. Target domains were categorized based on their evolutionary relatedness to existing templates as well as their difficulty levels indicated by server performance. Those target domains with sequence-related templates and high server prediction performance were classified as TMB, while those targets without identifiable templates and low server performance were classified as FM. However, using these generalizations for classification resulted in a blurred boundary between CASP9 assessment categories. Thus, the FM category included those domains without sequence detectable templates (25 target domains) as well as some domains with difficult to detect templates whose predictions were as poor as those without templates (5 target domains). Several interesting examples are discussed, including targets with sequence related templates that exhibit unusual structural differences, targets with homologous or analogous structure templates that are not detectable by sequence, and targets with new folds.
Results of the recent Critical Assessment of Techniques for Protein Structure Prediction, CASP8, present several valuable sources of information. First, CASP targets comprise a realistic sample of currently solved protein structures and exemplify the corresponding challenges for predictors. Second, the plethora of predictions by all possible methods provides an unusually rich material for evolutionary analysis of target proteins. Third, CASP results show the current state of the field and highlight specific problems in both predicting and assessing. Finally, these data can serve as grounds to develop and analyze methods for assessing prediction quality. Here we present results of our analysis in these areas. Our objective is not to duplicate CASP assessment, but to use our unique experience as former CASP5 assessors and CASP8 predictors to (i) offer more insights into CASP targets and predictions based on expert analysis, including invaluable analysis prior to target structure release; and (ii) develop an assessment methodology tailored towards current challenges in the field. Specifically, we discuss preparing target structures for assessment, parsing protein domains, balancing evaluations based on domains and on whole chains, dividing targets into categories and developing new evaluation scores. We also present evolutionary analysis of the most interesting and challenging targets.Database URL: Our results are available as a comprehensive database of targets and predictions at http://prodata.swmed.edu/CASP8.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2025 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.