2023
DOI: 10.1073/pnas.2214069120

Classification of domains in predicted structures of the human proteome

Abstract: Recent advances in protein structure prediction have generated accurate structures of previously uncharacterized human proteins. Identifying domains in these predicted structures and classifying them into an evolutionary hierarchy can reveal biological insights. Here, we describe the detection and classification of domains from the human proteome. Our classification indicates that only 62% of residues are located in globular domains. We further classify these globular domains and observe that the majority (65%…

Cited by 24 publications (33 citation statements)
References 65 publications
“…Because of this difference, we more closely examined the well-assigned domains from those proteomes and enumerated the commonalities and differences among the and are more enriched in mammals and zebrafish. We found a similar enrichment of zinc-binding domains in our earlier classification of the predicted structures of the human proteome [27]. The EGF-like domain (Fig 3G), a small disulfide-bonded pair of β-sheets, is commonly an extracellular signaling factor but is implicated in a variety of roles in the extracellular environment, which likely explains their relative lack in bacteria and fungi.…”
Section: Commonalities and Differences Between The Classification Of ...
Citation type: supporting
confidence: 74%
“…A limitation of this study is that the target variation set must map to globular domains to allow interrogation of changes in chaperone binding across different variants. >60% of residues in the human proteome are found in globular domains [12], thus our approach has broad applicability for evaluating structure-function and genotype-phenotype relationships in humans. One can also envision instances where decreased chaperone binding to variants is informative, such as by binding of a co-factor or compound that induces protein stabilization, or a mutation that switches an intrinsically disordered protein to a well-structured protein.…”
Section: Discussion
Citation type: mentioning
confidence: 99%
“…Protein folding is a fundamental process frequently disrupted by pathogenic mutations. Most proteins fold into specific three-dimensional structures encoded by the primary amino acid sequence [9][10][11][12]. Globular domains are the most well-characterized fundamental units of protein structure, which generally fold independently from the rest of the protein.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…AlphaFold (AF), a recently developed deep learning method, demonstrated the capability to predict protein structure with atomic-level accuracy and has thus become an indispensable tool in structural biology (Jumper et al 2021). Utilizing AF models, ECOD became one of the first databases to incorporate domain classification both for the entire human proteome (Schaeffer et al 2023) and the whole proteomes of 48 model organisms (Schaeffer et al 2024). AlphaFold has significantly expanded the available tools for computational structural biology related to drug discovery, target prediction, protein-protein and protein-ligand interaction, and prediction of complex structures (Akdel et al 2022; Medvedev et al 2023a; Yang et al 2023).…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…In these cases, we applied the AlphaFill algorithm (Hekkelman et al 2023) that uses sequence and structure similarity to retrieve small molecules and ions from experimentally determined structures to predicted protein models by AlphaFold. Using AlphaFill models we identify residues whose atoms are located within 5 Å of the DrugBank molecule’s atoms of interest (if present) and map these residues to ECOD domains identified for the whole human proteome using AlphaFold models (Schaeffer et al 2023; Schaeffer et al 2024). If the molecule of interest is not present in the AlphaFold model, all other present ligands are considered as such.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
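
The last citation statement above describes selecting residues with atoms within 5 Å of a ligand and mapping them to ECOD domains. A minimal sketch of that idea follows. It is not the citing authors' code and does not use the AlphaFill or ECOD interfaces: the function names, the input layout (residue number plus coordinates), the inclusive cutoff, and the domain identifiers `dom_A` and `dom_B` are all hypothetical, chosen only for illustration.

```python
from math import dist  # Euclidean distance, Python 3.8+


def residues_near_ligand(protein_atoms, ligand_coords, cutoff=5.0):
    """Return sorted residue numbers with at least one atom within `cutoff` Å.

    protein_atoms: iterable of (residue_number, (x, y, z)) tuples (assumed layout).
    ligand_coords: list of (x, y, z) tuples for the ligand atoms of interest.
    """
    near = set()
    for resnum, xyz in protein_atoms:
        # One protein atom close to any ligand atom is enough to keep the residue.
        if any(dist(xyz, lig) <= cutoff for lig in ligand_coords):
            near.add(resnum)
    return sorted(near)


def map_to_domains(residues, domain_ranges):
    """Group residue numbers by the domain whose inclusive range contains them.

    domain_ranges: {domain_id: [(start, end), ...]} (hypothetical format).
    Residues outside every range are dropped.
    """
    hits = {}
    for resnum in residues:
        for domain_id, ranges in domain_ranges.items():
            if any(start <= resnum <= end for start, end in ranges):
                hits.setdefault(domain_id, []).append(resnum)
    return hits


# Toy coordinates, invented for this example only.
protein_atoms = [
    (10, (0.0, 0.0, 0.0)),
    (10, (1.5, 0.0, 0.0)),
    (11, (4.0, 0.0, 0.0)),
    (50, (20.0, 0.0, 0.0)),
    (120, (3.0, 3.0, 0.0)),
]
ligand_coords = [(2.0, 0.0, 0.0)]
domain_ranges = {"dom_A": [(1, 100)], "dom_B": [(101, 200)]}

near = residues_near_ligand(protein_atoms, ligand_coords)
by_domain = map_to_domains(near, domain_ranges)
print(near)
print(by_domain)
```

In this toy input, residue 50 lies 18 Å from the ligand and is excluded, while the remaining residues are grouped under the placeholder domain whose range contains them. Real domain definitions are often discontinuous, which is why each domain is given a list of ranges rather than a single one.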