The dominant view of protein structure-function is that an amino acid sequence specifies a (mostly) fixed three-dimensional (3-D) structure that is a prerequisite to protein function. In contrast to this view, many proteins display functions that require the disordered state. Our purpose here is to provide a catalogue of disorder-function relationships. The important molecular details of each example can be obtained from the references provided or from several excellent reviews and commentaries (1-9).

For ordered protein, the ensemble members all have the same time-averaged canonical set of Ramachandran angles along their backbones. For intrinsically disordered protein, the ensemble members have different (and typically dynamic) Ramachandran angles. Such disorder has been characterized by a variety of methods, including x-ray crystallography, NMR spectroscopy, CD spectroscopy, and protease sensitivity, to name several. Each of these methods has advantages and limitations that are discussed in more detail elsewhere (10). Although a few disordered proteins and regions have been characterized by several methods, as noted below, it would be useful to have more examples with multiple methods of characterization.

In attempts to discover generalities from the known disorder examples, we recently used bioinformatics coupled with data mining (11-15). The results suggested that thousands of natively disordered proteins exist, representing a very substantial fraction of the proteins in the commonly used sequence databases (13,16). From these and related database predictions and from a set of functionally important disordered proteins, Wright and Dyson (17) called for a reassessment of the view that 3-D structure is always a prerequisite to protein function.

In this article, we discuss the following topics: 1. how common is intrinsic disorder?; 2. intrinsic disorder in vivo; 3. functional annotations for 90 proteins having physically characterized regions of disorder; 4. disordered regions without known function; 5. a structure-function proposal called "the protein trinity"; 6. the functional repertoires of ordered and disordered protein; and 7. the need for a Disordered Protein Database (DisProt) to complement the Protein Data Bank (PDB).

How Common is Intrinsic Disorder?

A series of predictors of natural disordered regions (PONDRs) have been developed that take amino acid sequence as input and give intrinsic order or disorder tendencies as output (11,14,15,18,19). The various PONDRs are distinguished by different training sets, by different data representations for their inputs, and by different machine learning models for their development.

For PONDR VL-XT, currently the best characterized of the PONDRs, only 6% of more than 900 non-homologous proteins spanning PDB gave false positive predictions of disorder ≥ 40 consecutive amino acids in length. Even this 6% may be an over-estimate of the false positive error rate, however, because many of these predicted disordered regions are involved in ligand bindin...
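The "≥ 40 consecutive amino acids" criterion above can be illustrated with a minimal Python sketch. It assumes each protein is represented by a list of per-residue disorder scores and that a residue counts as predicted-disordered when its score is at or above 0.5; the function names, the score representation, and the 0.5 cutoff are assumptions made for illustration, not details taken from the PONDR publications.

```python
from typing import List, Sequence, Tuple


def long_disorder_runs(
    scores: Sequence[float],
    threshold: float = 0.5,  # assumed cutoff, not taken from the source
    min_len: int = 40,       # run length quoted in the text
) -> List[Tuple[int, int]]:
    """Return half-open (start, end) index pairs for runs of at least
    min_len consecutive residues whose score is >= threshold."""
    runs = []
    start = None
    for i, score in enumerate(scores):
        if score >= threshold:
            if start is None:
                start = i  # a new run begins here
        else:
            if start is not None and i - start >= min_len:
                runs.append((start, i))
            start = None
    # Close a run that extends to the end of the sequence.
    if start is not None and len(scores) - start >= min_len:
        runs.append((start, len(scores)))
    return runs


def long_run_false_positive_rate(
    ordered_proteins: Sequence[Sequence[float]], **kwargs
) -> float:
    """Fraction of (assumed fully ordered) proteins that contain at least
    one long predicted-disordered run."""
    flagged = sum(1 for s in ordered_proteins if long_disorder_runs(s, **kwargs))
    return flagged / len(ordered_proteins)
```

Under these assumptions, a set of ordered proteins in which 6% contain such a run would give a value of 0.06 from `long_run_false_positive_rate`.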
Reversible protein phosphorylation provides a major regulatory mechanism in eukaryotic cells. Because the amino acid residues flanking the relatively limited number of experimentally identified phosphorylation sites are highly variable, reliable prediction of such sites remains an important issue. Here we report the development of a new web-based tool for the prediction of protein phosphorylation sites, DISPHOS (DISorder-enhanced PHOSphorylation predictor, http://www.ist.temple.edu/DISPHOS). We observed that the amino acid composition, sequence complexity, hydrophobicity, charge and other sequence attributes of regions adjacent to phosphorylation sites are very similar to those of intrinsically disordered protein regions. Thus, DISPHOS uses position-specific amino acid frequencies and disorder information to improve the discrimination between phosphorylation and non-phosphorylation sites. Based on estimates of phosphorylation rates in various protein categories, the outputs of DISPHOS are adjusted to reduce the total number of misclassified residues. When tested on an equal number of phosphorylated and non-phosphorylated residues, the accuracy of DISPHOS reaches 76% for serine, 81% for threonine and 83% for tyrosine. The significant enrichment in disorder-promoting residues surrounding phosphorylation sites, together with the results obtained by applying DISPHOS to various protein functional classes and proteomes, provides strong support for the hypothesis that protein phosphorylation predominantly occurs within intrinsically disordered protein regions.
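As an illustration of what "position-specific amino acid frequencies" around candidate sites might look like in practice, here is a minimal Python sketch. It is not the DISPHOS implementation: the window half-width (`flank=12`), the `-` padding character, and both function names are hypothetical choices made for this example.

```python
from collections import Counter
from typing import Dict, Iterable, Iterator, List, Tuple

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def windows_around(
    sequence: str, residue: str, flank: int = 12  # flank is an assumed value
) -> Iterator[Tuple[int, str]]:
    """Yield (position, window) for every occurrence of `residue`, where the
    window spans `flank` residues on each side, padded with '-' at the ends."""
    padded = "-" * flank + sequence + "-" * flank
    for i, aa in enumerate(sequence):
        if aa == residue:
            yield i, padded[i : i + 2 * flank + 1]


def position_frequencies(windows: Iterable[str]) -> List[Dict[str, float]]:
    """For each window column, return the relative frequency of each amino
    acid observed there; padding characters are ignored."""
    windows = list(windows)
    if not windows:
        return []
    freqs = []
    for col in range(len(windows[0])):
        counts = Counter(w[col] for w in windows if w[col] in AMINO_ACIDS)
        total = sum(counts.values())
        freqs.append({aa: n / total for aa, n in counts.items()} if total else {})
    return freqs
```

Computing such tables separately for windows around known phosphorylated residues and around non-phosphorylated residues of the same type would give per-position features that a classifier could compare; how DISPHOS combines them with disorder information is described in the original paper, not here.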
SUMMARY
Just as reference genome sequences revolutionized human genetics, reference maps of interactome networks will be critical to fully understand genotype-phenotype relationships. Here, we describe a systematic map of ~14,000 high-quality human binary protein-protein interactions. At equal quality, this map is ~30% larger than what is available from small-scale studies published in the literature in the last few decades. While currently available information is highly biased and covers only a relatively small portion of the proteome, our systematic map appears strikingly more homogeneous, revealing a “broader” human interactome network than currently appreciated. The map also uncovers significant inter-connectivity between known and candidate cancer gene products, providing unbiased evidence for an expanded functional cancer landscape, while demonstrating how high-quality interactome models will help “connect the dots” of the genomic revolution.
SUMMARY
Genomic structural variants (SVs) are abundant in humans, differing from other variation classes in extent, origin, and functional impact. Despite progress in SV characterization, the nucleotide-resolution architecture of most SVs remains unknown. We constructed a map of unbalanced SVs (i.e., copy number variants) based on whole-genome DNA sequencing data from 185 human genomes, integrating evidence from complementary SV discovery approaches with extensive experimental validations. Our map encompassed 22,025 deletions and 6,000 additional SVs, including insertions and tandem duplications. Most SVs (53%) were mapped to nucleotide resolution, which facilitated analyzing their origin and functional impact. We examined numerous whole and partial gene deletions with a genotyping approach and observed a depletion of gene disruptions amongst high-frequency deletions. Furthermore, we observed differences in the size spectra of SVs originating from distinct formation mechanisms, and constructed a map of SV hotspots formed by common mechanisms. Our analytical framework and SV map serve as a resource for sequencing-based association studies.