Abstract Recent years witnessed increased interest in intrinsically disordered proteins and regions. These proteins and regions are abundant and possess unique structural features and a broad functional repertoire that complements ordered proteins. However, modern studies on the abundance and functions of intrinsically disordered proteins and regions are relatively limited in size and scope of their analysis. To fill this gap, we performed a broad and detailed computational analysis of over 6 million proteins from 59 archaea, 471 bacterial, 110 eukaryotic and 325 viral proteomes. We used arguably more accurate consensus-based disorder predictions, and for the first time comprehensively characterized intrinsic disorder at proteomic and protein levels from all significant perspectives, including abundance, cellular localization, functional roles, evolution, and impact on structural coverage. We show that intrinsic disorder is more abundant and has a unique profile in eukaryotes. We map disorder into archaea, bacterial and eukaryotic cells, and demonstrate that it is preferentially located in some cellular compartments. Functional analysis that considers over 1,200 annotations shows that certain functions are exclusively implemented by intrinsically disordered proteins and regions, and that some of them are specific to certain domains of life. We reveal that disordered regions are often targets for various post-translational modifications, but primarily in the eukaryotes and viruses. Using a phylogenetic tree for 14 eukaryotic and 112 bacterial species, we analyzed relations between disorder, sequence conservation and evolutionary speed. We provide a complete analysis that clearly shows that intrinsic disorder is exceptionally and uniquely abundant in each domain of life.
Keywords Intrinsic disorder · Intrinsically disordered proteins · Intrinsically disordered regions · Cellular localization · Post-translational modifications · Evolutionary speed
Electronic supplementary material The online version of this article (doi:10.1007/s00018-014-1661-9) contains supplementary material, which is available to authorized users.
Introduction It is now recognized that in addition to globular, transmembrane and fibrillar proteins that are known to be characterized by unique three dimensional (3D)-structure, there is another tribe of proteins, which, being biologically functional, do not have unique 3D-structures in their native states under the physiologic conditions in vitro and in vivo [1][2][3][4][5]. The members of this novel tribe are known as intrinsically disordered proteins (IDPs). Their structures are defined as highly dynamic ensembles of flexible conformations, where sampling of a large portion of a polypeptide's available conformational space is allowed. Although IDPs and intrinsically disordered regions (IDRs) in proteins are devoid of stable 3D-structures, they possess crucial biological functions and play multiple important roles in living organisms. In fact, the conformational plasticity associated with intrins...
The availability of computational methods that predict disorder from protein sequences fuels rapid advancements in the protein disorder field. The most accurate predictions are usually obtained with consensus-based approaches. However, their design is performed in an ad hoc manner. We perform a first-of-its-kind rational design in which we empirically search for an optimal mixture of base methods, selected out of a comprehensive set of 20 modern predictors, and we explore several novel ways to build the consensus. Our method for the prediction of disorder based on Consensus of Predictors (disCoP) combines seven base methods, utilizes a custom-designed set of 11 selected features that aggregate base predictions over a sequence window, and uses binomial deviance loss-based regression to implement the consensus. Empirical tests performed on an independent benchmark set (with low sequence similarity compared with proteins used to design disCoP) show that disCoP provides statistically significant improvements with at least moderate magnitude of differences. disCoP outperforms 28 predictors, including other state-of-the-art consensuses, and achieves an Area Under the ROC Curve of 0.85 and a Matthews Correlation Coefficient of 0.5, compared with 0.83 and 0.48, respectively, for the best considered approach. Our consensus provides a high rate of correct disorder predictions, especially when a low rate of incorrect disorder predictions is desired. We are the first to comprehensively assess predictions in the context of several functional types of disorder, and we demonstrate that disCoP generates accurate predictions of disorder located at the post-translational modification sites (in particular phosphorylation sites) and in autoregulatory and flexible linker regions. disCoP is available at http://biomine.ece.ualberta.ca/disCoP/.
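The consensus idea in this abstract can be illustrated with a minimal sketch. It is not the disCoP implementation: the two toy base predictors, the window width, the use of plain windowed means as features, and the gradient-descent settings are all assumptions made for illustration. The sketch averages each base predictor's per-residue scores over a sequence window, fits a logistic model by minimizing the binomial deviance (logistic loss), and computes the Matthews Correlation Coefficient for binary predictions.

```python
import math

def window_mean(scores, half_width):
    """Average per-residue scores over a sliding window centred on each residue."""
    n = len(scores)
    out = []
    for i in range(n):
        lo, hi = max(0, i - half_width), min(n, i + half_width + 1)
        out.append(sum(scores[lo:hi]) / (hi - lo))
    return out

def build_features(base_predictions, half_width=1):
    """One feature vector per residue: the windowed mean of each base predictor."""
    smoothed = [window_mean(p, half_width) for p in base_predictions]
    return [list(column) for column in zip(*smoothed)]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(X, w, b):
    """Consensus disorder propensity for each residue."""
    return [sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) for xi in X]

def mean_deviance(X, y, w, b):
    """Mean binomial deviance (logistic loss) of the model on labelled residues."""
    probs = predict(X, w, b)
    return -sum(yi * math.log(p) + (1 - yi) * math.log(1 - p)
                for yi, p in zip(y, probs)) / len(y)

def fit_logistic(X, y, lr=0.5, epochs=2000):
    """Minimise the mean binomial deviance by batch gradient descent."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        errors = [p - yi for p, yi in zip(predict(X, w, b), y)]
        b -= lr * sum(errors) / n
        w = [wj - lr * sum(e * xi[j] for e, xi in zip(errors, X)) / n
             for j, wj in enumerate(w)]
    return w, b

def mcc(y_true, y_pred):
    """Matthews Correlation Coefficient for binary labels (0 = ordered, 1 = disordered)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

# Toy data (made up for illustration): two base predictors scoring ten residues.
BASE_1 = [0.1, 0.2, 0.1, 0.3, 0.8, 0.9, 0.7, 0.9, 0.2, 0.1]
BASE_2 = [0.2, 0.1, 0.3, 0.4, 0.7, 0.8, 0.9, 0.6, 0.3, 0.2]
LABELS = [0, 0, 0, 0, 1, 1, 1, 1, 0, 0]

if __name__ == "__main__":
    X = build_features([BASE_1, BASE_2], half_width=1)
    w, b = fit_logistic(X, LABELS)
    calls = [1 if p >= 0.5 else 0 for p in predict(X, w, b)]
    print("MCC on the toy training data:", mcc(LABELS, calls))
```

A real consensus would be trained and evaluated on separate, low-similarity protein sets, as the abstract describes; scoring on the training residues here only keeps the sketch short.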
Summary KIF1A-associated neurological disorder (KAND) encompasses a group of rare neurodegenerative conditions caused by variants in KIF1A, a gene that encodes an anterograde neuronal microtubule (MT) motor protein. Here we characterize the natural history of KAND in 117 individuals using a combination of caregiver or self-reported medical history, a standardized measure of adaptive behavior, clinical records, and neuropathology. We developed a heuristic severity score using a weighted sum of common symptoms to assess disease severity. Focusing on 100 individuals, we compared the average clinical severity score for each variant with in silico predictions of deleteriousness and location in the protein. We found that increased severity is strongly associated with variants occurring in protein regions involved with ATP and MT binding: the P loop, switch I, and switch II. For a subset of variants, we generated recombinant proteins, which we used to assess transport in vivo by measuring neurite tip accumulation and to assess MT binding, motor velocity, and processivity using total internal reflection fluorescence microscopy. We find that all modeled variants result in defects in protein transport, and we describe three classes of protein dysfunction: reduced MT binding, reduced velocity and processivity, and increased non-motile rigor MT binding. The rigor phenotype is consistently associated with the most severe clinical phenotype, while reduced MT binding is associated with milder clinical phenotypes. Our findings suggest the clinical phenotypic heterogeneity in KAND likely reflects and parallels diverse molecular phenotypes. We propose a different way to describe KAND subtypes to better capture the breadth of disease severity.
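The severity score described above is a weighted sum of symptoms, averaged per variant. A minimal sketch of that arithmetic follows. The abstract does not list the symptoms or their weights, so the symptom names, weights, and variant labels below are hypothetical placeholders, not the study's values.

```python
from collections import defaultdict

# Hypothetical weights: placeholders only, not taken from the study.
WEIGHTS = {"symptom_a": 2.0, "symptom_b": 1.5, "symptom_c": 1.0}

def severity_score(symptoms, weights=WEIGHTS):
    """Weighted sum over the symptoms recorded as present for one individual."""
    return sum(w for name, w in weights.items() if symptoms.get(name, False))

def mean_score_by_variant(records, weights=WEIGHTS):
    """Average the individual severity scores for each variant.

    `records` is an iterable of (variant_label, symptoms_dict) pairs.
    """
    scores = defaultdict(list)
    for variant, symptoms in records:
        scores[variant].append(severity_score(symptoms, weights))
    return {variant: sum(vals) / len(vals) for variant, vals in scores.items()}

if __name__ == "__main__":
    cohort = [
        ("variant_A", {"symptom_a": True}),
        ("variant_A", {"symptom_a": True, "symptom_b": True}),
        ("variant_B", {}),
    ]
    print(mean_score_by_variant(cohort))
```

The per-variant averages are the quantity the abstract compares against in silico deleteriousness predictions and the variant's location in the protein.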
Many viral proteins or their biologically important regions are disordered as a whole, or contain long disordered regions. These intrinsically disordered proteins/regions do not possess unique structures, yet they carry out functions that complement the functional repertoire of "normal" ordered proteins and domains, with many protein functional classes being heavily dependent on intrinsic disorder. Viruses commonly use these highly flexible regions to invade the host organisms and to hijack various host systems. These disordered regions also help viruses to adapt to their hostile habitats and to make economical use of their genetic material. In this article, we focus on the structural peculiarities of proteins from human hepatitis C virus (HCV) and use a wide spectrum of bioinformatics techniques to evaluate the abundance of intrinsic disorder in the complete proteomes of several human HCV genotypes, to analyze the peculiarities of disorder distribution within the individual HCV proteins, and to establish potential roles of the structural disorder in functions of ten HCV proteins. We show that intrinsic disorder or increased flexibility is not only abundant in these proteins, but is also absolutely necessary for their functions, playing a crucial role in the proteolytic processing of the HCV polyprotein and the maturation of the individual HCV proteins, and being related to the posttranslational modifications of these proteins and their interactions with DNA, RNA, and various host proteins.