Post-translational modifications (PTMs) occur on almost all proteins analyzed to date. The function of a modified protein is often strongly affected by these modifications and therefore increased knowledge about the potential PTMs of a target protein may increase our understanding of the molecular processes in which it takes part. High-throughput methods for the identification of PTMs are being developed, in particular within the fields of proteomics and mass spectrometry. However, these methods are still in their early stages, and it is indeed advantageous to cut down on the number of experimental steps by integrating computational approaches into the validation procedures. Many advanced methods for the prediction of PTMs exist and many are made publicly available. We describe our experiences with the development of prediction methods for phosphorylation and glycosylation sites and the development of PTM-specific databases. In addition, we discuss novel ideas for PTM visualization (exemplified by kinase landscapes) and improvements for prediction specificity (by using ESS--evolutionary stable sites). As an example, we present a new method for kinase-specific prediction of phosphorylation sites, NetPhosK, which extends our earlier and more general tool, NetPhos. The new server, NetPhosK, is made publicly available at the URL http://www.cbs.dtu.dk/services/NetPhosK/. The issues of underestimation, over-prediction and strategies for improving prediction specificity are also discussed.
Glycosylation is the most abundant and diverse posttranslational modification of proteins. While several types of glycosylation can be predicted by the protein sequence context, and substantial knowledge of these glycoproteomes is available, our knowledge of the GalNAc-type O-glycosylation is highly limited. This type of glycosylation is unique in being regulated by 20 polypeptide GalNActransferases attaching the initiating GalNAc monosaccharides to Ser and Thr (and likely some Tyr) residues. We have developed a genetic engineering approach using human cell lines to simplify O-glycosylation (SimpleCells) that enables proteome-wide discovery of O-glycan sites using 'bottom-up' ETD-based mass spectrometric analysis. We implemented this on 12 human cell lines from different organs, and present a first map of the human O-glycoproteome with almost 3000 glycosites in over 600 O-glycoproteins as well as an improved NetOGlyc4.0 model for prediction of O-glycosylation. The finding of unique subsets of O-glycoproteins in each cell line provides evidence that the O-glycoproteome is differentially regulated and dynamic. The greatly expanded view of the O-glycoproteome should facilitate the exploration of how site-specific O-glycosylation regulates protein function.
We report here the genome sequence of an ancient human. Obtained from ∼4,000-year-old permafrost-preserved hair, the genome represents a male individual from the first known culture to settle in Greenland. Sequenced to an average depth of 20×, we recover 79% of the diploid genome, an amount close to the practical limit of current sequencing technologies. We identify 353,151 high-confidence single-nucleotide polymorphisms (SNPs), of which 6.8% have not been reported previously. We estimate raw read contamination to be no higher than 0.8%. We use functional SNP assessment to assign possible phenotypic characteristics of the individual that belonged to a culture whose location has yielded only trace human remains. We compare the high-confidence SNPs to those of contemporary populations to find the populations most closely related to the individual. This provides evidence for a migration from Siberia into the New World some 5,500 years ago, independent of that giving rise to the modern Native Americans and Inuit.
We present a thorough analysis of nuclear export signals and a prediction server, which we have made publicly available. The machine learning prediction method is a significant improvement over the generally used consensus patterns. Nuclear export signals (NESs) are extremely important regulators of the subcellular location of proteins. This regulation has an impact on transcription and other nuclear processes, which are fundamental to the viability of the cell. NESs are studied in relation to cancer, the cell cycle, cell differentiation and other important aspects of molecular biology. Our conclusion from this analysis is that the most important properties of NESs are accessibility and flexibility allowing relevant proteins to interact with the signal. Furthermore, we show that not only the known hydrophobic residues are important in defining a nuclear export signals. We employ both neural networks and hidden Markov models in the prediction algorithm and verify the method on the most recently discovered NESs. The NES predictor (NetNES) is made available for general use at http://www.cbs.dtu.dk/.
O-GalNAc-glycosylation is one of the main types of glycosylation in mammalian cells. No consensus recognition sequence for the O-glycosyltransferases is known, making prediction methods necessary to bridge the gap between the large number of known protein sequences and the small number of proteins experimentally investigated with regard to glycosylation status. From O-GLYCBASE a total of 86 mammalian proteins experimentally investigated for in vivo O-GalNAc sites were extracted. Mammalian protein homolog comparisons showed that a glycosylated serine or threonine is less likely to be precisely conserved than a nonglycosylated one. The Protein Data Bank was analyzed for structural information, and 12 glycosylated structures were obtained. All positive sites were found in coil or turn regions. A method for predicting the location for mucin-type glycosylation sites was trained using a neural network approach. The best overall network used as input amino acid composition, averaged surface accessibility predictions together with substitution matrix profile encoding of the sequence. To improve prediction on isolated (single) sites, networks were trained on isolated sites only. The final method combines predictions from the best overall network and the best isolated site network; this prediction method correctly predicted 76% of the glycosylated residues and 93% of the nonglycosylated residues. NetOGlyc 3.1 can predict sites for completely new proteins without losing its performance. The fact that the sites could be predicted from averaged properties together with the fact that glycosylation sites are not precisely conserved indicates that mucin-type glycosylation in most cases is a bulk property and not a very site-specific one. NetOGlyc 3.1 is made available at www.cbs.dtu.dk/services/netoglyc.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.