http://www-bs.informatik.uni-tuebingen.de/Services/MultiLoc/
Motivation: Knowing the localization of a protein within the cell helps elucidate its role in biological processes, its function and its potential as a drug target. Thus, subcellular localization prediction is an active research area. Numerous localization prediction systems are described in the literature; some focus on specific localizations or organisms, while others attempt to cover a wide range of localizations. Results: We introduce SherLoc, a new comprehensive system for predicting the localization of eukaryotic proteins. It integrates several types of sequence and text-based features. While applying the widely used support vector machines (SVMs), SherLoc's main novelty lies in the way in which it selects its text sources and features, and integrates those with sequence-based features. We test SherLoc on previously used datasets, as well as on a new set devised specifically to test its predictive power, and show that SherLoc consistently improves on previously reported results. We also report the results of applying SherLoc to a large set of yet-unlocalized proteins.
Functional characterization of every single protein is a major challenge of the post-genomic era. The large-scale analysis of a cell's proteins, proteomics, seeks to provide these proteins with reliable annotations regarding their interaction partners and functions in the cellular machinery. An important step toward this goal is to determine the subcellular localization of each protein. Eukaryotic cells are divided into subcellular compartments, or organelles. Transport across the membrane into the organelles is a highly regulated and complex cellular process. Predicting subcellular localization by computational means has been an area of intense activity in recent years. The publicly available prediction methods differ mainly in four aspects, all of which matter to the user: the underlying biological motivation, the computational method used, localization coverage, and reliability. This review provides a short description of the main events in the protein sorting process and an overview of the most commonly used methods in this field.
Summary: Dual targeting of proteins to more than one subcellular localization has been found in animals, in fungi and in plants. In the latter, ambiguous N‐terminal targeting signals have been described that result in the protein being located in both mitochondria and plastids. We have developed ambiguous targeting predictor (ATP), a machine‐learning implementation that classifies such ambiguous targeting signals. Ambiguous targeting predictor is based on a support vector machine implementation that makes use of 12 different amino acid features. Prediction results were validated using fluorescent protein fusion. Both in silico and in vivo evaluations demonstrate that ambiguous targeting predictor is useful for predicting dual targeting to mitochondria and plastids. Proteins that are targeted to both organelles by tandemly arrayed signals (so‐called twin targeting) can be predicted by both ambiguous targeting predictor and a combination of single targeting prediction tools. Comparison of ambiguous targeting predictor with previous experimental approaches, as well as in silico approaches, shows good congruence. Based on the prediction results, land plant genomes are expected to encode, on average, > 400 proteins that are located in mitochondria and plastids. Ambiguous targeting predictor is helpful for functional genome annotation and can be used as a tool to further our understanding of dual protein targeting and its evolution.
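As a rough illustration of what "amino acid features" of an N‐terminal targeting signal might look like before they are passed to a classifier such as a support vector machine, the sketch below computes residue-group fractions over the first residues of a sequence. The abstract does not list the 12 features ATP uses, so the four groups, the 40-residue window, the function name `n_terminal_features` and the example sequence are all assumptions made for illustration, not the authors' method.

```python
from collections import Counter

# Hypothetical residue groups chosen for illustration only;
# the abstract does not name the 12 features ATP actually uses.
GROUPS = {
    "positive": set("KR"),
    "negative": set("DE"),
    "hydroxylated": set("ST"),
    "hydrophobic": set("AILMFVW"),
}


def n_terminal_features(sequence: str, window: int = 40) -> dict[str, float]:
    """Return the fraction of residues in each group within the first `window` residues."""
    region = sequence[:window].upper()
    if not region:
        raise ValueError("sequence must not be empty")
    counts = Counter(region)  # missing residues count as 0
    return {
        name: sum(counts[aa] for aa in residues) / len(region)
        for name, residues in GROUPS.items()
    }


# Toy sequence (not a real protein), window shortened to its full length.
features = n_terminal_features("MASRKSTLLA", window=10)
```

A feature vector of this kind, one per protein, is the sort of input a support vector machine could be trained on; which features and window length work well would have to be determined empirically.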