With great biological interest in post-translational modifications (PTMs), various approaches have been introduced to identify PTMs using MS/MS. Recent developments for PTM identification have focused on an unrestrictive approach that searches MS/MS spectra for all known and possibly even unknown types of PTMs at once. However, the resulting expanded search space requires much longer search time and also increases the number of false positives (incorrect identifications) and false negatives (missed true identifications), thus creating a bottleneck in high-throughput analysis. Here we introduce MODa, a novel "multi-blind" spectral alignment algorithm that allows for fast unrestrictive PTM searches with no limitation on the number of modifications per peptide while featuring over an order of magnitude speedup in relation to existing approaches. We demonstrate the sensitivity of MODa on human shotgun proteomics data where it reveals multiple mutations, a wide range of modifications (including glycosylation), and evidence for several putative novel modifications. Based on the reported findings, we argue that the efficiency and sensitivity of MODa make it the first unrestrictive search tool with the potential to fully replace conventional restrictive identification of proteomics mass spectrometry data.
Molecular & Cellular Proteomics 11: 10.1074/mcp.M111.010199, 1-13, 2012.
Post-translational modifications (PTMs) regulate protein function, localization, and interactions inside a cell (1). Hundreds of PTM types are known so far, and many more may remain to be discovered (2, 3). The identification of PTMs is critical to gaining insight into biological functions but remains a formidable challenge. Tandem mass spectrometry (MS/MS) has emerged as a powerful tool for rapid identification of PTMs (4, 5), which can be detected by PTM-related diagnostic mass shifts of fragment ions in MS/MS spectra.
However, accurate computational identification of modified peptides remains a difficult problem, often addressed with restrictive approaches that require "guessed" lists of possible PTMs to be provided in advance (6-8). Such an approach may overlook potentially important PTMs if they are not guessed in advance. In recent PTM identification algorithms, peptide sequence tag approaches have been proposed to search for more types of PTMs and to speed up the search (9-12). A small set of short sequence tags (2-4 amino acids long) is derived from an MS/MS spectrum and used to screen for matching peptides in a protein database; possible modifications are then inferred from the difference between the precursor ion mass of the experimental spectrum and the theoretically calculated mass of the matched peptide. In contrast with restrictive approaches, unrestrictive or blind approaches search MS/MS spectra for all known and even possibly unknown types of PTMs at once and derive the list of modifications directly from MS/MS data (13-22). OpenSea (13) and SPIDER (14) compared de novo sequencing results with peptide sequences from a protei...
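The tag-based inference described above (screen database peptides with a short sequence tag, then read the modification mass off the precursor mass difference) can be illustrated with a minimal Python sketch. It is not the implementation of any cited tool: the function names, the toy peptide list, and the use of neutral monoisotopic masses are assumptions made for illustration.

```python
# Monoisotopic residue masses (Da) for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # one H2O is added for the peptide termini


def peptide_mass(sequence):
    """Theoretical neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER


def tag_candidates(tag, observed_mass, peptides):
    """Hypothetical helper: keep peptides containing the sequence tag and
    report the precursor mass difference as the putative modification mass."""
    return [
        (pep, observed_mass - peptide_mass(pep))
        for pep in peptides
        if tag in pep
    ]


# Toy usage: a spectrum whose precursor is about 79.966 Da heavier than
# the unmodified peptide (a shift consistent with phosphorylation).
observed = peptide_mass("PEPTIDE") + 79.96633
hits = tag_candidates("PTI", observed, ["PEPTIDE", "GASPVK"])
```

Only the peptide containing the tag survives the screen, and its mass difference is the candidate modification mass; a real search would also localize the shift using fragment ions.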
Redox-active cysteine, a highly reactive sulfhydryl, is one of the major targets of ROS. Formation of disulfide bonds and other oxidative derivatives of cysteine, including sulfenic, sulfinic, and sulfonic acids, regulates the biological function of various proteins. We identified novel low-abundance cysteine modifications in cellular GAPDH purified on 2-dimensional gel electrophoresis (2D-PAGE) by employing selectively excluded mass screening analysis for nano ultraperformance liquid chromatography-electrospray-quadrupole-time of flight tandem mass spectrometry, in conjunction with the MODi and MODmap algorithms. We observed unexpected mass shifts (Δm = -16, -34, +64, +87, and +103 Da) at redox-active cysteine residues in cellular GAPDH purified on 2D-PAGE, in oxidized NDP kinase A, peroxiredoxin 6, and in various mitochondrial proteins. Mass differences of -16, -34, and +64 Da are presumed to reflect the conversion of cysteine to serine, dehydroalanine (DHA), and Cys-SO2-SH, respectively. To determine the plausible pathways to the formation of these products, we prepared model compounds and examined the hydrolysis and hydration of thiosulfonate (Cys-S-SO2-Cys) either to DHA (Δm = -34 Da) or serine along with Cys-SO2-SH (Δm = +64 Da). We also detected acrylamide adducts of sulfenic and sulfinic acids (+87 and +103 Da). These findings suggest that oxidations take place at redox-active cysteine residues in cellular proteins, with the formation of thiosulfonate, Cys-SO2-SH, and DHA, and conversion of cysteine to serine, in addition to sulfenic, sulfinic, and sulfonic acids of reactive cysteine.
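As a quick arithmetic check on the reported shifts, the sketch below computes monoisotopic mass differences relative to an unmodified cysteine residue. The elemental compositions are assumptions chosen to match the products named in the abstract (for example, the acrylamide adducts are modeled as cysteine plus one or two oxygens plus C3H5NO); they are not taken from the paper itself.

```python
# Monoisotopic atomic masses (Da).
MONO = {"C": 12.0, "H": 1.00782503, "N": 14.00307401,
        "O": 15.99491462, "S": 31.97207069}


def mono_mass(composition):
    """Monoisotopic mass of an elemental composition given as {element: count}."""
    return sum(MONO[element] * count for element, count in composition.items())


# Residue compositions (assumed for illustration).
CYSTEINE = {"C": 3, "H": 5, "N": 1, "O": 1, "S": 1}
PRODUCTS = {
    "serine": {"C": 3, "H": 5, "N": 1, "O": 2},
    "dehydroalanine": {"C": 3, "H": 3, "N": 1, "O": 1},
    "Cys-SO2-SH": {"C": 3, "H": 5, "N": 1, "O": 3, "S": 2},
    "sulfenic acid + acrylamide": {"C": 6, "H": 10, "N": 2, "O": 3, "S": 1},
    "sulfinic acid + acrylamide": {"C": 6, "H": 10, "N": 2, "O": 4, "S": 1},
}


def cysteine_mass_shifts():
    """Mass shift (Da) of each product relative to the cysteine residue."""
    base = mono_mass(CYSTEINE)
    return {name: mono_mass(comp) - base for name, comp in PRODUCTS.items()}


shifts = cysteine_mass_shifts()
nominal = {name: round(delta) for name, delta in shifts.items()}
```

Under these assumed compositions, the rounded shifts reproduce the nominal values listed above (-16, -34, +64, +87, +103 Da).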
Cancer is driven by the acquisition of somatic DNA lesions. Distinguishing the early driver mutations from subsequent passenger mutations is key to molecular sub-typing of cancers, understanding cancer progression, and the discovery of novel biomarkers. Advances in genomics technologies (whole-genome, exome, and transcript sequencing, collectively referred to as next-generation sequencing (NGS)) have fueled recent studies on somatic mutation discovery. However, the vision is challenged by the complexity, redundancy, and errors in genomic data, and by the difficulty of investigating the proteome-translated portion of aberrant genes using only genomic approaches. Combinations of proteomic and genomic technologies are increasingly being employed. Various strategies have been employed to allow the usage of large-scale NGS data for conventional MS/MS searches. This paper discusses different strategies for large database search and false discovery rate (FDR) based error control, and their implications for cancer proteogenomics. Moreover, it extends and develops the idea of a unified genomic variant database that can be searched by any mass spectrometry sample. A total of 879 BAM files downloaded from the TCGA repository were used to create a 4.34 GB unified FASTA database which contained 2,787,062 novel splice junctions, 38,464 deletions, 1,105 insertions, and 182,302 substitutions. Proteomic data from a single ovarian carcinoma sample (439,858 spectra) were searched against the database. By applying the most conservative FDR measure, we identified 524 novel peptides and 65,578 known peptides at a 1% FDR threshold. The novel peptides include interesting examples of doubly mutated peptides, frame-shifts, and non-sample-recruited mutations, which emphasize the strength of our approach.
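The abstract does not state how FDR was estimated. As an illustration of FDR-based error control in general, the following Python sketch assumes the common target-decoy estimate (decoy hits divided by target hits above a score cutoff); the function name and the toy scores are hypothetical.

```python
def score_cutoff_at_fdr(psms, max_fdr=0.01):
    """Return the lowest score cutoff whose estimated FDR stays within max_fdr.

    psms: iterable of (score, is_decoy) pairs, one per peptide-spectrum match.
    The FDR at a cutoff is estimated as decoys / targets among all matches
    scoring at or above that cutoff (target-decoy assumption).
    Returns None if no cutoff satisfies the threshold.
    """
    ranked = sorted(psms, key=lambda psm: psm[0], reverse=True)
    targets = decoys = 0
    cutoff = None
    for score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        if targets and decoys / targets <= max_fdr:
            cutoff = score  # this deeper cutoff still satisfies the threshold
    return cutoff


# Toy usage: one decoy among four matches.
toy_psms = [(5.0, False), (4.0, False), (3.0, True), (2.0, False)]
strict_cutoff = score_cutoff_at_fdr(toy_psms, max_fdr=0.1)
loose_cutoff = score_cutoff_at_fdr(toy_psms, max_fdr=0.4)
```

With very large variant databases, novel peptides are often assessed separately from known peptides because the two classes can have different error rates; the same function could be applied to each class on its own.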
Identifying the sites of disulfide bonds in a protein is essential for a thorough understanding of the protein's tertiary and quaternary structures and its biological functions. Disulfide linked peptides are usually identified indirectly by labeling free sulfhydryl groups with alkylating agents, followed by chemical reduction and mass spectral comparison, or by detecting the expected masses of disulfide linked peptides at the mass scan level. However, these approaches for determination of disulfide bonds become ambiguous when the protein is highly bridged and modified. For accurate identification of disulfide linked peptides, we present here an algorithmic solution for the analysis of tandem mass (MS/MS) spectra of disulfide bonded peptides under nonreducing conditions. A new algorithm called "DBond" analyzes disulfide linked peptides based on specific features of disulfide bonds. To determine disulfide linked sites, DBond takes into account fragmentation patterns of disulfide linked peptides in nucleoside diphosphate kinase (NDPK) as a model protein, considering fragment ions including cysteine, cysteine thioaldehyde (-2 Da, C(T)), cysteine persulfide (+32 Da, C(S)) and dehydroalanine (-34 Da, C(Delta)). Using this algorithm, we successfully identified about a dozen novel disulfide bonds in the hexa EF-hand calcium-binding protein secretagogin and in a methionine sulfoxide reductase. We believe that DBond, taking into account the disulfide bond fragmentation characteristics and post-translational modifications, offers a novel approach for automatic identification of unknown disulfide bonds and their sites in proteins from MS/MS spectra.
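To make the fragment variants concrete, the sketch below enumerates the expected masses of a cysteine-containing fragment under the four forms listed above and matches them against observed peaks. It uses the nominal shifts quoted in the abstract; the function names, the tolerance, and the toy peak list are assumptions and do not reflect how DBond itself is implemented.

```python
# Nominal mass shifts (Da) of cysteine forms after disulfide bond cleavage,
# labeled as in the abstract.
CYS_VARIANT_SHIFTS = {
    "C": 0.0,          # intact cysteine
    "C(T)": -2.0,      # cysteine thioaldehyde
    "C(S)": 32.0,      # cysteine persulfide
    "C(Delta)": -34.0,  # dehydroalanine
}


def cysteine_fragment_variants(fragment_mass):
    """Expected masses of a cysteine-containing fragment for each cysteine form."""
    return {label: fragment_mass + shift
            for label, shift in CYS_VARIANT_SHIFTS.items()}


def match_variants(fragment_mass, observed_peaks, tolerance=0.5):
    """Hypothetical helper: return labels of variants with an observed peak
    within the given tolerance (Da) of the expected mass."""
    expected = cysteine_fragment_variants(fragment_mass)
    return sorted(
        label for label, mass in expected.items()
        if any(abs(peak - mass) <= tolerance for peak in observed_peaks)
    )


# Toy usage: peaks near the persulfide and dehydroalanine forms of a
# fragment whose unmodified mass is 500.0 Da.
matched = match_variants(500.0, [466.1, 531.9, 610.3])
```

A real scorer would work with accurate monoisotopic shifts and ion types rather than nominal masses, but the enumeration step has the same shape.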