The target-decoy approach (TDA) is the dominant strategy for false discovery rate (FDR) estimation in mass-spectrometry-based proteomics. One of its main applications is direct FDR estimation by counting decoy matches above a given score threshold. The corresponding equations are widely used to filter peptide or protein identifications. In this work we consider a probability model describing the filtering process and find that, when decoy counting is used for q-value estimation and subsequent filtering, a correction has to be introduced into these common equations for TDA-based FDR estimation. We also discuss the scale of the variance of the false discovery proportion (FDP) and propose using confidence intervals for more conservative FDP estimation in shotgun proteomics. The need for both the correction and the confidence intervals is especially pronounced when filtering small sets (such as in proteogenomics experiments) and when using very low FDR thresholds.
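The decoy-counting estimate described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not state the form of the correction, so the `decoy_correction` parameter (set to 1 for a "+1" in the decoy count, one commonly cited form) is an assumption, as are the function names and the `(score, is_decoy)` input layout. Tied scores are not treated specially.

```python
def tda_fdr(n_targets, n_decoys, decoy_correction=0):
    """Estimate FDR above a score threshold as (decoys + correction) / targets.

    decoy_correction=0 gives the common uncorrected estimate;
    decoy_correction=1 is an ASSUMED form of the correction.
    """
    numerator = n_decoys + decoy_correction
    if n_targets == 0:
        # No targets pass: the ratio is undefined, so report the worst case
        # whenever anything is counted in the numerator.
        return 1.0 if numerator > 0 else 0.0
    return min(1.0, numerator / n_targets)


def q_values(matches, decoy_correction=0):
    """Assign q-values to matches given as (score, is_decoy) pairs.

    Higher score means a better match. Returns (score, is_decoy, q_value)
    tuples sorted by descending score.
    """
    ranked = sorted(matches, key=lambda m: m[0], reverse=True)

    # FDR estimate when the threshold is placed just below each match.
    fdrs = []
    targets = decoys = 0
    for _, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        fdrs.append(tda_fdr(targets, decoys, decoy_correction))

    # q-value: the smallest FDR at this threshold or any more permissive one.
    q = []
    running_min = 1.0
    for fdr in reversed(fdrs):
        running_min = min(running_min, fdr)
        q.append(running_min)
    q.reverse()

    return [(score, is_decoy, qv) for (score, is_decoy), qv in zip(ranked, q)]
```

On a small set the two settings diverge sharply, which is consistent with the abstract's remark that the correction matters most for small sets and low thresholds.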
Pairing light and heavy chains in monoclonal antibodies (mAbs) using top-down (TD) or middle-down (MD) mass spectrometry (MS) may complement the sequence information on single chains provided by high-throughput genomic sequencing and bottom-up proteomics, favoring the rational selection of drug candidates. The 50 kDa F(ab) subunits of mAbs are the smallest structural units that contain the required information on chain pairing. These subunits can be enzymatically produced from whole mAbs and interrogated in their intact form by TD/MD MS approaches. However, the high structural complexity of F(ab) subunits requires increased sensitivity from modern TD/MD MS for a comprehensive structural analysis. To address this and similar challenges, we developed and applied a multiplexed TD/MD MS workflow based on spectral averaging of tandem mass spectra (MS/MS) across multiple liquid chromatography (LC)−MS/MS runs acquired in reduced or full profile mode using an Orbitrap Fourier transform mass spectrometer (FTMS). We first benchmark the workflow using myoglobin as a reference protein, and then validate it for the analysis of the 50 kDa F(ab) subunit of a therapeutic mAb, trastuzumab. The results confirm the envisioned benefit of using multiple LC−MS/MS runs with mass spectral averaging for TD/MD protein analysis: an increased signal-to-noise ratio of product ions. The workflow performance is compared with the previously introduced multiplexed TD/MD MS workflow based on transient averaging in Orbitrap FTMS. For the latter, we also report on enabling absorption mode FT processing and demonstrate that its performance is comparable to that of the enhanced FT (eFT) spectral representation.
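The signal-to-noise benefit of spectral averaging can be illustrated with a toy simulation. This sketch assumes spectra already aligned on a common m/z grid and independent Gaussian noise in every bin; both are simplifications, and the function names are hypothetical rather than part of the reported workflow.

```python
import random
import statistics


def simulate_spectrum(signal, noise_sd, rng):
    """Return one noisy copy of `signal` (toy model: additive Gaussian noise)."""
    return [s + rng.gauss(0.0, noise_sd) for s in signal]


def average_spectra(spectra):
    """Average intensities bin by bin across spectra on a shared m/z grid."""
    n = len(spectra)
    return [sum(column) / n for column in zip(*spectra)]


def snr(spectrum, peak_index):
    """Peak intensity divided by the standard deviation of all other bins."""
    baseline = [v for i, v in enumerate(spectrum) if i != peak_index]
    return spectrum[peak_index] / statistics.pstdev(baseline)


if __name__ == "__main__":
    rng = random.Random(0)
    signal = [0.0] * 200
    signal[100] = 5.0  # a single product-ion peak
    runs = [simulate_spectrum(signal, 1.0, rng) for _ in range(16)]
    print("single run SNR:", snr(runs[0], 100))
    print("16-run average SNR:", snr(average_spectra(runs), 100))
```

Under this noise model the averaged SNR grows roughly with the square root of the number of runs; real spectra may deviate from that.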
Mass spectrometry (MS)-based bottom-up proteomics (BUP) is currently the method of choice for large-scale identification and characterization of proteins present in complex samples, such as cell lysates, body fluids, or tissues. Technically, BUP relies on MS analysis of complex mixtures of small, <3 kDa, peptides resulting from whole proteome digestion. Because of the extremely high sample complexity, further developments of detection methods and sample preparation techniques are necessary. In recent years, a number of alternative approaches such as middle-down proteomics (MDP, addressing peptides up to 15 kDa) and top-down proteomics (TDP, addressing proteins exceeding 15 kDa) have been gaining particular interest. Here we report on a bioinformatics study of both common and less frequently employed digestion procedures for complex protein mixtures, specifically targeting the MDP approach. The aim of this study was to maximize the yield of protein structure information from MS data by optimizing peptide size distribution and sequence specificity. We classified peptides into four categories based on molecular weight: 0.6-3 kDa (classical BUP), 3-7 kDa (extended BUP), 7-15 kDa (MDP), and >15 kDa (TDP). Because of instrumentation-related considerations, we first advocate for the extended BUP approach as the potential near-future improvement of BUP. Therefore, we chose to optimize the number of unique peptides in the 3-7 kDa range while maximizing the number of represented proteins. The present study considers human, yeast, and bacterial proteomes. The results of the study can be further used for designing extended BUP or MDP experimental workflows.
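The four molecular-weight categories can be expressed as a small classifier. The abstract does not say which category the shared boundaries (3 and 7 kDa) belong to, so this sketch assumes each boundary value falls into the heavier category, except 15 kDa, which stays in MDP because TDP is defined as >15 kDa. Peptides below 0.6 kDa are outside the stated ranges and are returned as `None`. The function names are hypothetical.

```python
from collections import Counter


def classify_peptide(mass_kda):
    """Map a peptide mass in kDa to one of the four categories, or None."""
    if mass_kda < 0.6:
        return None  # below the ranges given in the abstract
    if mass_kda < 3:
        return "classical BUP"
    if mass_kda < 7:  # assumed: 3 kDa counts as extended BUP
        return "extended BUP"
    if mass_kda <= 15:  # assumed: 7 kDa counts as MDP; 15 kDa is not >15
        return "MDP"
    return "TDP"


def category_counts(masses_kda):
    """Count peptides per category for a list of masses in kDa."""
    return Counter(classify_peptide(m) for m in masses_kda)
```

A digestion procedure could then be scored, for example, by the number of unique peptides that `category_counts` places in the "extended BUP" bin, in line with the optimization target named in the abstract.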