Motivation
Post-processing tools that maximize the information gained from a proteomics search engine are widely accepted and used by the community. The most notable example is Percolator, a semi-supervised machine learning model that learns a new scoring function for a given dataset. The usefulness of such tools is, however, bound to the search engine's scoring scheme, which does not always make full use of the intensity information present in a spectrum. We aim to show how Percolator can be applied in a way that maximizes the use of spectrum intensity information by leveraging another machine learning-based tool, MS2PIP, which predicts fragment ion peak intensities.
Results
We show how comparing predicted intensities to annotated experimental spectra by calculating direct similarity metrics provides enough information for a tool such as Percolator to accurately separate two classes of peptide-to-spectrum matches. This approach extracts more information from the data than simpler intensity-based metrics, such as peak counting or summing explained intensities, while maintaining control of statistics such as the false discovery rate.
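As a rough illustration of what "direct similarity metrics" between predicted and observed intensities could look like, the sketch below computes a few per-match features from two intensity lists. It assumes the intensities have already been matched ion by ion (same fragment ions, same order) and put on a comparable scale. The function names and the choice of features are hypothetical and are not taken from the MS2Rescore code base.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def cosine(x, y):
    """Cosine similarity of two equal-length sequences (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0
    return dot / (norm_x * norm_y)


def similarity_features(predicted, observed):
    """Build an illustrative feature dictionary for one peptide-to-spectrum match.

    `predicted` and `observed` are assumed to hold intensities for the same
    fragment ions in the same order (hypothetical input layout).
    """
    if not predicted or len(predicted) != len(observed):
        raise ValueError("intensity lists must be non-empty and of equal length")
    abs_diff = [abs(p - o) for p, o in zip(predicted, observed)]
    return {
        "pearson": pearson(predicted, observed),
        "cosine": cosine(predicted, observed),
        "mean_abs_diff": sum(abs_diff) / len(abs_diff),
        "max_abs_diff": max(abs_diff),
    }
```

A feature dictionary of this kind, computed for every match, is the sort of input a semi-supervised rescoring tool could consume alongside the search engine's own scores.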
Availability and implementation
All of the code is available online at https://github.com/compomics/ms2rescore.
Supplementary information
Supplementary data are available at Bioinformatics online.
Rising population density and global mobility are among the reasons why pathogens such as SARS-CoV-2, the virus that causes COVID-19, spread so rapidly across the globe. The policy response to such pandemics will always have to include accurate monitoring of the spread, as this provides one of the few alternatives to total lockdown. However, COVID-19 diagnosis is currently performed almost exclusively by reverse transcription polymerase chain reaction (RT-PCR). Although this is efficient, automatable, and acceptably cheap, reliance on one type of technology comes with serious caveats, as illustrated by recurring reagent and test shortages. We therefore developed an alternative diagnostic test that detects proteolytically digested SARS-CoV-2 proteins using mass spectrometry (MS). We established the Cov-MS consortium, consisting of 15 academic laboratories and several industrial partners, to increase the applicability, accessibility, sensitivity, and robustness of this kind of SARS-CoV-2 detection. This, in turn, gave rise to the Cov-MS Digital Incubator, which allows other laboratories to join the effort, navigate and share their optimizations, and translate the assay into their clinic. As this test relies on viral proteins instead of RNA, it provides an orthogonal and complementary approach to RT-PCR using other reagents that are relatively inexpensive and widely available, as well as orthogonally skilled personnel and different instruments. Data are available via ProteomeXchange with identifier PXD022550.