Machine learning methods have proven invaluable for increasing the sensitivity of peptide detection in proteomics experiments. Most modern tools, such as Percolator and PeptideProphet, use semi-supervised algorithms to learn models directly from the datasets that they analyze. Although these methods are effective for many proteomics experiments, we suspected that they may be suboptimal for experiments of smaller scale. In this work, we found that the power and consistency of Percolator results were reduced as the size of the experiment was decreased. As an alternative, we propose a different operating mode for Percolator: learn a model with Percolator from a large dataset and use the learned model to evaluate the small-scale experiment. We call this a "static modeling" approach, in contrast to Percolator's usual "dynamic model" that is trained anew for each dataset. We applied this static modeling approach to two settings: small, gel-based experiments and single-cell proteomics. In both cases, static models increased the yield of detected peptides and eliminated the model-induced variability of the standard dynamic approach. These results suggest that static models are a powerful tool for bringing the full benefits of Percolator and other semi-supervised algorithms to small-scale experiments.

1 Introduction

The assignment of peptide sequences to tandem mass spectra is a fundamental task in any proteomics experiment [1]. This task is most often performed by a database search algorithm, an approach first introduced with the SEQUEST search engine in 1994 [2]. Database search algorithms score the theoretical mass spectra of peptides from a selected sequence database against the acquired mass spectra from the experiment, yielding a set of peptide-spectrum matches (PSMs).
Although the score functions of individual search engines may differ greatly, the scores reported by each are intended to reflect the quality of PSMs.

Machine learning strategies to re-score PSMs were first introduced due to their ability to integrate multiple, orthogonal scores and features from database search engines, thereby improving the sensitivity of peptide detection. Two such methods were proposed simultaneously: one based on linear discriminant analysis [3] and another that used support vector machine (SVM) models to a similar end [4]. Critically, both methods were examples of supervised learning: they relied on fully labeled datasets, where the correct and incorrect PSMs could be determined a priori, to train their respective models. These trained models were then used to predict a new score for the PSMs of a new dataset. We refer to the models used by these methods as static, meaning their parameters do not change when the dataset is changed. Consequently, the success of these static models depended on the quality of the labeled dataset that was used for training and how well it reflected the datasets that were subsequently analyzed. The origina...