We present ProteoClade, a Python toolkit that performs taxa-specific peptide assignment, protein inference, and quantitation for multi-species proteomics experiments. ProteoClade scales to hundreds of millions of protein sequences, requires minimal computational resources, and is open source, multi-platform, and accessible to non-programmers. We demonstrate its utility for processing quantitative proteomic data derived from patient-derived xenografts, and its speed and scalability enable a novel de novo proteomic workflow for complex microbiota samples.
Main Text:
The goal of metaproteomic and multispecies proteomic studies is to characterize the proteomes of samples containing multiple, comingled species, which can provide insight into the complex interactions at the interface between organisms. Proteomic analysis of these samples can quantify thousands of proteins from hundreds of species in a single mass spectrometry experiment 1, characterize education of stromal tissue by patient-derived xenografts (PDXs) 2, and extensively characterize the human oral microbiome 3.
Metaproteomic and multispecies data analyses depend on the ability to integrate reference protein sequence databases, taxonomic lineages, in silico proteolytic digestion, peptide identification, and quantitation. These studies universally perform bottom-up analysis, where proteins are digested into peptides with a protease, and therefore require assignment of peptides to proteins based on their taxonomic specificity. Several software tools provide one or more of these features, but have practical and technical limitations that render them unable to support complete analysis pipelines for quantitative proteomics data or to scale to the rapidly increasing number of available reference protein sequences 4,5. With regard to annotating peptides to taxa, Unipept is a commonly used taxonomic annotation tool that can provide access to the entire UniProt sequence repository, provides web-based visualizations and a