Small proteins of up to ∼50 amino acids are an abundant class of biomolecules across all domains of life. Yet, because of their small size, they are often missed in genome annotations and are difficult to identify and characterize using standard experimental approaches. Consequently, few small proteins are known, even in well-studied prokaryotic model organisms. Mass spectrometry (MS) has great potential for the discovery, validation, and functional characterization of small proteins. However, standard MS approaches are poorly suited to the identification of both known and novel small proteins because of limitations at each step of a typical proteomics workflow, i.e., sample preparation, protease digestion, liquid chromatography, MS data acquisition, and data analysis. Here, we outline the major MS-based workflows and bioinformatic pipelines used for small protein discovery and validation, with special emphasis on the adjustments required to improve detection and data quality for small proteins. We discuss both the unbiased detection of small proteins and the targeted analysis of small proteins of interest. Finally, we provide guidelines for prioritizing novel small proteins and an outlook on methods with particular potential to further improve the comprehensive discovery and characterization of small proteins.
IMPORTANCE
Small proteins of up to ∼50 amino acids play important physiological roles across all domains of life. Mass spectrometry is an ideal approach to detect and characterize small proteins, but many aspects of standard mass spectrometry workflows are biased against small proteins due to their size. Here, we highlight applications of mass spectrometry to study small proteins, emphasizing modifications to standard workflows to optimize the detection of small proteins.