Plants are indispensable for life on Earth and represent organisms of extreme biological diversity with unique molecular capabilities 1. Here, we present a quantitative atlas of the transcriptomes, proteomes and phosphoproteomes of 30 tissues of the model plant Arabidopsis thaliana. It provides initial answers to how many genes exist as proteins (>18,000), where they are expressed, in which approximate quantities (>6 orders of magnitude dynamic range) and to what extent they are phosphorylated (>43,000 sites). We present examples of how the data may be used, for instance, to discover proteins translated from short open reading frames, to uncover sequence motifs involved in protein expression regulation, and to identify tissue-specific protein complexes or phosphorylation-mediated signaling events, to name a few. Interactive access to this unique resource for the plant community is provided via ProteomicsDB and ATHENA, which include powerful bioinformatics tools to explore and characterize Arabidopsis proteins, their modifications and interplay.

Main

The plant model organism Arabidopsis thaliana (AT) has revolutionized our understanding of plant biology and influenced many other areas of the life sciences 1. Knowledge derived from Arabidopsis has also provided mechanistic understanding of important agronomic traits in crop species 2. The Arabidopsis genome was sequenced 20 years ago and hundreds of natural variants have since been analyzed at the genome and epigenome level 3,4. In contrast, the Arabidopsis proteome, as the main executor of most biological processes, is far less comprehensively characterized. To address this gap, we used state-of-the-art mass spectrometry and RNA sequencing (RNA-seq) to provide the first integrated proteomic, phosphoproteomic and transcriptomic atlas of Arabidopsis. Illustrated by selected examples, we show how this rich molecular resource can be used to explore the function of single proteins or entire pathways across multiple omics levels.

Multi-omics atlas of Arabidopsis

We generated an expression atlas covering, on average, 17,603 ± 1,317 transcripts, 14,430 ± 911 proteins and 14,689 ± 2,509 phosphorylation sites (p-sites) per tissue, using a reproducible biochemical and analytical approach (Fig. 1a,b; Extended Data Fig. 1a-c; Supplementary Data 1,2). In total, the protein expression data covers 18,210 of the 27,655 protein-coding genes (66%) annotated in Araport11 5. This is a substantial increase compared to the percentage of genes with protein-level evidence reported in UniProt (27%) 6 and more than double the number of proteins identified in an earlier tissue proteome analysis 7 (Fig. 1c, Extended Data Fig. 1d-f). In addition, we report tissue-resolved quantitative evidence for a total of 43,903 p-sites, making this study the most comprehensive single Arabidopsis phosphoproteome published to date (Fig. 1c). 47% of the expressed proteome was found to be phosphorylated in at least one instance, confirming earlier analyses of individual
Characterizing the human leukocyte antigen (HLA)-bound ligandome by mass spectrometry (MS) holds great promise for developing vaccines and drugs for immuno-oncology. Still, the identification of non-tryptic peptides presents substantial computational challenges. To address these, we synthesized and analyzed >300,000 peptides by multi-modal LC-MS/MS within the ProteomeTools project, representing HLA class I & II ligands and products of the proteases AspN and LysN. The resulting data enabled training of a single model using the deep learning framework Prosit, allowing the accurate prediction of fragment ion spectra for tryptic and non-tryptic peptides. Applying Prosit demonstrates that the identification of HLA peptides can be improved by up to 7-fold, that 87% of the proposed proteasomally spliced HLA peptides may be incorrect and that dozens of additional immunogenic neo-epitopes can be identified from patient tumors in published data. Together, the provided peptides, spectra and computational tools substantially expand the analytical depth of immunopeptidomics workflows.
Single-cell profiling methods have had a profound impact on the understanding of cellular heterogeneity. While genomes and transcriptomes can be explored at the single-cell level, single-cell profiling of proteomes is not yet established. Here we describe new single-molecule protein sequencing and identification technologies alongside innovations in mass spectrometry that will eventually enable broad sequence coverage in single-cell profiling. These technologies will in turn facilitate biological discovery and open new avenues for ultrasensitive disease diagnostics.