Motivation
The composition and density of immune cells in the tumor microenvironment (TME) profoundly influence tumor progression and the success of anti-cancer therapies. Flow cytometry, immunohistochemistry staining or single-cell sequencing are often unavailable, so we rely on computational methods to estimate the immune-cell composition from bulk RNA-sequencing (RNA-seq) data. Various methods have been proposed recently, yet their capabilities and limitations have not been evaluated systematically. A general guideline leading the research community through cell type deconvolution is missing.
Results
We developed a systematic approach for benchmarking such computational methods and assessed the accuracy of tools at estimating nine different immune and stromal cell types from bulk RNA-seq samples. We used a single-cell RNA-seq dataset of ∼11 000 cells from the TME to simulate bulk samples of known cell type proportions, and validated the results using independent, publicly available gold-standard estimates. This allowed us to analyze and condense the results of more than a hundred thousand predictions to provide an exhaustive evaluation across seven computational methods over nine cell types and ∼1800 samples from five simulated and real-world datasets. We demonstrate that computational deconvolution performs at high accuracy for well-defined cell-type signatures and propose how fuzzy cell-type signatures can be improved. We suggest that future efforts should be dedicated to refining cell population definitions and finding reliable signatures.
Availability and implementation
A snakemake pipeline to reproduce the benchmark is available at https://github.com/grst/immune_deconvolution_benchmark. An R package allows the community to perform integrated deconvolution using different methods (https://grst.github.io/immunedeconv).
Supplementary information
Supplementary data are available at Bioinformatics online.
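The benchmarking idea in the abstract above (simulate bulk samples with known cell type proportions, then check how well a method recovers them) can be illustrated with a small sketch. This is not the authors' pipeline or any of the seven benchmarked methods. It assumes a random, hypothetical signature matrix instead of real single-cell data, uses plain least squares with clipping as a stand-in deconvolution method, and uses Pearson correlation as one plausible accuracy measure. All names (`simulate_bulk`, `estimate_fractions`, the matrix sizes and noise level) are invented for illustration.

```python
import numpy as np

def simulate_bulk(signature, proportions):
    """Mix per-cell-type expression profiles (genes x cell types) by known proportions."""
    return signature @ proportions

def estimate_fractions(signature, bulk):
    """Stand-in deconvolution: least squares, clipped to >= 0 and renormalized to sum to 1."""
    coef, *_ = np.linalg.lstsq(signature, bulk, rcond=None)
    coef = np.clip(coef, 0.0, None)
    return coef / coef.sum()

rng = np.random.default_rng(0)
n_genes, n_types, n_samples = 200, 4, 50

# Hypothetical signature matrix; a real benchmark would derive profiles from single-cell data.
signature = rng.gamma(shape=2.0, scale=1.0, size=(n_genes, n_types))

# Known ("ground truth") cell type proportions for each simulated bulk sample.
true_props = rng.dirichlet(np.ones(n_types), size=n_samples)

estimates = np.empty_like(true_props)
for i, props in enumerate(true_props):
    bulk = simulate_bulk(signature, props)
    bulk = bulk + rng.normal(scale=0.05, size=n_genes)  # assumed measurement noise
    estimates[i] = estimate_fractions(signature, bulk)

# Per-cell-type Pearson correlation between known and estimated fractions.
correlations = [
    np.corrcoef(true_props[:, k], estimates[:, k])[0, 1] for k in range(n_types)
]
print([round(float(c), 3) for c in correlations])
```

Because the simulated cell types here have clearly distinct profiles, the correlations come out high. Making two columns of `signature` nearly identical is a simple way to see how an ill-defined signature degrades the estimates.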
Background
In the post-genomic era, the rapid increase in high-throughput data calls for computational tools capable of integrating data of diverse types and facilitating recognition of biologically meaningful patterns within them. For example, protein-protein interaction data sets have been clustered to identify stable complexes, but scientists lack easily accessible tools to facilitate combined analyses of multiple data sets from different types of experiments. Here we present clusterMaker, a Cytoscape plugin that implements several clustering algorithms and provides network, dendrogram, and heat map views of the results. The Cytoscape network is linked to all of the other views, so that a selection in one is immediately reflected in the others. clusterMaker is the first Cytoscape plugin to implement such a wide variety of clustering algorithms and visualizations, including the only implementations of hierarchical clustering, dendrogram plus heat map visualization (tree view), k-means, k-medoid, SCPS, AutoSOME, and native (Java) MCL.
Results
Results are presented in the form of three scenarios of use: analysis of protein expression data using a recently published mouse interactome and a mouse microarray data set of nearly one hundred diverse cell/tissue types; the identification of protein complexes in the yeast Saccharomyces cerevisiae; and the cluster analysis of the vicinal oxygen chelate (VOC) enzyme superfamily. For scenario one, we explore functionally enriched mouse interactomes specific to particular cellular phenotypes and apply fuzzy clustering. For scenario two, we explore the prefoldin complex in detail using both physical and genetic interaction clusters. For scenario three, we explore the possible annotation of a protein as a methylmalonyl-CoA epimerase within the VOC superfamily. Cytoscape session files for all three scenarios are provided in the Additional Files section.
Conclusions
The Cytoscape plugin clusterMaker provides a number of clustering algorithms and visualizations that can be used independently or in combination for analysis and visualization of biological data sets, and for confirming or generating hypotheses about biological function. Several of these visualizations and algorithms are only available to Cytoscape users through the clusterMaker plugin. clusterMaker is available via the Cytoscape plugin manager.
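Among the algorithms listed in the abstract above is MCL (Markov clustering), which is commonly applied to interaction networks. The following is a minimal NumPy sketch of the general MCL idea (alternating expansion and inflation of a column-stochastic matrix), not clusterMaker's Java implementation. The function name `markov_cluster`, its default parameters, and the toy graph are assumptions made for illustration.

```python
import numpy as np

def markov_cluster(adjacency, inflation=2.0, max_iter=100, tol=1e-8):
    """Minimal Markov clustering on a dense, symmetric adjacency matrix."""
    m = np.asarray(adjacency, dtype=float).copy()
    np.fill_diagonal(m, 1.0)               # add self-loops so no column is empty
    m /= m.sum(axis=0, keepdims=True)      # make columns sum to 1
    for _ in range(max_iter):
        previous = m
        m = m @ m                          # expansion: take two random-walk steps
        m = m ** inflation                 # inflation: strengthen strong flows
        m /= m.sum(axis=0, keepdims=True)  # renormalize columns
        if np.allclose(m, previous, atol=tol):
            break
    # Each row that still carries flow lists the nodes of one cluster.
    clusters = set()
    for row in m:
        members = np.flatnonzero(row > 1e-6)
        if members.size:
            clusters.add(tuple(members.tolist()))
    return sorted(clusters)

# Toy network (hypothetical): two triangles with no edges between them.
two_triangles = np.zeros((6, 6))
for a, b in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5)]:
    two_triangles[a, b] = two_triangles[b, a] = 1.0

clusters = markov_cluster(two_triangles)
print(clusters)
```

On this toy input the two triangles are returned as two separate clusters. The `inflation` parameter controls granularity: larger values tend to produce more, smaller clusters on connected graphs.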
Plants are indispensable for life on earth and represent organisms of extreme biological diversity with unique molecular capabilities [1]. Here, we present a quantitative atlas of the transcriptomes, proteomes and phosphoproteomes of 30 tissues of the model plant Arabidopsis thaliana. It provides initial answers to how many genes exist as proteins (>18,000), where they are expressed, in which approximate quantities (>6 orders of magnitude dynamic range) and to what extent they are phosphorylated (>43,000 sites). We present examples of how the data may be used, for instance, to discover proteins translated from short open reading frames, to uncover sequence motifs involved in protein expression regulation, or to identify tissue-specific protein complexes or phosphorylation-mediated signaling events, to name a few. Interactive access to this unique resource for the plant community is provided via ProteomicsDB and ATHENA, which include powerful bioinformatics tools to explore and characterize Arabidopsis proteins, their modifications and interplay.
Main
The plant model organism Arabidopsis thaliana (AT) has revolutionized our understanding of plant biology and influenced many other areas of the life sciences [1]. Knowledge derived from Arabidopsis has also provided mechanistic understanding of important agronomic traits in crop species [2]. The Arabidopsis genome was sequenced 20 years ago and hundreds of natural variants have since been analyzed at the genome and epigenome level [3,4]. In contrast, the Arabidopsis proteome, as the main executor of most biological processes, is far less comprehensively characterized. To address this gap, we used state-of-the-art mass spectrometry and RNA sequencing (RNA-seq) to provide the first integrated proteomic, phosphoproteomic and transcriptomic atlas of Arabidopsis. Illustrated by selected examples, we show how this rich molecular resource can be used to explore the function of single proteins or entire pathways across multiple omics levels.
Multi-omics atlas of Arabidopsis
We generated an expression atlas covering, on average, 17,603 ± 1,317 transcripts, 14,430 ± 911 proteins and 14,689 ± 2,509 phosphorylation sites (p-sites) per tissue, using a reproducible biochemical and analytical approach (Fig. 1a,b; Extended Data Fig. 1a-c; Supplementary Data 1,2). In total, the protein expression data covers 18,210 of the 27,655 protein-coding genes (66%) annotated in Araport11 [5]. This is a substantial increase compared to the percentage of genes with protein level evidence reported in UniProt (27%) [6] and more than double the number of proteins identified in an earlier tissue proteome analysis [7] (Fig. 1c, Extended Data Fig. 1d-f). In addition, we report tissue-resolved quantitative evidence for a total of 43,903 p-sites, making this study the most comprehensive single Arabidopsis phosphoproteome published to date (Fig. 1c). 47% of the expressed proteome was found to be phosphorylated in at least one instance, confirming earlier analyses of individual
Ion mobility spectrometry (IMS) is a widely used and well-known technique for separating ions in the gas phase based on differences in their mobilities under an electric field. All IMS instruments operate with an electric field that provides spatial separation, but some also operate with a drift gas flow, which additionally provides temporal separation. In this review we summarize the current IMS instrumentation. IMS techniques have received increased interest as new instrumentation has become available that can be coupled with mass spectrometry (MS). For each of the eight types of IMS instruments reviewed, we state whether it can be hyphenated with MS and whether it is commercially available. Finally, the six most consolidated of the described devices are compared. This review is followed by a companion review that details the hyphenated IMS techniques (mainly gas chromatography and mass spectrometry) and the factors that make the data from an IMS device change as a function of device parameters and sampling conditions. Together, these reviews give the reader an overview of the main characteristics and aspects of the IMS technique.