The emerging diversity of single cell RNAseq datasets allows for the full transcriptional characterization of cell types across a wide variety of biological and clinical conditions. However, it is challenging to analyze them together, particularly when datasets are assayed with different technologies. Here, real biological differences are interspersed with technical differences. We present Harmony, an algorithm that projects cells into a shared embedding in which cells group by cell type rather than dataset-specific conditions. Harmony simultaneously accounts for multiple experimental and biological factors. In six analyses, we demonstrate the superior performance of Harmony to previously published algorithms. We show that Harmony requires dramatically fewer computational resources. It is the only currently available algorithm that makes the integration of ~10^6 cells feasible on a personal computer. We apply Harmony to PBMCs from datasets with large experimental differences, 5 studies of pancreatic islet cells, mouse embryogenesis datasets, and cross-modality spatial integration.
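The claim that cells should group by cell type rather than by dataset can be checked numerically. The sketch below is an illustrative stand-in, not the authors' metric: the abstract does not say how integration quality is scored, so the function names (`inverse_simpson`, `neighborhood_diversity`), the plain Euclidean k-nearest-neighbor search, and the toy coordinates are all assumptions. For each cell it computes the inverse Simpson index of dataset labels among its nearest neighbors, which is roughly the effective number of datasets represented locally. A value near 1 suggests the neighborhood comes from a single dataset; a value near the number of datasets suggests good mixing.

```python
import math
from collections import Counter


def inverse_simpson(labels):
    """Effective number of distinct labels: 1 / sum(p_i ** 2)."""
    counts = Counter(labels)
    n = len(labels)
    return 1.0 / sum((c / n) ** 2 for c in counts.values())


def neighborhood_diversity(embedding, labels, k):
    """Inverse Simpson index of labels among each cell's k nearest
    neighbors (Euclidean distance, the cell itself excluded).
    Brute-force search, suitable only for small illustrative inputs."""
    scores = []
    for i, xi in enumerate(embedding):
        dists = sorted(
            (math.dist(xi, xj), j)
            for j, xj in enumerate(embedding)
            if j != i
        )
        neighbor_labels = [labels[j] for _, j in dists[:k]]
        scores.append(inverse_simpson(neighbor_labels))
    return scores


# Hypothetical toy data: six cells from two datasets, "A" and "B".
datasets = ["A", "A", "A", "B", "B", "B"]

# Before integration (assumed): each dataset forms its own cluster.
before = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]

# After integration (assumed): each A cell sits next to a B cell.
after = [(0, 0), (5, 0), (10, 0), (0, 1), (5, 1), (10, 1)]

before_scores = neighborhood_diversity(before, datasets, k=2)
after_scores = neighborhood_diversity(after, datasets, k=2)

print(sum(before_scores) / len(before_scores))
print(sum(after_scores) / len(after_scores))
```

In this toy setup the mean score rises from the separated embedding to the mixed one. The same function applied to cell-type labels instead of dataset labels would be expected to stay low after a good integration, since neighborhoods should remain pure in cell type.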
The rapidly emerging diversity of single cell RNAseq datasets allows us to characterize the transcriptional behavior of cell types across a wide variety of biological and clinical conditions. With this comprehensive breadth comes a major analytical challenge. The same cell type across tissues, from different donors, or in different disease states, may appear to express different genes. A joint analysis of multiple datasets requires the integration of cells across diverse conditions. This is particularly challenging when datasets are assayed with different technologies in which real biological differences are interspersed with technical differences. We present Harmony, an algorithm that projects cells into a shared embedding in which cells group by cell type rather than dataset-specific conditions. Unlike available single-cell integration methods, Harmony can simultaneously account for multiple experimental and biological factors. We develop objective metrics to evaluate the quality of data integration. In four separate analyses, we demonstrate the superior performance of Harmony to four single-cell-specific integration algorithms. Moreover, we show that Harmony requires dramatically fewer computational resources. It is the only available algorithm that makes the integration of ∼10^6 cells feasible on a personal computer. We demonstrate that Harmony identifies both broad populations and fine-grained subpopulations of PBMCs from datasets with large experimental differences. In a meta-analysis of 14,746 cells from 5 studies of human pancreatic islet cells, Harmony accounts for variation among technologies and donors to successfully align several rare subpopulations. In the resulting integrated embedding, we identify a previously unidentified population of potentially dysfunctional alpha islet cells, enriched for genes active in the Endoplasmic Reticulum (ER) stress response. The abundance of these alpha cells correlates across donors with the proportion of dysfunctional beta cells also enriched in ER stress response genes. Harmony is a fast and flexible general purpose integration algorithm that enables the identification of shared fine-grained subpopulations across a variety of experimental and biological conditions.

Recent technological advances [1] have enabled unbiased single cell transcriptional profiling of thousands of cells in a single experiment. Projects such as the Human Cell Atlas [2] (HCA) and Accelerating Medicines Partnership [3, 4] exemplify the growing body of reference datasets of primary human tissues. While individual experiments contribute incrementally to our understanding of cell types, a comprehensive catalogue of healthy and diseased cells will require the integration of multiple datasets across donors, studies, and technological platforms. Moreover, in translational research, joint analyses across tissues and clinical conditions will be essential to identify disease expanded populations. However, meaningful biological variatio...
Summary: The identification of lymphocyte subsets with non-overlapping effector functions has been pivotal to the development of targeted therapies in immune-mediated inflammatory diseases (IMIDs) [1,2]. However, it remains unclear whether fibroblast subclasses with non-overlapping functions also exist and are responsible for the wide variety of tissue-driven processes observed in IMIDs, such as inflammation and damage [3–5]. Here we identify and describe the biology of distinct subsets of fibroblasts responsible for mediating either inflammation or tissue damage in arthritis. We show that deletion of FAPα+ fibroblasts suppressed both inflammation and bone erosions in murine models of resolving and persistent arthritis. Single cell transcriptional analysis identified two distinct fibroblast subsets within the FAPα+ population: FAPα+ THY1+ immune effector fibroblasts located in the synovial sub-lining, and FAPα+ THY1- destructive fibroblasts restricted to the synovial lining layer. When adoptively transferred into the joint, FAPα+ THY1- fibroblasts selectively mediated bone and cartilage damage with little effect on inflammation, whereas transfer of FAPα+ THY1+ fibroblasts resulted in a more severe and persistent inflammatory arthritis, with minimal effect on bone and cartilage. Our findings describing anatomically discrete, functionally distinct fibroblast subsets with non-overlapping functions have important implications for cell-based therapies aimed at modulating inflammation and tissue damage.