Uniform Manifold Approximation and Projection (UMAP) is a recently published non-linear dimensionality reduction technique. Another such algorithm, t-SNE, has been the default method for such tasks in recent years. Herein we comment on the usefulness of UMAP for high-dimensional cytometry and single-cell RNA sequencing, notably highlighting its faster runtime and consistency, meaningful organization of cell clusters and preservation of continuums compared to t-SNE.

Introduction

The last decades have witnessed a large increase in the number of parameters analysed in single-cell cytometry studies. This number currently reaches around 20 for flow cytometry, 40 for mass cytometry, and more than 20,000 for single-cell RNA sequencing. In this context, dimensionality reduction techniques have been pivotal in enabling researchers to visualize high-dimensional data. While principal component analysis has historically been the main technique used for dimensionality reduction (DR), recent years have highlighted the importance of non-linear DR techniques to avoid overcrowding issues [3]. t-SNE is currently the most commonly used technique and is efficient at highlighting local structure in the data, which for cytometry notably translates to the representation of cell populations as distinct clusters. t-SNE, however, suffers from limitations such as loss of large-scale information (the inter-cluster relationships), slow computation time and inability to meaningfully represent very large datasets [4]. A new algorithm, called Uniform Manifold Approximation and Projection (UMAP), has recently been published by McInnes and Healy [5]. They claim that, compared to t-SNE, it preserves as much of the local structure and more of the global structure of the data, with a shorter runtime.
Since t-SNE has been extremely prevalent in the field of cytometry, broadly encompassing flow and mass cytometry as well as single-cell RNA sequencing (scRNAseq), we tested these claims on well-characterized single-cell datasets [6][7][8]. We confirm that UMAP is an order of magnitude faster than t-SNE. In addition to this straightforward advantage, we argue that UMAP is able not only to create informative clusters, but also to organize these clusters in a meaningful way. We illustrate these claims by showing that UMAP can order clusters of T and NK cells from 8 human organs [7] in a way that identifies both major cell lineages and a common axis that broadly recapitulates their differentiation stages. We also show that UMAP allows for an easier visualization of multibranched cellular trajectories by using a mass cytometry [6] and an scRNAseq [8] dataset, both recapitulating hematopoiesis.
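As an illustration of how the runtime claim could be checked, the following is a minimal sketch of a timing harness, not the benchmark used in this work. It assumes the scikit-learn and umap-learn Python packages are installed. The helper names (`time_embedding`, `compare_runtimes`, `build_reducers`, `main`), the random seed and the synthetic 2,000 x 40 matrix (a stand-in for a 40-parameter mass cytometry panel) are all hypothetical choices made for this example.

```python
import time


def time_embedding(reducer, data):
    """Fit one reducer and return (embedding, elapsed seconds)."""
    start = time.perf_counter()
    embedding = reducer.fit_transform(data)
    return embedding, time.perf_counter() - start


def compare_runtimes(reducers, data):
    """Return a dict mapping each reducer name to its wall-clock time."""
    return {name: time_embedding(reducer, data)[1]
            for name, reducer in reducers.items()}


def build_reducers(seed=0):
    """Build t-SNE and UMAP with mostly default settings (an assumption)."""
    # Imported here so the harness itself has no third-party dependency.
    from sklearn.manifold import TSNE
    import umap

    return {
        "t-SNE": TSNE(n_components=2, random_state=seed),
        "UMAP": umap.UMAP(n_components=2, random_state=seed),
    }


def main():
    import numpy as np

    # Synthetic placeholder data: 2,000 "cells" x 40 "markers".
    # Real cytometry data would be loaded and transformed here instead.
    rng = np.random.default_rng(0)
    data = rng.normal(size=(2000, 40))

    for name, seconds in compare_runtimes(build_reducers(), data).items():
        print(f"{name}: {seconds:.1f} s")
```

Calling `main()` prints one wall-clock time per method. Timings depend on hardware, dataset size and library versions, and the first UMAP call includes just-in-time compilation overhead, so a fair comparison would repeat the measurement on datasets of realistic size.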