Flow cytometric analysis allows rapid single cell interrogation of surface and intracellular determinants by measuring fluorescence intensity of fluorophore-conjugated reagents. The availability of new platforms, allowing detection of increasing numbers of cell surface markers, has challenged the traditional technique of identifying cell populations by manual gating and resulted in a growing need for the development of automated, high-dimensional analytical methods. We present a direct multivariate finite mixture modeling approach, using skew and heavy-tailed distributions, to address the complexities of flow cytometric analysis and to deal with high-dimensional cytometric data without the need for projection or transformation. We demonstrate its ability to detect rare populations, to model robustly in the presence of outliers and skew, and to perform the critical task of matching cell populations across samples that enables downstream analysis. This advance will facilitate the application of flow cytometry to new, complex biological and clinical problems.

finite mixture model | flow cytometry | multivariate skew distribution

Flow cytometry transformed clinical immunology and hematology over 2 decades ago by allowing the rapid interrogation of cell surface determinants and, more recently, by enabling the analysis of intracellular events using fluorophore-conjugated antibodies or markers. Although flow cytometry initially allowed the investigation of only a single fluorophore, recent advances allow close to 20 parallel channels for monitoring different determinants (1-4).
These advances have now surpassed our ability to manually interpret the resulting high-dimensional data and have led to growing interest and recent activity in the development of new computational tools and approaches (5-8). The difficulty in data analysis arises from the traditional technique of identifying discrete cell populations by manual gating, which is labor-intensive and varies with user experience. The initial computational packages for flow cytometric analyses focused largely on preprocessing tasks such as data acquisition, normalization, and live cell gating. Besides visualization and transformation of flow cytometric data, useful tools such as FlowJo (www.flowjo.com) and the packages in BioConductor (www.bioconductor.org) (such as prada, flowCore, flowViz, flowUtils, and rflowcyt) allow some form of software-assisted gating and extraction of populations of interest. The operator subjectively demarcates a cell population while moving through successive 2- or 3-dimensional projections of the data. This process limits the reproducibility of data processing. A more fundamental problem is that this lower-dimensional visualization hinders the identification of higher-dimensional features. Furthermore, current methods extract only a limited number of sample parameters, such as the mean fluorescence intensity of a cell population, which can lead to loss of critical information in defining the properties of a cell population....
Many genes are regulated as an innate part of the eukaryotic cell cycle, and a complex transcriptional network helps enable the cyclic behavior of dividing cells. This transcriptional network has been studied in Saccharomyces cerevisiae (budding yeast) and elsewhere. To provide more perspective on these regulatory mechanisms, we have used microarrays to measure gene expression through the cell cycle of Schizosaccharomyces pombe (fission yeast). The 750 genes with the most significant oscillations were identified and analyzed. There were two broad waves of cell cycle transcription, one in early/mid G2 phase, and the other near the G2/M transition. The early/mid G2 wave included many genes involved in ribosome biogenesis, possibly explaining the cell cycle oscillation in protein synthesis in S. pombe. The G2/M wave included at least three distinctly regulated clusters of genes: one large cluster including mitosis, mitotic exit, and cell separation functions, one small cluster dedicated to DNA replication, and another small cluster dedicated to cytokinesis and division. S. pombe cell cycle genes have relatively long, complex promoters containing groups of multiple DNA sequence motifs, often of two, three, or more different kinds. Many of the genes, transcription factors, and regulatory mechanisms are conserved between S. pombe and S. cerevisiae. Finally, we found preliminary evidence for a nearly genome-wide oscillation in gene expression: 2,000 or more genes undergo slight oscillations in expression as a function of the cell cycle, although whether this is adaptive, or incidental to other events in the cell, such as chromatin condensation, we do not know.
Skew-normal and skew-t distributions have proved to be useful for capturing skewness and kurtosis in data directly without transformation. Recently, finite mixtures of such distributions have been considered as a more general tool for handling heterogeneous data involving asymmetric behaviors across subpopulations. We consider such mixture models for both univariate and multivariate data. This allows robust modeling of high-dimensional multimodal and asymmetric data generated by popular biotechnological platforms such as flow cytometry. We develop Bayesian inference based on data augmentation and Markov chain Monte Carlo (MCMC) sampling. In addition to the latent allocations, data augmentation is based on a stochastic representation of the skew-normal distribution in terms of a random-effects model with truncated normal random effects. For finite mixtures of skew normals, this leads to a Gibbs sampling scheme that draws from standard densities only. This MCMC scheme is extended to mixtures of skew-t distributions based on representing the skew-t distribution as a scale mixture of skew normals. As an important application of our new method, we demonstrate how it provides a new computational framework for automated analysis of high-dimensional flow cytometric data. Using multivariate skew-normal and skew-t mixture models, we could model non-Gaussian cell populations rigorously and directly without transformation or projection to lower dimensions.
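The two stochastic representations mentioned in this abstract can be illustrated with a short simulation. The following is a minimal univariate sketch, not the authors' Gibbs sampler: it only draws samples forward from the representations, using the standard Azzalini-style parameterization (location `xi`, scale `omega`, shape `alpha`, degrees of freedom `nu`). The function names and parameterization are assumptions chosen for illustration and do not come from the abstract.

```python
import numpy as np


def sample_skew_normal(n, xi, omega, alpha, rng):
    """Draw n skew-normal samples via a random-effects representation.

    Y = xi + omega * (delta * Z + sqrt(1 - delta^2) * eps), where
    Z is standard normal truncated to [0, inf) and eps is standard normal.
    """
    delta = alpha / np.sqrt(1.0 + alpha**2)
    # |N(0, 1)| is a standard normal truncated to the non-negative half-line.
    z = np.abs(rng.standard_normal(n))
    eps = rng.standard_normal(n)
    return xi + omega * (delta * z + np.sqrt(1.0 - delta**2) * eps)


def sample_skew_t(n, xi, omega, alpha, nu, rng):
    """Draw n skew-t samples as a scale mixture of skew normals.

    Y = xi + omega * S / sqrt(W), with S ~ SN(0, 1, alpha) and
    W ~ Gamma(nu / 2, rate = nu / 2), drawn independently of S.
    """
    w = rng.gamma(shape=nu / 2.0, scale=2.0 / nu, size=n)
    s = sample_skew_normal(n, 0.0, 1.0, alpha, rng)
    return xi + omega * s / np.sqrt(w)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    y_sn = sample_skew_normal(100_000, xi=1.0, omega=2.0, alpha=3.0, rng=rng)
    y_st = sample_skew_t(100_000, xi=1.0, omega=2.0, alpha=3.0, nu=5.0, rng=rng)
    # Compare medians: the skew-t draws have heavier tails than the skew normal.
    print("skew-normal median:", np.median(y_sn))
    print("skew-t median:", np.median(y_st))
```

Conditional on the truncated-normal effect `z` (and, for the skew-t, the scale variable `w`), each observation is ordinary Gaussian. This is the property that lets a data-augmentation scheme of the kind described draw from standard densities only.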