The goal of many single-cell studies on eukaryotic cells is to gain insight into the biochemical reactions that control cell fate and state. In this paper we introduce the concept of effective stoichiometric space (ESS) to guide the reconstruction of biochemical networks from multiplexed, fixed time-point, single-cell data. In contrast to methods based solely on statistical models of data, the ESS method leverages the geometric theory of toric varieties to begin unraveling the structure of chemical reaction networks (CRN). This application of toric theory enables a data-driven mapping of covariance relationships in single-cell measurements into stoichiometric information, one in which each cell subpopulation has its associated ESS interpreted in terms of CRN theory. In the development of ESS we reframe certain aspects of the theory of CRN to better match data analysis. As an application of our approach we process cytometry- and image-based single-cell datasets and identify differences in cells treated with kinase inhibitors. Our approach is directly applicable to data acquired using readily accessible experimental methods such as fluorescence-activated cell sorting (FACS) and multiplex immunofluorescence.
August 6, 2019
Author summary
We introduce a new notion, which we call the effective stoichiometric space (ESS), that elucidates network structure from the covariances of single-cell multiplexed data. The ESS approach differs from methods that are based on purely statistical models of data: it allows a completely new and data-driven translation of the theory of toric varieties in geometry and specifically their role in chemical reaction networks (CRN). In the process, we reframe certain aspects of the theory of CRN. As illustrations of our approach, we find stoichiometry in different single-cell datasets, and pinpoint dose-dependence of network perturbations in drug-treated cells.
Introduction
Single-cell, multiplexed datasets have become prevalent [1,2], and include data on transcript levels measured by sc-RNAseq [3], protein levels measured by flow cytometry [4], or cell morphology and protein localization measured by multiplex imaging [5-8]. An obvious advantage of such data is that they make possible the detection and quantification of differences among cells in a population, including those arising from cyclic processes such as cell division and differentiation programs that are asynchronous between cells [9, 10]. A more subtle advantage of single-cell data is that they report on relationships among measured features, such as the phosphorylation state of receptors and the nuclear localization of transcription factors. Because such features are subject to natural stochastic fluctuation across a population of cells [11], measuring the extent of correlation between otherwise independently fluctuating features makes it possible to infer the topologies of biological networks [12, 13].
A wide variety of tools have been developed for visualization of s...