A major challenge in biotechnology and biomanufacturing is identifying a set of biomarkers for perturbations and metabolites of interest. Here, we develop a data-driven, transcriptome-wide approach to rank perturbation-inducible genes from time-series RNA sequencing data for the discovery of analyte-responsive promoters. This provides a set of biomarkers that act as a proxy for the transcriptional state, referred to as the cell state. We construct low-dimensional models of gene expression dynamics and rank genes by their ability to capture the perturbation-specific cell state using a novel observability analysis. Using this ranking, we extract 15 analyte-responsive promoters for the organophosphate malathion in the underutilized host organism Pseudomonas fluorescens SBW25. We develop synthetic genetic reporters from each analyte-responsive promoter and characterize their response to malathion. Furthermore, we enhance malathion reporting by aggregating the responses of individual reporters with a synthetic consortium approach, and we demonstrate the library's usefulness outside the lab by detecting malathion in the environment. The engineered host cell, a living malathion sensor, can be optimized for use in environmental diagnostics, while the developed machine learning tool can be applied to discover perturbation-inducible gene expression systems in the compendium of host organisms.
Accelerating the design of synthetic biological circuits requires expanding the currently available genetic toolkit. Although whole-cell biosensors have been successfully engineered and deployed, particularly in applications such as environmental and medical diagnostics, novel sensing applications necessitate the discovery and optimization of novel biosensors. Here, we address the limited repertoire of biosensors by developing a data-driven, transcriptome-wide approach to discover perturbation-inducible genes from time-series RNA sequencing data, guiding the design of synthetic transcriptional reporters. By combining techniques from dynamical systems and control theory, we show that high-dimensional transcriptome dynamics can be efficiently represented and used to rank genes based on their ability to report the perturbation-specific cell state. We extract, construct, and validate 15 functional biosensors for the organophosphate malathion in the underutilized host organism Pseudomonas fluorescens SBW25, provide a computational approach to aggregate individual biosensor responses for enhanced reporting, and demonstrate their usefulness outside the lab by detecting malathion in the environment. The library of living malathion sensors can be optimized for use in environmental diagnostics, while the developed machine learning tool can be applied to discover perturbation-inducible gene expression systems in the compendium of host organisms.
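The idea of ranking genes by how well they report the state of a low-dimensional dynamical model can be illustrated with a small sketch. This is not the authors' observability analysis; it is a simplified stand-in that assumes a rank-r dynamic mode decomposition (DMD) model and scores each gene by the output energy it would contribute over the observed horizon. The function names `fit_reduced_dmd` and `gene_observability_scores`, the synthetic data, and all parameter values are hypothetical.

```python
import numpy as np

def fit_reduced_dmd(X, r):
    """Fit a rank-r linear model z[t+1] = A z[t] to a genes-by-time matrix X."""
    X0, X1 = X[:, :-1], X[:, 1:]
    U, s, Vt = np.linalg.svd(X0, full_matrices=False)
    U, s, Vt = U[:, :r], s[:r], Vt[:r]
    # Reduced operator A = U^T X1 V S^{-1}; dividing by s scales column j by 1/s[j].
    A = U.T @ X1 @ Vt.T / s
    return U, A

def gene_observability_scores(U, A, z0, horizon):
    """Score gene i by sum_k (C_i A^k z0)^2, where C_i is row i of U.

    This equals z0^T W_i z0 for the finite-horizon observability Gramian W_i
    of the single-gene output y_i = C_i z (an illustrative choice of score).
    """
    scores = np.zeros(U.shape[0])
    z = z0.copy()
    for _ in range(horizon):
        scores += (U @ z) ** 2
        z = A @ z
    return scores

# Synthetic example (assumed data): 20 genes driven by a 2-dimensional latent state,
# with genes 0, 1 and 2 loading strongly on that state.
rng = np.random.default_rng(0)
n_genes, n_times = 20, 12
A_true = np.array([[0.9, 0.2], [-0.2, 0.9]])
Z = np.zeros((2, n_times))
Z[:, 0] = [1.0, 0.0]
for t in range(1, n_times):
    Z[:, t] = A_true @ Z[:, t - 1]
L = 0.01 * rng.standard_normal((n_genes, 2))
L[0], L[1], L[2] = [3.0, 0.0], [0.0, 3.0], [2.0, 2.0]
X = L @ Z

U, A = fit_reduced_dmd(X, r=2)
scores = gene_observability_scores(U, A, U.T @ X[:, 0], horizon=n_times)
ranking = np.argsort(scores)[::-1]  # gene indices, highest score first
```

In this toy setting the three strongly loaded genes should appear at the top of `ranking`. A real analysis would start from normalized time-series RNA-seq counts and would need to choose the rank `r` and handle noise.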
In this paper, we consider the problem of learning a predictive model for population cell growth dynamics as a function of the media conditions. We first introduce a generic data-driven framework for training operator-theoretic models to predict cell growth rate. We then introduce the experimental design and data generated in this study, namely growth curves of Pseudomonas putida as a function of casein and glucose concentrations. We use a data-driven approach for model identification, specifically a nonlinear autoregressive (NAR) model, to represent the dynamics. We show theoretically that Hankel DMD can be used to obtain a solution of the NAR model. We show that it identifies a constrained NAR model; to obtain a more general solution, we define a causal state-space system using 1-step, 2-step, ..., τ-step predictors of the NAR model and identify a Koopman operator for this model using extended dynamic mode decomposition. We call this hybrid scheme causal-jump dynamic mode decomposition and illustrate it on a growth profile, or fitness prediction, challenge as a function of different input growth conditions. We show that our model is able to recapitulate training growth curve data with 96.6% accuracy and predict test growth curve data with 91% accuracy.
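The Hankel DMD step can be sketched in a few lines: delay-embed a scalar series into a Hankel matrix and fit a linear map between successive columns by least squares. This is a minimal illustration under stated assumptions, not the paper's causal-jump scheme. It covers only the plain Hankel DMD regression, and the function names (`hankel_matrix`, `fit_hankel_dmd`, `predict`) and the synthetic series are hypothetical.

```python
import numpy as np

def hankel_matrix(y, tau):
    """Delay-embed a scalar series: column j is [y[j], ..., y[j + tau - 1]]."""
    y = np.asarray(y, dtype=float)
    n_cols = len(y) - tau + 1
    return np.column_stack([y[j:j + tau] for j in range(n_cols)])

def fit_hankel_dmd(y, tau):
    """Least-squares operator K mapping each delay vector to the next one."""
    H = hankel_matrix(y, tau)
    X, Y = H[:, :-1], H[:, 1:]
    return Y @ np.linalg.pinv(X)

def predict(K, y_init, n_steps):
    """Roll the delay vector forward; the last entry is the newest sample."""
    z = np.asarray(y_init, dtype=float).copy()
    out = []
    for _ in range(n_steps):
        z = K @ z
        out.append(z[-1])
    return np.array(out)

# Synthetic series (assumed): a damped oscillation, which satisfies the
# two-term linear recurrence y[t+1] = 2*0.9*cos(0.3)*y[t] - 0.81*y[t-1].
t = np.arange(40)
y = 0.9 ** t * np.cos(0.3 * t)

K = fit_hankel_dmd(y, tau=2)
pred = predict(K, y[:2], n_steps=10)  # forecasts of y[2], ..., y[11]
```

The last row of `K` holds the autoregressive coefficients, which is the sense in which this regression recovers a linear (constrained) autoregressive model. Real growth curves are nonlinear, so a longer delay window or a dictionary of nonlinear observables, as in extended DMD, would likely be needed.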