Engineering biology relies on the accurate prediction
of cell responses.
However, making these predictions is challenging for a variety of
reasons, including the stochasticity of biochemical reactions, variability
between cells, and incomplete information about underlying biological
processes. Machine learning methods, which can model diverse input–output
relationships without requiring a priori mechanistic
knowledge, are an ideal tool for this task. For example, such approaches
can predict future gene expression dynamics from time-series data of
past expression. To explore this application, we computationally
simulated single-cell responses, incorporating different sources of
noise and alternative genetic circuit designs. We showed that deep
neural networks trained on these simulated data correctly inferred
the underlying dynamics of a cell response even in the presence
of measurement noise and stochasticity in the biochemical reactions.
The training set size and the amount of past data provided as inputs
both affected prediction quality, with cascaded genetic circuits that
introduce delays requiring more past data. We also tested prediction
performance on a bistable auto-activation circuit, finding that our
initial method for predicting a single trajectory was fundamentally
ill-suited for multimodal dynamics. To address this, we updated the
network architecture to predict the entire distribution of future
states and showed that the updated network accurately predicted
bimodal expression distributions.
Overall, these methods can be readily applied to the diverse prediction
tasks involved in controlling a variety of biological circuits,
a key aspect of many synthetic biology applications.
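
As a rough illustration of the data-generation step described above, the sketch below simulates one stochastic single-cell trajectory, adds measurement noise, and slices the result into (past window, future value) pairs of the kind a network could be trained on. It is a minimal sketch under stated assumptions, not the simulation used in this work: the simple birth-death expression model, the Gaussian measurement noise, the rate values, and the names `gillespie_birth_death` and `make_windows` are all hypothetical choices made for illustration.

```python
import random


def gillespie_birth_death(k_prod, k_deg, t_end, dt, rng):
    """Simulate a protein count with the Gillespie algorithm.

    Assumed model (illustrative only): production at constant rate k_prod,
    first-order degradation at rate k_deg * x. The count is sampled on a
    regular time grid with spacing dt.
    """
    t, x = 0.0, 0
    n_samples = int(t_end / dt) + 1
    samples = []
    while len(samples) < n_samples:
        total_rate = k_prod + k_deg * x
        # Waiting time to the next reaction is exponentially distributed.
        t_next = t + rng.expovariate(total_rate)
        # The state x holds until t_next, so record it at every grid
        # time that falls before the next reaction fires.
        while len(samples) < n_samples and len(samples) * dt < t_next:
            samples.append(x)
        t = t_next
        # Choose which reaction fires, in proportion to its rate.
        if rng.random() * total_rate < k_prod:
            x += 1
        else:
            x -= 1
    return samples


def make_windows(series, n_past, horizon):
    """Build (past window, future value) training pairs.

    Each input holds n_past consecutive samples; the target is the sample
    `horizon` steps after the end of that window.
    """
    inputs, targets = [], []
    for start in range(len(series) - n_past - horizon + 1):
        inputs.append(series[start:start + n_past])
        targets.append(series[start + n_past + horizon - 1])
    return inputs, targets


rng = random.Random(0)  # fixed seed so the sketch is reproducible
counts = gillespie_birth_death(k_prod=10.0, k_deg=0.1, t_end=50.0, dt=0.5, rng=rng)
# Assumed measurement model: additive Gaussian noise on each sample.
noisy = [c + rng.gauss(0.0, 2.0) for c in counts]
X, y = make_windows(noisy, n_past=20, horizon=5)
```

Here `n_past` plays the role of the amount of past data provided as input. A circuit with delays, such as a cascade, would be expected to need a larger value, in line with the trend reported above.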