Complex biological functions are carried out by the interaction of genes and proteins. Uncovering the gene regulation network behind a function is one of the central themes in biology. Typically, it involves extensive experiments in genetics, biochemistry and molecular biology. In this paper, we show that much of the inference task can be accomplished by a deep neural network (DNN), a form of machine learning or artificial intelligence. Specifically, the DNN learns from the dynamics of gene expression. The trained DNN behaves like an accurate simulator of the system, on which one can perform in-silico experiments to reveal the underlying gene network. We demonstrate the method with two examples: biochemical adaptation and gap-gene patterning in fruit fly embryogenesis. In the first example, the DNN successfully finds the two basic network motifs for adaptation: the negative feedback and the incoherent feed-forward. In the second and much more complex example, the DNN accurately predicts the behaviors of essentially all the mutants. Furthermore, the regulation network it uncovers is strikingly similar to the one inferred from experiments. In doing so, we develop methods for deciphering the gene regulation network hidden in the DNN "black box". Our interpretable DNN approach should have broad applications in genotype-phenotype mapping.
Significance
Complex biological functions are carried out by gene regulation networks. The mapping between gene network and function is a central theme in biology. Establishing this mapping usually involves extensive experiments with perturbations to the system (e.g. gene deletion). Here, we demonstrate that machine learning, in the form of a deep neural network (DNN), can help reveal the underlying gene regulation for a given function or phenotype with minimal perturbation data. Specifically, after training with wild-type gene expression dynamics data and a few mutant snapshots, the DNN learns to behave like an accurate simulator of the genetic system, which can be used to predict the behaviors of other mutants. Furthermore, our DNN approach is biochemically interpretable, which helps uncover possible gene regulatory mechanisms underlying the observed phenotypic behaviors.
Introduction
Complex biological functions are carried out by gene regulation networks. Uncovering the gene regulation network behind a function is one of the central themes in biology. Traditionally, this task involves extensive genetic and biochemical experiments with perturbations to the biological system. For example, in classical gene knockout experiments, if the expression of gene a increases (decreases) when gene b is deleted, one may infer that gene a is repressed (activated) by gene b. More recently, statistical and bioinformatic methods have been used to help map out genetic and protein interactions. These methods can be very powerful, especially in analyzing high-throughput experimental data and extracting information about correlations among genes and proteins (1, 2). Another computational approach is...