We formulate an equivalence between machine learning and statistical data assimilation as widely used in the physical and biological sciences. The correspondence is that the layer number in a feedforward artificial network is the analog of time in data assimilation. This connection has been noted in the machine learning literature. We add a perspective that expands on how methods from statistical physics and aspects of Lagrangian and Hamiltonian dynamics play a role in how networks can be trained and designed. Within this equivalence, we show that adding more layers (making the network deeper) is analogous to adding temporal resolution in a data assimilation framework. We also discuss extending the equivalence to recurrent networks. We explore how a method from data assimilation can be used to find a candidate for the global minimum of the cost function in the machine learning context. Calculations on simple models from both sides of the equivalence are reported. We also discuss a framework in which the time or layer label is taken to be continuous, which yields a differential equation, the Euler-Lagrange equation with its boundary conditions, as a necessary condition for a minimum of the cost function. This shows that the problem being solved is a two-point boundary value problem of the kind familiar from variational methods. The use of continuous layers is denoted "deepest learning." These problems respect a symplectic symmetry in continuous-layer phase space. Both Lagrangian and Hamiltonian versions of these problems are presented. Their well-studied implementation in discrete time/layer, while respecting the symplectic structure, is addressed. The Hamiltonian version provides a direct rationale for backpropagation as a solution method for a certain two-point boundary value problem.
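The reading of backpropagation as a two-point boundary value problem can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes a toy scalar network x(n+1) = tanh(w_n x(n)) and a quadratic cost on the final layer; the names `forward`, `backward`, `cost`, and `finite_difference`, and all numerical values, are hypothetical. The state is fixed at layer 0 by the input, the conjugate (co-state) variable is fixed at the last layer by the cost, and the two are propagated in opposite directions.

```python
import math

def forward(weights, x0):
    """Propagate the state forward in layer 'time', starting from x(0) = x0."""
    states = [x0]
    for w in weights:
        states.append(math.tanh(w * states[-1]))
    return states

def cost(weights, x0, y):
    """Assumed quadratic cost on the final-layer state only."""
    return 0.5 * (forward(weights, x0)[-1] - y) ** 2

def backward(weights, states, y):
    """Propagate the co-state backward from its boundary value at the last layer."""
    lam = states[-1] - y  # boundary condition at n = N: dC/dx(N)
    grads = [0.0] * len(weights)
    for n in reversed(range(len(weights))):
        # Derivative of tanh expressed through the stored state x(n+1).
        delta = lam * (1.0 - states[n + 1] ** 2)
        grads[n] = delta * states[n]  # dC/dw_n
        lam = weights[n] * delta      # co-state at layer n
    return grads

def finite_difference(weights, x0, y, n, eps=1e-6):
    """Central-difference estimate of dC/dw_n, used only as a check."""
    plus = list(weights)
    minus = list(weights)
    plus[n] += eps
    minus[n] -= eps
    return (cost(plus, x0, y) - cost(minus, x0, y)) / (2 * eps)

# Hypothetical weights, input, and target for a four-layer toy network.
weights = [0.8, -1.1, 0.5, 1.3]
x0, y = 0.7, 0.2

states = forward(weights, x0)
grads = backward(weights, states, y)
checks = [finite_difference(weights, x0, y, n) for n in range(len(weights))]
max_error = max(abs(g - c) for g, c in zip(grads, checks))
print(max_error)
```

The backward recursion here is ordinary backpropagation; the sketch only makes explicit that it starts from a boundary condition at the final layer while the forward pass starts from one at the first layer. The finite-difference comparison is a sanity check on the gradients, not part of the method.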
Engineering subcellular organization in microbes shows great promise in addressing bottlenecks in metabolic engineering efforts; however, rules guiding the selection of an organization strategy or platform are lacking. Here, we study compartment morphology as a factor in mediating encapsulated pathway performance. Using the 1,2-propanediol utilization microcompartment (Pdu MCP) system from Salmonella enterica serovar Typhimurium LT2, we find that we can shift the morphology of this protein nanoreactor from polyhedral to tubular by removing the vertex protein PduN. Analysis of the metabolic function of these Pdu microtubes (MTs) shows that they provide a diffusional barrier capable of shielding the cytosol from a toxic pathway intermediate, similar to native MCPs. However, kinetic modeling suggests that the different surface-area-to-volume ratios of MCP and MT structures alter encapsulated pathway performance. Finally, we report a microscopy-based assay that permits rapid assessment of Pdu MT formation to enable future engineering efforts on these structures.
MCPs are unique, genetically encoded organelles used by many bacteria to survive in resource-limited environments. There is significant interest in understanding the biogenesis and function of these organelles, both as potential antibiotic targets in enteric pathogens and also as useful tools for overcoming metabolic engineering bottlenecks.