We identify effective stochastic differential equations (SDEs) for coarse observables of fine-grained particle- or agent-based simulations; these SDEs then provide useful coarse surrogate models of the fine scale dynamics. We approximate the drift and diffusivity functions in these effective SDEs through neural networks, which can be thought of as effective stochastic ResNets. The loss function is inspired by, and embodies, the structure of established stochastic numerical integrators (here, Euler–Maruyama and Milstein); our approximations can thus benefit from backward error analysis of these underlying numerical schemes. They also lend themselves naturally to “physics-informed” gray-box identification when approximate coarse models, such as mean field equations, are available. Existing numerical integration schemes for Langevin-type equations and for stochastic partial differential equations can also be used for training; we demonstrate this on a stochastically forced oscillator and the stochastic wave equation. Our approach does not require long trajectories, works on scattered snapshot data, and is designed to naturally handle different time steps per snapshot. We consider both the case where the coarse collective observables are known in advance and the case where they must be found in a data-driven manner.
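As a rough illustration of how a loss can embody the Euler–Maruyama scheme, the sketch below scores snapshot pairs by the Gaussian negative log-likelihood implied by one Euler–Maruyama step, with a separate time step per snapshot. It is a minimal scalar sketch and not the authors' implementation: the function names, the toy SDE used to generate data, and the use of plain Python callables in place of neural networks are all assumptions made for illustration.

```python
import math
import random


def em_negative_log_likelihood(x0, x1, h, drift, diffusivity):
    """Negative log-likelihood of x1 given x0 under one Euler-Maruyama step.

    One step of size h gives x1 ~ Normal(x0 + h*drift(x0), h*diffusivity(x0)**2).
    """
    mean = x0 + h * drift(x0)
    var = h * diffusivity(x0) ** 2
    return 0.5 * math.log(2.0 * math.pi * var) + (x1 - mean) ** 2 / (2.0 * var)


def em_loss(snapshots, drift, diffusivity):
    """Average the per-snapshot loss; each snapshot carries its own step h."""
    total = sum(
        em_negative_log_likelihood(x0, x1, h, drift, diffusivity)
        for x0, x1, h in snapshots
    )
    return total / len(snapshots)


def simulate_snapshots(n, seed=0):
    """Scattered snapshot pairs from a hypothetical toy SDE (assumption):
    dx = -x dt + 0.5 dW, sampled with two different step sizes."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x0 = rng.uniform(-2.0, 2.0)
        h = rng.choice([0.01, 0.02])
        x1 = x0 - h * x0 + 0.5 * math.sqrt(h) * rng.gauss(0.0, 1.0)
        data.append((x0, x1, h))
    return data


snapshots = simulate_snapshots(2000)
loss_true = em_loss(snapshots, lambda x: -x, lambda x: 0.5)
loss_wrong_diffusivity = em_loss(snapshots, lambda x: -x, lambda x: 2.0)
loss_wrong_drift = em_loss(snapshots, lambda x: 20.0, lambda x: 0.5)
```

In this sketch the loss is lowest for the drift and diffusivity that generated the data; in a learning setting the two callables would be replaced by trainable networks and the loss minimized with a gradient-based optimizer.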
It is shown that Machine Learning (ML) algorithms can usefully capture the effect of crystallization composition and conditions (inputs) on key microstructural characteristics (outputs) of faujasite-type zeolites (structure types FAU, EMT, and their intergrowths), which are widely used zeolite catalysts and adsorbents. The utility of ML (in particular, Geometric Harmonics) toward learning input-output relationships of interest is demonstrated, and a comparison with Neural Networks and Gaussian Process Regression, as alternative approaches, is provided. Through ML, synthesis conditions were identified to enhance the Si/Al ratio of high purity FAU zeolite to the hitherto highest level (i.e., Si/Al = 3.5) achieved via direct (not seeded) and organic structure-directing-agent-free synthesis from sodium aluminosilicate sols. The analysis of the ML algorithms’ results offers the insight that reduced Na₂O content is key to formulating FAU materials with high Si/Al ratio. An acid catalyst prepared by partial ion exchange of the high-Si/Al-ratio FAU (Si/Al = 3.5) exhibits improved proton reactivity (as well as specific activity, per unit mass of catalyst) in propane cracking and dehydrogenation compared to the catalyst prepared from the previously reported highest Si/Al ratio (Si/Al = 2.8).
We present a data-driven approach to characterizing nonidentifiability of a model’s parameters and illustrate it through dynamic as well as steady kinetic models. By employing Diffusion Maps and their extensions, we discover the minimal combinations of parameters required to characterize the output behavior of a chemical system: a set of effective parameters for the model. Furthermore, we introduce and use a Conformal Autoencoder Neural Network technique, as well as a kernel-based Jointly Smooth Function technique, to disentangle the redundant parameter combinations that do not affect the output behavior from the ones that do. We discuss the interpretability of our data-driven effective parameters, and demonstrate the utility of the approach for both behavior prediction and parameter estimation. In the latter task, it becomes important to describe level sets in parameter space that are consistent with a particular output behavior. We validate our approach on a model of multisite phosphorylation, where a reduced set of effective parameters (nonlinear combinations of the physical ones) has previously been established analytically.
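To make the notions of an effective parameter and a level set in parameter space concrete, the sketch below uses a hypothetical toy model (an assumption, not one of the kinetic models in the abstract) whose output depends on two rate constants only through their product. Any two parameter pairs with the same product are then indistinguishable from the output alone.

```python
import math


def model_output(k1, k2, times):
    """Hypothetical toy model: exponential decay whose rate is k1 * k2."""
    return [math.exp(-k1 * k2 * t) for t in times]


def effective_parameter(k1, k2):
    """The single combination of (k1, k2) that the output depends on."""
    return k1 * k2


def level_set_point(k_eff, k1):
    """Pick the point on the level set k1 * k2 = k_eff with the given k1."""
    return k1, k_eff / k1


times = [0.0, 0.5, 1.0, 2.0]

# Two different parameter pairs on the same level set (k1 * k2 = 2).
theta_a = level_set_point(2.0, 0.5)
theta_b = level_set_point(2.0, 4.0)
out_a = model_output(theta_a[0], theta_a[1], times)
out_b = model_output(theta_b[0], theta_b[1], times)
same_level_gap = max(abs(a - b) for a, b in zip(out_a, out_b))

# A pair on a different level set (k1 * k2 = 3) gives a different output.
theta_c = level_set_point(3.0, 0.5)
out_c = model_output(theta_c[0], theta_c[1], times)
other_level_gap = max(abs(a - c) for a, c in zip(out_a, out_c))
```

Here the effective parameter is known by construction; the data-driven setting described above is the case where such combinations have to be discovered from sampled input-output behavior.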
We present an approach, based on learning an intrinsic data manifold, for the initialization of the internal state values of long short-term memory (LSTM) recurrent neural networks, ensuring consistency with the initial observed input data. Exploiting the generalized synchronization concept, we argue that the converged, “mature” internal states constitute a function on this learned manifold. The dimension of this manifold then dictates the length of observed input time series data required for consistent initialization. We illustrate our approach through a partially observed chemical model system, where initializing the internal LSTM states in this fashion yields visibly improved performance. Finally, we show that learning this data manifold enables the transformation of partially observed dynamics into fully observed ones, facilitating alternative identification paths for nonlinear dynamical systems.