Sparse identification of nonlinear dynamics (SINDy) is a recent
nonlinear modeling technique that has demonstrated superior performance
in modeling complex time-series data in the form of first-order ordinary
differential equations (ODEs), which are explicit and continuous in
time. However, a crucial step in the SINDy algorithm involves estimating
the time derivative of the states from the discrete, measured data.
As a result, measurement noise can greatly degrade model performance
if it is not explicitly accounted for. In this work, SINDy is combined
with ensemble learning, in which multiple models are identified
to improve the performance of the final nonlinear model.
Specifically, within the SINDy algorithm, a fraction of the candidate
library functions for the ODE model is randomly dropped out
in each submodel to favor model sparsity and stability, at the possible
cost of reduced model accuracy. This trade-off is controlled
by manipulating the fraction of the library functions dropped out
and the total number of models generated, both of which are considered
as hyperparameters to be tuned in the proposed algorithm. Data from
open-loop simulations of a large-scale chemical plant are generated
using the high-fidelity process simulator Aspen Plus Dynamics
and corrupted with substantial sensor noise before being used to train
the newly proposed algorithm, dropout-SINDy. The dropout-SINDy models
obtained from training with the noisy data are then tested in open-loop
simulations, which demonstrate accurate identification of the steady
state and reasonably close agreement in transient behavior under a
variety of initial conditions and manipulated input values. Finally,
the constructed
models are used in a Lyapunov-based model predictive controller to
control the large-scale Aspen process, meeting desired closed-loop
stability and performance specifications.
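
The dropout idea described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation, and several choices are assumptions made only for illustration: the two-state damped oscillator used as toy data, the polynomial library, the sequentially thresholded least-squares regression, the central-difference derivative estimate, and in particular the median used to combine submodel coefficients (the text above does not specify the aggregation rule). The names library, stlsq, and dropout_sindy are hypothetical.

```python
import numpy as np

def library(X):
    """Hypothetical candidate library for a 2-state system: 1, x1, x2, x1^2, x1*x2, x2^2."""
    x1, x2 = X[:, 0], X[:, 1]
    return np.column_stack([np.ones_like(x1), x1, x2, x1**2, x1 * x2, x2**2])

def stlsq(Theta, dX, threshold=0.05, n_iter=10):
    """Sequentially thresholded least squares: fit, zero small terms, refit the rest."""
    Xi = np.linalg.lstsq(Theta, dX, rcond=None)[0]
    for _ in range(n_iter):
        small = np.abs(Xi) < threshold
        Xi[small] = 0.0
        for j in range(dX.shape[1]):
            big = ~small[:, j]
            if big.any():
                Xi[big, j] = np.linalg.lstsq(Theta[:, big], dX[:, j], rcond=None)[0]
    return Xi

def dropout_sindy(X, dt, n_models=100, drop_frac=0.2, threshold=0.05, seed=0):
    """Fit n_models submodels, each on a random subset of library columns.

    drop_frac and n_models play the role of the two hyperparameters named in
    the text. Aggregation by the median is an assumption of this sketch.
    """
    rng = np.random.default_rng(seed)
    dX = np.gradient(X, dt, axis=0)            # derivative estimate from sampled data
    Theta = library(X)
    n_terms = Theta.shape[1]
    n_keep = max(1, int(round((1.0 - drop_frac) * n_terms)))
    coefs = np.zeros((n_models, n_terms, X.shape[1]))
    for m in range(n_models):
        keep = rng.choice(n_terms, size=n_keep, replace=False)  # surviving terms
        coefs[m][keep, :] = stlsq(Theta[:, keep], dX, threshold)
    return np.median(coefs, axis=0)            # dropped terms count as zero

# Toy data (an assumption): dx1/dt = -0.1*x1 + 2*x2, dx2/dt = -2*x1 - 0.1*x2
dt = 0.01
t = np.arange(0.0, 20.0, dt)
X_clean = np.column_stack([2 * np.exp(-0.1 * t) * np.cos(2 * t),
                           -2 * np.exp(-0.1 * t) * np.sin(2 * t)])
X_noisy = X_clean + 0.001 * np.random.default_rng(1).standard_normal(X_clean.shape)

Xi = dropout_sindy(X_noisy, dt)   # rows: library terms, columns: state equations
print(Xi)
```

Raising drop_frac makes each submodel sparser but increases the chance that a submodel omits a term the dynamics actually need, which is the accuracy risk noted above. A larger n_models makes the aggregated coefficients less sensitive to any single random draw.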