Nonlinear dynamical systems with symmetries exhibit a rich variety of behaviors, including complex attractor-basin portraits and enhanced and suppressed bifurcations. Symmetry arguments provide a way to study these collective behaviors and to simplify their analysis. The Koopman operator is an infinite-dimensional linear operator that fully captures a system's nonlinear dynamics through the linear evolution of functions of the state space. Importantly, in contrast with local linearization, it preserves a system's global nonlinear features. We demonstrate how the presence of symmetries affects the structure of the Koopman operator and its spectral properties. We then show that symmetry considerations also simplify the computation of Koopman operator approximations via the extended and kernel dynamic mode decomposition methods (EDMD and kernel DMD). Specifically, representation theory allows us to demonstrate that an isotypic component basis induces block-diagonal structure in the operator approximations, revealing hidden organization. Practically, if the data is symmetric, the EDMD and kernel DMD methods can be modified to give more efficient computation of the Koopman operator approximation and its eigenvalues, eigenfunctions, and eigenmodes. Rounding out the development, we discuss the effect of measurement noise.
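As a concrete illustration of the block-diagonal structure described above, here is a minimal sketch (our own illustration, not the paper's code; the toy map, dictionary, and function names are assumptions). It applies plain least-squares EDMD to snapshot pairs of a Z2-equivariant map F(-x) = -F(x), with a monomial dictionary ordered by isotypic component (even observables first, then odd). When the snapshot data are symmetrized under the Z2 action, the even-odd coupling blocks of the approximated Koopman matrix vanish to numerical precision.

```python
import numpy as np

def isotypic_dictionary(x):
    """Monomial dictionary ordered by Z2 isotypic component:
    even observables {1, x^2, x^4} first, then odd observables {x, x^3, x^5}."""
    x = np.asarray(x)
    return np.column_stack([np.ones_like(x), x**2, x**4, x, x**3, x**5])

def edmd(X, Y, dictionary):
    """Least-squares EDMD: find K with dictionary(Y) ~= dictionary(X) @ K."""
    Psi_X, Psi_Y = dictionary(X), dictionary(Y)
    K, *_ = np.linalg.lstsq(Psi_X, Psi_Y, rcond=None)
    return K

# Toy Z2-equivariant map F(-x) = -F(x); snapshot pairs (X, Y = F(X)).
F = lambda x: 0.9 * x - 0.3 * x**3
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=2000)
X = np.concatenate([X, -X])          # symmetrize the data under the Z2 action
Y = F(X)

K = edmd(X, Y, isotypic_dictionary)
# In the isotypic (even/odd) ordering the even-odd coupling blocks vanish,
# so K is block diagonal up to round-off error.
coupling = np.linalg.norm(K[:3, 3:]) + np.linalg.norm(K[3:, :3])
print("even-odd coupling:", coupling)            # near machine precision
print("Koopman eigenvalue estimates:", np.linalg.eigvals(K))
```

Because the least-squares problem decouples by isotypic component on symmetric data, each block could be solved independently, which is the kind of computational saving the abstract refers to.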
Extracting actionable insight from complex unlabeled scientific data is an open challenge and key to unlocking data-driven discovery in science. Complementary and alternative to supervised machine learning approaches, unsupervised physics-based methods built on behavior-driven theories hold great promise. Due to computational limitations, practical application to real-world domain science problems has lagged far behind theoretical development. However, powerful modern supercomputers provide the opportunity to narrow the gap between theory and practical application. We present our first step towards bridging this divide: DisCo, a high-performance distributed workflow for the behavior-driven local causal state theory. DisCo provides a scalable unsupervised physics-based representation learning method that decomposes spatiotemporal systems into their structurally relevant components, which are captured by the latent local causal state variables. Complex spatiotemporal systems are generally highly structured and organize around a lower-dimensional skeleton of coherent structures, and, in several firsts, we demonstrate the efficacy of DisCo in capturing such structures from observational and simulated scientific data. To the best of our knowledge, DisCo is also the first application software developed entirely in Python to scale to over 1000 machine nodes, providing good performance while maintaining domain scientists' productivity. We developed scalable, performant methods optimized for Intel many-core processors that will be upstreamed to open-source Python library packages. Our capstone experiment, using the newly developed DisCo workflow and libraries, performs unsupervised spacetime segmentation analysis of CAM5.1 climate simulation data, processing an unprecedented 89.5 TB in 6.6 minutes end-to-end on 1024 Intel Haswell nodes of the Cori supercomputer, obtaining 91% weak-scaling and 64% strong-scaling efficiency. This enables us to achieve state-of-the-art unsupervised segmentation of coherent spatiotemporal structures in complex fluid flows.

Recently, supervised DL techniques have been applied to address this problem [24], [25], [26], including one of the 2018 Gordon Bell award winners [27]. However, there is an immediate and daunting challenge for these supervised approaches: ground-truth labels do not exist for pixel-level identification of extreme weather events [21]. The DL models used in the above studies are trained using the automated heuristics of TECA [20] as proximate labels. While the results in [24] qualitatively show that DL can improve upon TECA, the results in [26] reach accuracy rates over 97%, essentially reproducing the output of TECA. The supervised learning paradigm of optimizing objective metrics (e.g., training and generalization error) breaks down here [8], since TECA is not ground truth and we do not know how to train a DL model to disagree with TECA in just the right way to get closer to "ground truth".
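For readers unfamiliar with the scaling figures quoted above, the following small sketch shows the standard definitions of weak- and strong-scaling efficiency; the node counts and wall-clock times are hypothetical placeholders, not measurements from the experiment.

```python
import numpy as np

def weak_scaling_efficiency(t_base, t_n):
    """Weak scaling: per-node work is fixed, so the ideal runtime stays flat."""
    return t_base / t_n

def strong_scaling_efficiency(n_base, t_base, n, t_n):
    """Strong scaling: total work is fixed, so ideal runtime shrinks as n_base/n."""
    return (n_base * t_base) / (n * t_n)

# Hypothetical timings (seconds) at increasing node counts.
nodes = np.array([128, 256, 512, 1024])
t_weak = np.array([360.0, 368.0, 381.0, 396.0])       # problem size grows with nodes
t_strong = np.array([2900.0, 1520.0, 840.0, 565.0])   # problem size held fixed

print("weak-scaling efficiency:  ", weak_scaling_efficiency(t_weak[0], t_weak))
print("strong-scaling efficiency:",
      strong_scaling_efficiency(nodes[0], t_strong[0], nodes, t_strong))
```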
Coherent structures form spontaneously in nonlinear spatiotemporal systems and are found at all spatial scales in natural phenomena, from laboratory hydrodynamic flows and chemical reactions to ocean, atmosphere, and planetary climate dynamics. Phenomenologically, they appear as key components that organize the macroscopic behaviors in such systems. Despite a century of effort, they have eluded rigorous analysis and empirical prediction, with progress being made only recently. As a step in this direction, we present a formal theory of coherent structures in fully discrete dynamical field theories. It builds on the notion of structure introduced by computational mechanics, generalizing it to a local spatiotemporal setting. The analysis' main tool is the local causal states, which uncover a system's hidden spatiotemporal symmetries and identify coherent structures as spatially localized deviations from those symmetries. The approach is behavior-driven in the sense that it does not rely on directly analyzing spatiotemporal equations of motion; rather, it considers only the spatiotemporal fields a system generates. As such, it offers an unsupervised approach to discovering and describing coherent structures. We illustrate the approach by analyzing coherent structures generated by elementary cellular automata, comparing the results with an earlier, dynamic-invariant-set approach that decomposes fields into domains, particles, and particle interactions.
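To make the setting concrete, here is a minimal sketch (our own illustration, not the paper's code; the lightcone convention and depth are assumptions). It generates the kind of fully discrete spatiotemporal field the theory operates on, an elementary cellular automaton, and extracts finite-depth past lightcones, the raw objects from which local causal states are built by grouping lightcones with the same predictive distribution over futures.

```python
import numpy as np

def eca_spacetime(rule, width=200, steps=200, seed=0):
    """Spacetime field of an elementary cellular automaton (periodic boundaries).

    rule: Wolfram rule number (0-255).  Returns a (steps, width) binary array
    whose rows are successive time slices.
    """
    lookup = np.array([(rule >> i) & 1 for i in range(8)], dtype=np.uint8)
    rng = np.random.default_rng(seed)
    field = np.zeros((steps, width), dtype=np.uint8)
    field[0] = rng.integers(0, 2, size=width)
    for t in range(1, steps):
        prev = field[t - 1]
        neighborhood = 4 * np.roll(prev, 1) + 2 * prev + np.roll(prev, -1)
        field[t] = lookup[neighborhood]
    return field

def past_lightcones(field, depth=2):
    """Finite-depth past lightcones of every spacetime point (one common
    convention: the present site is included; propagation speed is one site
    per time step)."""
    steps, width = field.shape
    cones = []
    for t in range(depth, steps):
        for x in range(width):
            cone = [field[t - k, (x + i) % width]
                    for k in range(depth + 1) for i in range(-k, k + 1)]
            cones.append(cone)
    return np.array(cones, dtype=np.uint8)

# Rule 110 supports well-known particles atop a periodic background domain.
field = eca_spacetime(rule=110)
cones = past_lightcones(field, depth=2)
print(field.shape, cones.shape)   # (200, 200) and (39600, 9)
```

Rule 110 is a convenient test case because its well-known particles are exactly the sort of spatially localized deviations from a periodic background domain that the local causal states are meant to identify.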
Only a subset of degrees of freedom are typically accessible or measurable in real-world systems. As a consequence, the proper setting for empirical modeling is that of partially-observed systems. Notably, data-driven models consistently outperform physics-based models for systems with few observable degrees of freedom; e.g., hydrological systems. Here, we provide an operator-theoretic explanation for this empirical success. To predict a partially-observed system’s future behavior with physics-based models, the missing degrees of freedom must be explicitly accounted for using data assimilation and model parametrization. Data-driven models, in contrast, employ delay-coordinate embeddings and their evolution under the Koopman operator to implicitly model the effects of the missing degrees of freedom. We describe in detail the statistical physics of partial observations underlying data-driven models using novel Maximum Entropy and Maximum Caliber measures. The resulting nonequilibrium Wiener projections applied to the Mori-Zwanzig formalism reveal how data-driven models may converge to the true dynamics of the observable degrees of freedom. Additionally, this framework shows how data-driven models infer the effects of unobserved degrees of freedom implicitly, in much the same way that physics models infer the effects explicitly. This provides a unified implicit-explicit modeling framework for predicting partially-observed systems, with hybrid physics-informed machine learning methods combining implicit and explicit aspects.
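As a minimal sketch of the implicit modeling strategy described above (our own illustration; the Lorenz system, delay depth, and linear propagator are assumptions, not the paper's construction), the snippet below observes a single coordinate of a three-dimensional system, builds a delay-coordinate (Hankel) embedding from that scalar record, and fits a DMD-style linear map that evolves the delay vectors, a finite-dimensional stand-in for Koopman evolution of the observed degree of freedom.

```python
import numpy as np

def lorenz_trajectory(n_steps, dt=0.01, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Integrate the Lorenz system with a basic RK4 scheme."""
    def f(s):
        x, y, z = s
        return np.array([sigma * (y - x), x * (rho - z) - y, x * y - beta * z])
    s = np.array([1.0, 1.0, 1.0])
    traj = np.empty((n_steps, 3))
    for i in range(n_steps):
        traj[i] = s
        k1 = f(s); k2 = f(s + 0.5 * dt * k1)
        k3 = f(s + 0.5 * dt * k2); k4 = f(s + dt * k3)
        s = s + dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
    return traj

def delay_embed(series, n_delays):
    """Stack delayed copies of a scalar series into a Hankel-style data matrix."""
    m = len(series) - n_delays + 1
    return np.column_stack([series[i:i + m] for i in range(n_delays)])

# Only x is observed; y and z are the unobserved degrees of freedom.
x = lorenz_trajectory(5000)[:, 0]
H = delay_embed(x, n_delays=20)      # rows are delay-coordinate vectors
X, Y = H[:-1], H[1:]                 # one-step evolution pairs
A, *_ = np.linalg.lstsq(X, Y, rcond=None)   # DMD-style propagator: Y ~= X @ A
residual = np.linalg.norm(X @ A - Y) / np.linalg.norm(Y)
print("relative one-step residual:", residual)
```

The point of the embedding is that each delay vector implicitly carries information about the unobserved y and z coordinates, consistent with the abstract's claim that data-driven models account for missing degrees of freedom implicitly rather than through explicit data assimilation.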