We show why the amount of information communicated between the past and future, the excess entropy, is not in general the amount of information stored in the present, the statistical complexity. This is a long-standing puzzle, since the latter is what is required for optimal prediction, while the former describes observed behavior. We lay out a classification scheme for dynamical systems and stochastic processes that determines when these two quantities are the same or different. We do this by developing closed-form expressions for the excess entropy in terms of optimal causal predictors and retrodictors: the ε-machines of computational mechanics. A process's causal irreversibility and crypticity are the key determining properties.

Constructing a theory can be viewed as our attempt to extract from measurements a system's hidden organization. This suggests a parallel with cryptography, whose goal [1] is to conceal internal correlations within an encrypted data stream even though that stream does, in fact, carry a message. This is essentially the circumstance that confronts a scientist building a model for the first time.

In this view, the now-long history in nonlinear dynamics of reconstructing models from time series [2,3] concerns the case of self-decoding, in which the information used to build a model is only that available in the observed process. That is, no "side-band" communication, prior knowledge, or disciplinary assumptions are allowed. Nature speaks for herself only through the data she willingly gives up.

Here we show that the parallel is more than metaphor: building a model corresponds directly to decrypting the hidden state information in measurements. The results show why predicting and modeling are, at one and the same time, distinct and intimately related. Along the way, a number of persistent confusions about the role of (and the different kinds of) information in prediction and modeling are clarified. We show how to measure the degree of hidden information and identify a new kind of statistical irreversibility that plays a key role.

Any process $\Pr(\overleftarrow{X}, \overrightarrow{X})$ is a communication channel: it transmits information from the past $\overleftarrow{X} = \ldots X_{-3} X_{-2} X_{-1}$ to the future $\overrightarrow{X} = X_0 X_1 X_2 \ldots$ by storing it in the present. Here $X_t$ is the random variable for the measurement outcome at time $t$. Our goal is also simply stated: we wish to predict the future using information from the past. At root, a prediction is probabilistic, specified by a distribution of possible futures $\overrightarrow{X}$ given a particular past $\overleftarrow{x}$: $\Pr(\overrightarrow{X} \mid \overleftarrow{x})$. At a minimum, a good predictor needs to capture all of the information shared between past and future, the excess entropy $\mathbf{E} = I[\overleftarrow{X}; \overrightarrow{X}]$.

Consider now the goal of modeling: to build a representation that not only allows good prediction but also expresses the mechanisms that produce a system's behavior. To build a model of a structured process (a channel), computational mechanics [5] introduced an equivalence relation over pasts, $\overleftarrow{x} \sim \overleftarrow{x}'$ if and only if $\Pr(\overrightarrow{X} \mid \overleftarrow{x}) = \Pr(\overrightarrow{X} \mid \overleftarrow{x}')$.
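To make the channel picture and the equivalence relation concrete, here is a minimal numerical sketch (not from the paper): it samples the Golden Mean process as a stand-in example, estimates the block mutual information $I[\text{past}_k; \text{future}_k]$, which converges to the excess entropy as the block length grows, and groups finite pasts by their empirical next-symbol distribution as a finite-data stand-in for the equivalence relation. The process choice, block lengths, and function names are illustrative assumptions.

```python
import math
import random
from collections import Counter, defaultdict

def golden_mean_series(n, seed=0):
    """Sample the Golden Mean process: a binary process with no two 1s in a row."""
    rng = random.Random(seed)
    out, state = [], "A"
    for _ in range(n):
        if state == "A":                    # may emit 0 or 1, each with probability 1/2
            x = rng.randint(0, 1)
            state = "B" if x == 1 else "A"
        else:                               # state B must emit 0 and return to A
            x, state = 0, "A"
        out.append(x)
    return out

def block_mutual_information(xs, k):
    """Estimate I[past_k; future_k] from adjacent length-k blocks.
    As k grows, this estimate converges to the excess entropy E."""
    joint = Counter()
    for i in range(k, len(xs) - k):
        joint[(tuple(xs[i - k:i]), tuple(xs[i:i + k]))] += 1
    n = sum(joint.values())
    p_past, p_future = Counter(), Counter()
    for (p, f), c in joint.items():
        p_past[p] += c
        p_future[f] += c
    return sum(c / n * math.log2(c * n / (p_past[p] * p_future[f]))
               for (p, f), c in joint.items())

def group_pasts_by_prediction(xs, k):
    """Group length-k pasts by their (rounded) empirical next-symbol distribution:
    a finite-data stand-in for the equivalence relation over pasts."""
    cond = defaultdict(Counter)
    for i in range(k, len(xs)):
        cond[tuple(xs[i - k:i])][xs[i]] += 1
    groups = defaultdict(list)
    for past, counts in cond.items():
        n = sum(counts.values())
        groups[tuple(round(counts[s] / n, 1) for s in (0, 1))].append(past)
    return groups

xs = golden_mean_series(200_000)
for k in (1, 2, 3, 4):
    print(f"k={k}: I[past_k; future_k] ~ {block_mutual_information(xs, k):.3f} bits")
groups = group_pasts_by_prediction(xs, k=4)
print(f"predictively distinct groups of length-4 pasts: {len(groups)}")
```

For this example the grouping should recover two classes of pasts (those ending in 1 and those ending in 0), matching the two causal states of the Golden Mean process.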
We introduce an ambidextrous view of stochastic dynamical systems, comparing their forward-time and reverse-time representations and then integrating them into a single time-symmetric representation. The perspective is useful theoretically, computationally, and conceptually. Mathematically, we prove that the excess entropy, a familiar measure of organization in complex systems, is the mutual information not only between the past and future but also between the predictive and retrodictive causal states. Practically, we exploit the connection between prediction and retrodiction to directly calculate the excess entropy. Conceptually, these results lead to new measures for stochastic dynamical systems: crypticity (information accessibility) and causal irreversibility. Ultimately, we introduce a time-symmetric representation that unifies all of these quantities, compressing the two directional representations into one. The resulting compression offers a new conception of the amount of information stored in the present.
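As a numerical illustration of these identities, the sketch below (my construction, not the paper's) again uses the Golden Mean process. For this particular process the predictive and retrodictive causal states reduce to simple functions of the adjacent symbol, a shortcut specific to the example rather than a general algorithm. From the paired state sequences it estimates the excess entropy as the mutual information between predictive and retrodictive states, along with the crypticity and the causal irreversibility.

```python
import math
import random
from collections import Counter

def golden_mean_series(n, seed=1):
    """Golden Mean process: a 1 may only follow a 0."""
    rng = random.Random(seed)
    xs, prev = [], 0
    for _ in range(n):
        x = rng.randint(0, 1) if prev == 0 else 0
        xs.append(x)
        prev = x
    return xs

def entropy(counts):
    """Shannon entropy (bits) of an empirical distribution."""
    n = sum(counts.values())
    return -sum(c / n * math.log2(c / n) for c in counts.values())

def mutual_information(pairs):
    """I[A; B] (bits) from a Counter over (a, b) pairs."""
    n = sum(pairs.values())
    pa, pb = Counter(), Counter()
    for (a, b), c in pairs.items():
        pa[a] += c
        pb[b] += c
    return sum(c / n * math.log2(c * n / (pa[a] * pb[b]))
               for (a, b), c in pairs.items())

def state_after(symbol):
    """For the Golden Mean example only: the causal state is fixed by the
    most recent symbol (state 'B' after a 1, state 'A' after a 0)."""
    return "B" if symbol == 1 else "A"

xs = golden_mean_series(200_000)

# Predictive state S+_t depends on the past, here just x_{t-1};
# retrodictive state S-_t depends on the future, here just x_t.
s_plus = [state_after(xs[t - 1]) for t in range(1, len(xs))]
s_minus = [state_after(xs[t]) for t in range(1, len(xs))]

C_plus = entropy(Counter(s_plus))                       # forward statistical complexity
C_minus = entropy(Counter(s_minus))                     # reverse statistical complexity
E = mutual_information(Counter(zip(s_plus, s_minus)))   # excess entropy as I[S+; S-]

print(f"C+  ~ {C_plus:.3f} bits")
print(f"C-  ~ {C_minus:.3f} bits")
print(f"E   ~ {E:.3f} bits   (mutual information between S+ and S-)")
print(f"chi ~ {C_plus - E:.3f} bits   (crypticity: stored state information hidden from observation)")
print(f"Xi  ~ {C_plus - C_minus:.3f} bits   (causal irreversibility)")
```

For the Golden Mean process the two complexities agree, so the irreversibility estimate comes out near zero, while the crypticity is strictly positive: more information is stored in the present than the past and future share.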
Appealing to several multivariate information measures, some familiar and some new here, we analyze the information embedded in discrete-valued stochastic time series. We dissect the uncertainty of a single observation to demonstrate how the measures' asymptotic behavior sheds structural and semantic light on the generating process's internal information dynamics. The measures' scaling with the length of the time window captures both intensive components (rates of growth) and subextensive components. We provide interpretations for these components, developing explicit relationships between them. We also identify the informational component shared between the past and the future that is not contained in a single observation. The existence of this component directly motivates the notion of a process's effective (internal) states and indicates why one must build models.

Keywords: entropy, total correlation, multivariate mutual information, binding information, entropy rate, predictive information rate

A single measurement, when considered in the context of the past and the future, contains a wealth of information, including distinct kinds of information. Can the present measurement be predicted from the past? From the future? Only from both together? Or not at all? Is some of the measurement due to randomness? Does that randomness have consequences for the future, or is it simply lost? We answer all of these questions and more, giving a complete dissection of a measured bit of information.
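To illustrate the intensive and subextensive components, the sketch below (an assumption-laden stand-in, not the paper's analysis) estimates block entropies H(L) from a sampled binary series, here the Even process chosen only as an example, and reads off a finite-L entropy-rate estimate (the slope of H(L)) and a finite-L subextensive estimate (which levels off at the excess entropy).

```python
import math
import random
from collections import Counter

def even_process(n, seed=2):
    """Sample the Even process: 1s occur in even-length blocks bounded by 0s.
    Chosen only as an illustrative binary source."""
    rng = random.Random(seed)
    xs, state = [], "A"
    for _ in range(n):
        if state == "A":                    # emit 0 or 1 with equal probability
            x = rng.randint(0, 1)
            state = "B" if x == 1 else "A"
        else:                               # state B must emit a second 1
            x, state = 1, "A"
        xs.append(x)
    return xs

def block_entropy(xs, L):
    """Empirical Shannon entropy H(L) of length-L words, in bits."""
    counts = Counter(tuple(xs[i:i + L]) for i in range(len(xs) - L + 1))
    n = sum(counts.values())
    return -sum(c / n * math.log2(c / n) for c in counts.values())

xs = even_process(200_000)
H = {L: block_entropy(xs, L) for L in range(1, 9)}

# Intensive component: the entropy rate h_mu is the asymptotic slope of H(L).
# Subextensive component: H(L) - L * h_mu(L) converges to the excess entropy E.
for L in range(2, 9):
    h_L = H[L] - H[L - 1]          # finite-L entropy-rate estimate
    E_L = H[L] - L * h_L           # finite-L excess-entropy estimate
    print(f"L={L}: H(L)={H[L]:.3f}   h_mu(L)~{h_L:.3f}   E(L)~{E_L:.3f} bits")
```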
The introduction of the partial information decomposition generated a flurry of proposals for defining an intersection information that quantifies how much of "the same information" two or more random variables specify about a target random variable. As yet, none is wholly satisfactory. A palatable measure of intersection information would provide a principled way to quantify slippery concepts such as synergy. Here, we introduce an intersection information measure, based on the Gács-Körner common random variable, that is the first to satisfy the coveted target monotonicity property. Our measure is imperfect too, and we suggest directions for improvement.
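The measure is built on the Gács-Körner common random variable, which for finite alphabets can be read off from the connected components of the joint distribution's support. The sketch below implements that standard construction for a toy joint distribution in which the two variables share one perfectly correlated bit and each carries one private bit; it illustrates only the common-random-variable ingredient, not the paper's full intersection information measure, and the example distribution and names are assumptions of mine.

```python
import math
from collections import defaultdict

def gacs_korner_common_information(joint):
    """Gacs-Korner common information (bits) of a joint distribution {(x, y): p}.
    Connect x- and y-values that co-occur with positive probability; the connected
    components define the finest random variable computable from X alone and from
    Y alone, and its entropy is the common information."""
    parent = {}
    def find(a):
        parent.setdefault(a, a)
        while parent[a] != a:
            parent[a] = parent[parent[a]]   # path halving
            a = parent[a]
        return a
    for (x, y), p in joint.items():
        if p > 0:
            parent[find(("x", x))] = find(("y", y))   # union the two values
    mass = defaultdict(float)
    for (x, y), p in joint.items():
        mass[find(("x", x))] += p                     # probability of each component
    return -sum(p * math.log2(p) for p in mass.values() if p > 0)

# Toy example: X = (c, u), Y = (c, v) with c, u, v independent fair bits.
# The shared bit c is the common random variable, so the answer should be 1 bit.
joint = {((c, u), (c, v)): 1 / 8 for c in (0, 1) for u in (0, 1) for v in (0, 1)}
print(f"Gacs-Korner common information ~ {gacs_korner_common_information(joint):.3f} bits")
```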