Advances in science are being sought in newly available opportunities to collect massive quantities of data about complex systems. While key advances are being made in the detailed mapping of systems, how to relate these data to solving many of the challenges facing humanity is unclear. The questions we often wish to address require identifying the impact of interventions on the system, and that impact is not apparent in the detailed data that are available. Here we review key concepts and motivate a general framework for building larger scale views of complex systems and for characterizing the importance of information in physical, biological and social systems. We provide examples of its application to evolutionary biology, with relevance to ecology, biodiversity, pandemics, and human lifespan, and to social systems, with relevance to ethnic violence, global food prices, and stock market panic. Framing scientific inquiry as an effort to determine what is important and unimportant is a means for advancing our understanding and addressing many practical concerns, such as economic development or treating disease.
I. OVERVIEW

Changing disease to health and economic instability to growth are among the complex challenges we face today. How can we turn the massive quantities of data that are increasingly available towards addressing these pressing problems? The data provide abundant detail, but generally carry no labels for guidance about which pieces of information are important for determining successful interventions. The questions we need to address are about properties of complex systems such as human physiology and the global economy. Addressing questions about such systems requires disentangling the intricate dependencies and multiple causes and effects of behaviors, and recognizing that behaviors range across scales from microscopic to macroscopic.

Here we argue that the key to addressing these questions is to focus on the way behaviors at different scales are related, and on how dependencies within a system lead to large scale patterns of behavior that can be characterized directly without mapping all of the intricate details. The approach builds upon an understanding of how to aggregate component behaviors to identify larger scale behaviors, an approach developed in the "renormalization group" study of phase transitions in physics, and generalized here to multiscale information theory. In this framework, information itself has scale, and larger scale information is the most important information to know, with progressively finer scale information of importance only to provide detail when necessary. This analysis focuses attention on information characterizing how to affect the largest scale behaviors of the system. The method is a shortcut for the infeasible effort of mapping all of the causes and effects that extend from molecular to global scales of biological and social systems. The specific causes and effects that need to be studied are only a few compared to the many that underlie the system behavior at all scales....
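The idea of aggregating component behaviors to expose larger scale behavior can be illustrated with a toy calculation. The sketch below is not the renormalization group procedure or the multiscale information formalism itself. It is a minimal illustration, under assumed choices, in which block averaging stands in for aggregation and variance stands in for "how much behavior survives at a given scale". The function names `coarse_grain` and `variance_by_scale`, the block sizes, and the two synthetic systems are all hypothetical.

```python
import numpy as np

def coarse_grain(values, block):
    """Average consecutive groups of `block` components into one larger scale value."""
    values = np.asarray(values, dtype=float)
    n_blocks = len(values) // block
    trimmed = values[: n_blocks * block]  # drop any incomplete final block
    return trimmed.reshape(n_blocks, block).mean(axis=1)

def variance_by_scale(values, blocks=(1, 2, 4, 8, 16)):
    """Variance of the coarse-grained description at each block size (illustrative proxy)."""
    return {b: float(coarse_grain(values, b).var()) for b in blocks}

rng = np.random.default_rng(0)
n = 4096

# Assumed system 1: components that fluctuate independently of one another.
independent = rng.choice([-1.0, 1.0], size=n)

# Assumed system 2: components that move together in groups of 64,
# a crude stand-in for strong dependencies within a system.
correlated = np.repeat(rng.choice([-1.0, 1.0], size=n // 64), 64)

for name, system in (("independent", independent), ("correlated", correlated)):
    print(name, variance_by_scale(system))
```

In this toy setting, the independent components average out as the block size grows, so little remains to describe at larger scales, whereas the dependent components retain their variation under aggregation. This is only meant to convey why dependencies, rather than the sheer number of components, determine what is visible at large scales.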