The field of neuroscience is facing an unprecedented expansion in the volume and diversity of available data. Traditionally, network models have provided key insights into the structure and function of the brain. With the advent of big data in neuroscience, both more sophisticated models capable of characterizing the increasing complexity of the data and novel methods of quantitative analysis are needed. Recently, multilayer networks, a mathematical extension of traditional networks, have gained increasing popularity in neuroscience due to their ability to capture the full information of multi-modal, multi-scale, spatiotemporal data sets. Here, we review multilayer networks and their applications in neuroscience, showing how incorporating the multilayer framework into network neuroscience analysis has uncovered previously hidden features of brain networks. We specifically highlight the use of multilayer networks to model disease, structure-function relationships, and network evolution, and to link multi-scale data. Finally, we close with a discussion of promising new directions of multilayer network neuroscience research and propose a modified definition of multilayer networks designed to unite and clarify the use of the multilayer formalism in describing real-world systems.

... some other measured unit [1,2]. Edges of the network represent the strength of connection between two units and are typically chosen to measure either physical connections (structural networks) or statistical relationships between nodal dynamics (functional networks) [3]. The rich theory of networks has been successfully applied to the study of the brain by quantifying network structure through the calculation of descriptive and inferential network statistics, which expose otherwise hidden phenomena.
Measures of centrality, degree distribution, clustering, small-worldness, and more [4,5,6] have been used to study disease, task, learning, behavior, and structure [7,3,8,9,2].

The success of network theory in uncovering the complex organization of the human brain is not without limits, however, as traditional networks capture only a single mode of interaction between units. Recording technologies such as functional magnetic resonance imaging (fMRI), magnetoencephalography (MEG), and electroencephalography (EEG) capture brain dynamics across time and across multiple frequency bands, and it is important to retain the information of the full frequency spectrum [10,11,12,13,14,15] or the full temporal profile [16,17,18,19] of such recordings. In addition to measuring functional interactions through fMRI or EEG, structural recording techniques such as diffusion-weighted imaging (DWI) measure the presence and strength of physical connections between the various regions of the brain. The emergence of such increasingly large and multi-modal data sets therefore necessitates a quantitative model that is rich and flexible enough to both describe interactions between multiple scales and modalities and allow for meaningful analysis of the data to provide new and powerful ins...