How is information distributed across large neuronal populations within a given brain area? One possibility is that information is distributed roughly evenly across neurons, so that total information scales linearly with the number of recorded neurons. Alternatively, the neural code might be highly redundant, meaning that total information saturates. Here we investigated how information about the direction of a moving visual stimulus is distributed across hundreds of simultaneously recorded neurons in mouse primary visual cortex (V1). We found that information scales sublinearly, due to the presence of correlated noise in these populations. Using recent theoretical advances, we compartmentalized noise correlations into information-limiting and non-limiting components, and then extrapolated to predict how information grows for even larger neural populations. We predict that tens of thousands of neurons are required to encode 95% of the information about visual stimulus direction, a number much smaller than the number of neurons in V1. Overall, these findings suggest that the brain uses a widely distributed, but nonetheless redundant code that supports recovering most information from smaller subpopulations.

Figure 1. Information scaling in large neural populations, and the impact of noise correlations on information. a. The information that a population of neurons can encode about some stimulus value is always a non-decreasing function of the population size. Information might on average increase with every added neuron (unbounded scaling; red) if the information is evenly distributed across all neurons. In contrast, information can rapidly saturate if it is redundant, in which case it is limited not by population size but by other factors.
In general, it has only been possible to record from a very small subset of the neurons in a particular area (grey shaded), and such small samples make it hard to tell the two scenarios apart. b. The encoded information is modulated by noise correlations. This is illustrated using two neurons with different tunings to the stimulus value (top). The amount of information available to discriminate between two stimulus values (one shown in red, the other in blue) depends on the difference in mean population activity (crosses) between stimuli, and on the noise correlations (shaded ellipsoids) for either stimulus (bottom, showing the joint activity of both neurons). The information is largest when the noise is smallest in the direction of the mean population activity difference (black arrow), which leads to the largest separation across the optimal discrimination boundary (grey line). In this example, positive correlations boost information (middle), whereas negative correlations lower it (right), when compared to uncorrelated neurons (left). In general, the impact of noise correlations depends on how they interact with the population's tuning curves.
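
The geometry in panel b can be illustrated with a minimal numerical sketch. It assumes, which the caption does not state, that information is measured by the squared linear discriminability d'^2 = Δμᵀ Σ⁻¹ Δμ, where Δμ is the difference in mean population activity between the two stimuli and Σ is a noise covariance shared by both stimuli. The function name, the unit noise variance, and the correlation values are illustrative choices, not values from the recordings.

```python
import numpy as np


def discrimination_information(delta_mu, rho, sigma=1.0):
    """Squared linear discriminability d'^2 = delta_mu^T Sigma^{-1} delta_mu.

    Assumes two neurons with equal noise standard deviation `sigma` and
    noise correlation `rho`, identical for both stimuli (an assumption
    made for this sketch).
    """
    cov = sigma**2 * np.array([[1.0, rho], [rho, 1.0]])
    # Solve Sigma x = delta_mu instead of forming the inverse explicitly.
    return float(delta_mu @ np.linalg.solve(cov, delta_mu))


correlations = (0.0, 0.5, -0.5)  # uncorrelated, positive, negative

# Opposite tuning slopes: one neuron's mean rises while the other's falls,
# so the mean difference points along the anti-diagonal.
opposite = np.array([1.0, -1.0])
info_opposite = {rho: discrimination_information(opposite, rho) for rho in correlations}

# Same-sign tuning slopes: the mean difference points along the diagonal.
same = np.array([1.0, 1.0])
info_same = {rho: discrimination_information(same, rho) for rho in correlations}

for rho in correlations:
    print(f"rho={rho:+.1f}  opposite slopes: {info_opposite[rho]:.3f}  "
          f"same slopes: {info_same[rho]:.3f}")
```

Under these assumptions, opposite tuning slopes reproduce the ordering described in the caption: positive correlations raise the information above the uncorrelated case, and negative correlations lower it. With same-sign slopes the ordering reverses, which is one way to see that the effect of noise correlations depends on how they interact with the tuning curves.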