This paper provides a survey of the state-of-the-art information theoretic analysis for overlay multi-user (more than two pairs) cognitive networks and reports new capacity results. In an overlay scenario, cognitive / secondary users share the same frequency band with licensed / primary users to efficiently exploit the spectrum. They do so without degrading the performance of the incumbent users, and may even aid in transmitting the incumbents' messages, since cognitive users are assumed to possess the message(s) of primary user(s) and possibly of other cognitive user(s). The survey begins with a short overview of the two-user overlay cognitive interference channel. The evolution from two-user to three-user overlay cognitive interference channels is described next, followed by generalizations to multi-user (arbitrary number of users) cognitive networks. The rest of the paper considers K-user cognitive interference channels with different message knowledge structures at the transmitters. Novel capacity inner and outer bounds are proposed. Channel conditions under which the bounds meet, thus characterizing the information theoretic capacity of the channel, are derived for both Linear Deterministic and Gaussian channel models. The results show that, for certain channel conditions, distributed cognition (that is, a cumulative message knowledge structure at the nodes) may not be worth the overhead, as (approximately) the same capacity can be achieved by having only one global cognitive user whose role is to manage all the interference in the network. The paper concludes with future research directions.

In this paper, we survey the fundamental limits of communication for multi-user overlay cognitive networks with an arbitrary number of secondary / cognitive user(s) having non-causal message knowledge of primary user(s). The users transmit in the same frequency band and thus in general interfere with one another.
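To make the cost of sharing a band concrete, the following minimal sketch compares the interference-free rate of a single complex AWGN link, log2(1 + SNR), with the rate achievable when a receiver simply treats interference as additional Gaussian noise, log2(1 + SNR / (1 + INR)). It is an illustration only and not one of the cognitive schemes surveyed in this paper. The function names are hypothetical, and the sketch assumes unit noise power, Gaussian codebooks, and SNR and INR (interference to noise ratio) given in linear scale.

```python
import math


def awgn_rate(snr: float) -> float:
    """Capacity of a complex AWGN link in bits per channel use.

    Assumes unit noise power and a linear-scale SNR (not dB).
    """
    return math.log2(1.0 + snr)


def rate_interference_as_noise(snr: float, inr: float) -> float:
    """Rate achievable when interference is treated as extra Gaussian noise.

    Illustrative baseline only; snr and inr are linear-scale ratios
    relative to unit noise power.
    """
    return math.log2(1.0 + snr / (1.0 + inr))


if __name__ == "__main__":
    snr = 100.0  # 20 dB, an arbitrary example value
    print(f"no interference: {awgn_rate(snr):.3f} bits/use")
    for inr in (1.0, 10.0, 100.0):
        rate = rate_interference_as_noise(snr, inr)
        print(f"INR = {inr:6.1f}: {rate:.3f} bits/use")
```

With the SNR fixed, the rate under this naive strategy shrinks as the INR grows. Overlay cognition aims to avoid exactly this loss by exploiting message knowledge at the cognitive transmitters.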
The performance metric considered is the information theoretic notion of channel capacity. In other words, we are interested in the maximum rate of communication for which an arbitrarily small probability of error can be achieved by every user, which may be seen as a benchmark when building practical systems. We will focus on results for general memoryless and practically relevant additive white Gaussian noise (AWGN) [2] channel models, as well as high signal-to-noise ratio (SNR) approximations of Gaussian [...]

[...] is currently an Associate Editor for the IEEE Transactions on Cognitive Communications and Networking. Her research focuses on multi-user information theory and applications to cognitive and software-defined radio, radar, relay and two-way communication networks.