The Hessian of a neural network captures parameter interactions through second-order derivatives of the loss. It is a fundamental object of study, closely tied to various problems in deep learning, including model design, optimization, and generalization. Most prior work has been empirical, typically focusing on low-rank approximations and heuristics that are blind to the network structure. In contrast, we develop theoretical tools to analyze the range of the Hessian map, providing us with a precise understanding of its rank deficiency as well as the structural reasons behind it. This yields exact formulas and tight upper bounds for the Hessian rank of deep linear networks, allowing for an elegant interpretation in terms of rank deficiency. Moreover, we demonstrate that our bounds remain faithful as an estimate of the numerical Hessian rank for a larger class of models, such as rectified and hyperbolic tangent networks. Further, we investigate the implications of model architecture (e.g. width, depth, bias) on the rank deficiency. Overall, our work provides novel insights into the source and extent of redundancy in overparameterized networks.

* A detailed list of contributions is as follows: Sidak first discovered that the Hessian rank formula, in an early form, holds experimentally to high fidelity, thus kick-starting the project. Sidak came up with the proof technique and proved Theorem 3, Theorem 5, Theorem 9, and Theorem 12. Sidak wrote essentially the entire paper and noted the rank-deficiency interpretation. Gregor proved Lemma 8, assisted in a part of Theorem 3, and empirically observed the eventual formula for the Hessian rank. Gregor ran essentially all the experiments for the final submission.
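To make the notion of a "numerical Hessian rank" concrete, the following is a minimal sketch (not the authors' code) of how one might estimate it for a small deep linear network: build the loss as a function of the flattened parameters, form the full Hessian with automatic differentiation, and count singular values above a tolerance. The layer widths, synthetic data, squared loss, and tolerance rule are illustrative assumptions, not taken from the paper.

```python
import torch

torch.manual_seed(0)

# Deep linear network x -> W3 W2 W1 x with illustrative widths (no biases).
dims = [5, 4, 3, 2]                                   # input, hidden, hidden, output
shapes = [(dims[i + 1], dims[i]) for i in range(len(dims) - 1)]
sizes = [r * c for r, c in shapes]

X = torch.randn(20, dims[0])                          # synthetic inputs
Y = torch.randn(20, dims[-1])                         # synthetic targets

def loss(theta):
    # Unflatten the parameter vector into the layer matrices W1, W2, W3.
    Ws, offset = [], 0
    for (r, c), n in zip(shapes, sizes):
        Ws.append(theta[offset:offset + n].reshape(r, c))
        offset += n
    out = X
    for W in Ws:                                      # purely linear layers
        out = out @ W.T
    return 0.5 * ((out - Y) ** 2).mean()              # squared loss

theta0 = torch.randn(sum(sizes))
H = torch.autograd.functional.hessian(loss, theta0)   # full (n x n) Hessian

# Numerical rank: number of singular values above a relative tolerance.
svals = torch.linalg.svdvals(H)
tol = svals.max() * len(theta0) * torch.finfo(svals.dtype).eps
print("parameters:", len(theta0), "numerical Hessian rank:", int((svals > tol).sum()))
```

Run on a network like this, the counted rank typically falls well below the parameter count, which is the rank deficiency the abstract refers to; the exact gap predicted by the paper's formulas depends on the widths and depth.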