As we move through the world, we see the same visual scenes from different perspectives. Although we experience perspective deformations, our perception of a scene remains stable. This raises the question of which neuronal representations in visual brain areas are perspective-tuned and which are invariant. Focusing on planar rotations, we introduce a mathematical framework based on the principle of equivariance, which asserts that an image rotation results in a corresponding rotation of neuronal representations, to explain how the same representation can range from being fully tuned to fully invariant. We applied this framework to large-scale simultaneous neuronal recordings from four visual cortical areas in mice, where we found that representations are both tuned and invariant but become more invariant across higher-order areas. While common deep convolutional neural networks show similar trends in orientation-invariance across layers, they are not rotation-equivariant. We propose that equivariance is a prevalent computation of populations of biological neurons to gradually achieve invariance through structured tuning.