Based on the principle of efficient coding, we present a theoretical framework for categorizing the basic types of change that can occur in a spatio-temporal signal. First, theoretical results for the problem of estimating multiple transparent motions are reviewed. Then, confidence measures for the presence of multiple motions are used to derive a basic alphabet of local signal variation that includes motion layers. To better understand and visualize this alphabet, a representation of motions in the projective plane is used. A further, practical contribution is an interactive tool for generating multiple-motion patterns and displaying them in various apertures. Our framework can explain some well-known results on coherent motion and a few more complex perceptual phenomena, such as the 2D-1D entrainment effect, but the focus of this paper is on the methods. Our working hypothesis is that efficient representations can be obtained by suppressing all the redundancies that arise when the visual input does not change in a particular direction or set of directions. Finally, we assume that human eye movements tend to avoid the redundant parts of the visual input, and we report results in which our framework has been used to obtain very good predictions of eye movements made on overlaid natural videos.