Visual motion perception underpins behaviours ranging from navigation to depth perception and grasping. Our limited access to biological systems constrains our understanding of how motion is processed within the brain. Here we explore properties of motion perception in biological systems by training a neural network (‘MotionNetxy’) to estimate the velocity of image sequences. The network recapitulates key characteristics of motion processing in biological brains, and we use our complete access to its structure to explore and understand motion (mis)perception at the computational, neural, and perceptual levels. First, we find that the network recapitulates the biological response to reverse-phi motion in terms of direction. We further find that it overestimates the speed of slow reverse-phi motion while underestimating the speed of fast reverse-phi motion because of the correlation between reverse-phi motion and the spatiotemporal receptive fields tuned to motion in opposite directions. Second, we find that the distribution of spatiotemporal tuning properties in the V1 and MT layers of the network is similar to that observed in biological systems. We then show that, compared to MT units tuned to fast speeds, those tuned to slow speeds primarily receive input from V1 units tuned to high spatial frequency and low temporal frequency. Third, we find a positive correlation between the pattern-motion and speed selectivity of MT units. Finally, we show that the network captures human underestimation of low coherence motion stimuli, and that this is due to pooling of noise and signal motion. These findings provide biologically plausible explanations for well-known phenomena and produce concrete predictions for future psychophysical and neurophysiological experiments.