The rich variety of behaviors observed in animals arises through the complex interplay between sensory processing and motor control [1, 2, 3, 4, 5]. To understand these sensorimotor transformations, it is useful to build models that predict not only neural responses to sensory input [6, 7, 8, 9, 10] but also how each neuron causally contributes to behavior [11, 12]. Here we demonstrate a novel modeling approach to identify a one-to-one mapping between internal units in a deep neural network and real neurons by predicting the behavioral changes arising from systematic perturbations of more than a dozen neuron types. A key ingredient we introduce is “knockout training”, which involves perturbing the network during training to match the perturbations of the real neurons during behavioral experiments. We apply this approach to model the sensorimotor transformation of Drosophila melanogaster males during a complex, visually guided social behavior [13, 14, 15, 16]. Contrary to prevailing views [17, 18, 19], our model suggests that visual projection neurons at the interface between the eye and brain form a distributed population code that collectively sculpts social behavior. Overall, our framework consolidates behavioral effects elicited from various neural perturbations into a single, unified model, providing a detailed map from stimulus to neuron to behavior.
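The core of "knockout training" is simple: the same silencing applied to real neurons in the behavioral experiments is applied to the matched model units during training, so the network must reproduce behavior under both intact and perturbed conditions. The sketch below is a minimal toy illustration of that idea, not the paper's actual architecture (which is a deep network trained on visual stimuli); the network sizes, function names (`forward`, `knockout_training_step`), and toy targets are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy feedforward model: 4-d stimulus -> 8 "neuron-like" hidden units -> 1-d behavior.
W1 = rng.normal(size=(8, 4)) * 0.1
W2 = rng.normal(size=(1, 8)) * 0.1

def forward(x, knockout=None):
    """Forward pass; `knockout` silences one hidden unit, mimicking
    the experimental perturbation of the corresponding real neuron."""
    h = np.tanh(W1 @ x)
    if knockout is not None:
        h = h.copy()
        h[knockout] = 0.0          # silence the matched model unit
    return W2 @ h, h

def knockout_training_step(x, y_intact, y_per_knockout, lr=0.05):
    """One gradient step fitting both the intact behavior and the
    behavior measured under each single-unit knockout."""
    global W1, W2
    conditions = [(None, y_intact)] + [
        (k, y_per_knockout[k]) for k in range(W1.shape[0])
    ]
    for ko, y_target in conditions:
        y_hat, h = forward(x, knockout=ko)
        err = y_hat - y_target                 # prediction error on this condition
        gW2 = np.outer(err, h)                 # gradient w.r.t. output weights
        gh = (W2.T @ err) * (1 - h ** 2)       # backprop through tanh
        if ko is not None:
            gh[ko] = 0.0                       # no gradient flows through a silenced unit
        gW1 = np.outer(gh, x)
        W2 -= lr * gW2
        W1 -= lr * gW1
```

Because each training condition pairs a specific unit knockout with the behavior observed under the corresponding neural silencing, the fitted units acquire a one-to-one correspondence with the perturbed neuron types rather than an arbitrary internal code.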