The split-attention effect concerns learning from related representations in multimedia materials. Spatial proximity and integration of these representations are crucial for learning processes. However, the influence of varying degrees of proximity between related and unrelated information on learning has not yet been specified. In two experiments (N₁ = 98; N₂ = 85), spatial proximity between a pictorial presentation and text labels was manipulated (high vs. medium vs. low). Additionally, Experiment 1 included a control group in which picture and text were presented separately. The results revealed a significant effect of spatial proximity on learning performance. In contrast to previous studies, the medium-proximity condition led to the highest transfer score and, in Experiment 2, the highest retention score. These results are interpreted in terms of cognitive load and instructional efficiency. In Experiment 1, transfer efficiency was highest at a medium distance between representations. Implications for the spatial contiguity principle and spatial contiguity failure are discussed.