As we act on the world around us, our eyes seek out objects we plan to interact with. A growing body of evidence suggests that overt visual attention selects objects in the environment that could be interacted with, even when the task precludes physical interaction. Our previous work showed that objects that afford grasping interactions influenced attention when static scenes depicted reachable spaces, and that attention was otherwise better explained by general meaning (Rehrig, Peacock, et al., 2021). Because grasping is but one of many object interactions, our previous work may have downplayed the influence of object affordances on attention. The current study investigated the relationship between overt visual attention and object affordances versus broadly construed semantic information in scenes as speakers described possible actions. In addition to meaning and grasp maps (which capture informativeness and grasping object affordances in scenes, respectively), we introduce interact maps, which capture affordances more broadly. In a mixed-effects analysis of 3 eyetracking experiments, interact map values predicted fixated regions in all experiments, whereas there was no main effect of meaning, and grasp maps marginally predicted fixated locations only for scenes that depicted reachable spaces. Our findings suggest that speakers consistently allocate attention to scene regions that could be readily interacted with when describing the possible actions in a scene, while the other variants of semantic information tested (graspability and general meaning) have a compensatory or additive influence on attention. The current study clarifies the importance of object affordances in guiding visual attention in scenes.