In everyday scenes, searched-for targets do not appear in isolation but are embedded within configurations of non-target, or distractor, items. If the position of the target relative to the distractors is invariant, such spatial contingencies are implicitly learned and come to guide visual scanning ("contextual cueing"). However, the effectiveness of contextual cueing depends heavily on the consistency between bottom-up perceptual input and context memory: following configural learning, relocating targets to an unexpected location within an otherwise unchanged distractor context completely abolishes contextual cueing, and the gains derived from the invariant context recover only very slowly with increasing exposure to the changed displays. The current study varied the local target context, specifically the item density, to investigate how this factor relates to contextual adaptation. The results showed that learned contextual cues are adapted quickly when the target is repositioned to a sparse local distractor context (consisting of one neighbouring non-target item), whereas no adaptation occurred with a dense context (three surrounding non-targets). This suggests that contextual adaptation is modulated by spatial factors and is not per se limited by order effects in the learning process.