At the interface between scene perception and speech production, we investigated how rapidly action scenes can activate semantic and lexical information. Experiment 1 examined how complex action-scene primes, presented for 150 ms, 100 ms, or 50 ms and subsequently masked, influenced the speed with which immediately following action-picture targets were named. Prime and target scenes were identical, showed the same action with different actors and environments, or were unrelated. Relative to unrelated primes, identical and same-action primes facilitated naming of the target action, even when presented for only 50 ms. In Experiment 2, neutral primes were included to assess the direction of these effects. Identical and same-action scenes induced facilitation, whereas unrelated actions induced interference. In Experiment 3, written verbs preceded by action primes served as naming targets. When target verbs denoted the prime action, clear facilitation was obtained. In contrast, interference was observed when target verbs were phonologically similar, but otherwise unrelated, to the names of prime actions. This is clear evidence for word-form activation by masked action scenes. Masked action pictures thus provide conceptual information that is detailed enough to facilitate apprehension and naming of immediately following scenes. Masked actions even activate their word-form information, as is evident when targets are words. We thus show how language production can be primed with briefly flashed masked action scenes, in answer to long-standing questions in scene processing.