An earlier experiment (Meyer, Sleiderink, & Levelt, 1998) had shown that speakers naming object pairs usually inspected the objects in the required order of mention (left object first) and that the viewing time for the left object depended on the word frequency of its name. In the present experiment, object pairs were presented simultaneously with auditory distractor words that could be phonologically related or unrelated to the name of the object to be named first. The speech onset latencies and the viewing times for that object were shorter after related distractors than after unrelated distractors. Since this phonological priming effect, like the word frequency effect, most likely arises during word-form retrieval, we conclude that the shift of gaze from the first to the second object is initiated after the word form of the first object's name has been accessed.
In studies of language production, speakers often name single objects in one-word utterances (e.g., cross or ball). On the basis of the results of such studies, detailed models of object naming have been proposed (e.g., Glaser, 1992; Humphreys, Lamote, & Lloyd-Jones, 1995; Humphreys, Riddoch, & Quinlan, 1988; Levelt, Roelofs, & Meyer, 1999). Though adult speakers sometimes produce one-word utterances, they often (perhaps more often) say sentences in which they refer to several concepts and express their relationships. In order to fluently produce such utterances, speakers must select the concepts to be mentioned and the corresponding words in close temporal succession. The issue addressed in the present paper is how the planning processes for the words of an utterance are coordinated with each other in time.
Before turning to the coordination of the planning processes, we will outline which processes take place when a speaker names a single object.
Our working model of object naming (Levelt et al., 1999) distinguishes between the visual-conceptual processes involved in object recognition and the following lexical access processes. Visual-conceptual processing comprises two steps. First, a percept is computed from the visual image. A percept is an integrated representation of the visual properties of the object, such as its shape, size, color, and current orientation. Second, an appropriate lexical concept is accessed. Lexical concepts can be viewed as nodes in a semantic network. Labeled connections (e.g., "is-a," "has-a") express their relationships (Roelofs, 1992). Lexical concepts differ from other concepts in that they have links to entries in the mental lexicon.
The authors thank Herbert Baumann, John Nagengast, and Johan Weustink for technical support and Andrew Ellis, David Irwin, Pim Levelt, Janice Murray, and an anonymous reviewer for helpful comments on the manuscript. Correspondence should be addressed to A. S. Meyer, School of Psychology, University of Birmingham, Edgbaston, Birmingham B15 2TT, England (e-mail: a.s.meyer@bham.ac.uk).
Lexical access also comprises two main steps. The first step is the selection of a syntactic word unit, a lemma. The...