A core operation in speech production is the preparation of words from a semantic base. The theory of lexical access reviewed in this article covers a sequence of processing stages beginning with the speaker's focusing on a target concept and ending with the initiation of articulation. The initial stages of preparation are concerned with lexical selection, that is, zooming in on the appropriate lexical item in the mental lexicon. The following stages concern form encoding, i.e., retrieving a word's morphemic phonological codes, syllabifying the word, and accessing the corresponding articulatory gestures. The theory is based on chronometric measurements of spoken word production, obtained, for instance, in picture-naming tasks. The theory is largely computationally implemented. It provides a handle on the analysis of multiword utterance production as well as a guide to the analysis and design of neuroimaging studies of spoken utterance production.

The human ability to speak is universal. All normal children acquire the language of their environment at a very early age. Most start babbling at the age of 7 months, produce a few meaningful words around their first birthday, reach a 50-word vocabulary 6 months later, produce their first multiword utterances by the end of their second year of life, and begin expressing syntactic relations by means of prepositions, auxiliaries, inflections, and word order in the course of their third year. By the age of 5 or 6, the basic architecture of this natural skill is essentially in place. Although our ability to speak has for millennia been recognized as uniquely human, as species-specific, as the basis of our cultural evolution, and generally as a core aspect of the human condition (homo loquens), the systematic study of how we speak did not begin until the end of the 19th century. In 1900, Wilhelm Wundt (1) published his theory about how a sentence emerges in the speaker's mind, a theory entirely based on introspection.
With their 1896 monograph, Meringer and Mayer (2) initiated an important empirical paradigm. They collected and analyzed a large corpus of spontaneously produced speech errors that they had carefully noted down. One of their findings was that word substitutions were either meaning-based [e.g., Ihre (your) for meine (mine)] or form-based [e.g., Studien (studies) for Stunden (hours)], suggesting a distinction between meaning- and form-based operations in word generation. Only in the 1970s was this paradigm fully exploited to construct theories of utterance generation (see ref. 3 for a review).