One of the most important predictors of word processing times is the frequency with which words have been encountered. In large-scale studies, word frequency (WF) reliably explains the largest percentage of variance of any predictor of word processing times (e.g., Baayen, Feldman, & Schreuder, 2006; Balota, Cortese, Sergent-Marshall, Spieler, & Yap, 2004; Yap & Balota, 2009). Therefore, psycholinguists have invested time in the collection of WF measures. The first list of word frequencies widely used in language research was published for English by Thorndike and Lorge (1944; see Bontrager, 1991, for a review of older frequency lists, including German ones). Its main motivation was educational (helping teachers decide which words should be taught to pupils). A few decades later, Kučera and Francis (1967; KF) published a list (also for American English) that would become the frequency measure of choice for language researchers up to the present (Brysbaert & New, 2009).

For the Dutch language, van Berckel, Brandt Corstius, Mokken, and van Wijngaarden (1965) collected word frequencies based on a newspaper corpus of about 50,000 words. Although this list contained additional statistical information about the Dutch language, such as n-gram sequences of up to three letters, it did not gain wide adoption. The first publicly available frequency list for Dutch was edited by Uit den Boogaart (1975), who published frequencies of "written and spoken Dutch" based on a corpus of 605,733 words from written sources and 121,569 words from spoken sources. This book was superseded in 1993, when the Centre for Lexical Information (CELEX) published frequencies based on a 42-million-word corpus of written texts collected by the Institute for Dutch Lexicology (Baayen, Piepenbrock, & van Rijn, 1993). In addition to the frequencies of the different word forms (e.g., play, plays), the CELEX database also contained the frequencies of the words as different parts of speech (play as a noun vs.
play as a verb) and the frequencies of the headwords or lemmas (e.g., the frequency of the nominal lemma play is the summed frequency of the word form play as a noun and the word form plays as a noun). Since its publication, CELEX has been the primary source of word frequencies and other lexical information for the Dutch language.

For a long time, face validity was the main factor in assessing the quality of a frequency measure for research in word recognition. Two criteria were of importance: the representativeness of the sources and the size of the corpus. On both criteria, CELEX scored well. Special care had been taken to select texts from a wide variety of documents produced by the Dutch-speaking community, and the size of the corpus was larger than what was available in most other languages. However, in the past 2 years, researchers have started to measure the validity of word frequencies for research into word recognition processes by correlating them with word processing times for thousands of words. This research has revealed considerable qu...