This article introduces EsPal, a Web-accessible repository containing a comprehensive set of properties of Spanish words. EsPal is based on an extensible set of data sources, beginning with a 300-million-token written database and a 460-million-token subtitle database. Properties available include word frequency, orthographic structure and neighborhoods, phonological structure and neighborhoods, and subjective ratings such as imageability. Subword structure properties are also available in terms of bigrams and trigrams, biphones, and bisyllables. Lemma and part-of-speech information and their corresponding frequencies are also indexed. The website enables users either to upload a set of words and receive their properties or to retrieve a set of words matching constraints on those properties. The set of properties is easily extensible, and new properties will be added as they become available. EsPal is freely available at http://www.bcbl.eu/databases/espal/.
Keywords: Word frequency · Subtitles · Word recognition · Corpus linguistics · Psycholinguistics

Researchers from a wide range of disciplines (e.g., neuroscience, artificial intelligence, psychology, linguistics, and education) who work in the interdisciplinary area of language research (e.g., language acquisition, language processing, language learning, bilingualism, and computational linguistics) need quick and efficient access to information about specific properties of words. For example, word frequency is a dominant factor in accounting for visual word recognition speed as measured by lexical decision times (Forster & Chambers, 1973; Monsell, 1991) and eye fixation durations during reading (Rayner, 2009). Unsurprisingly, reading behavior as measured by tasks such as lexical decision, naming, and fixation times is affected by a wide range of other properties of words, including orthographic neighborhood