In an auditory lexical decision experiment, 5541 spoken content words and pseudowords were presented to 20 native speakers of Dutch. The words vary in phonological make-up and in number of syllables and stress pattern, and are further representative of the native Dutch vocabulary in that most are morphologically complex, comprising two stems or one stem plus derivational and inflectional suffixes, with inflections representing both regular and irregular paradigms; the pseudowords were matched in these respects to the real words. The BALDEY ("biggest auditory lexical decision experiment yet") data file includes response times and accuracy rates, with for each item morphological information plus phonological and acoustic information derived from automatic phonemic segmentation of the stimuli. Two initial analyses illustrate how this data set can be used. First, we discuss several measures of the point at which a word has no further neighbours and compare the degree to which each measure predicts our lexical decision response outcomes. Second, we investigate how well four different measures of frequency of occurrence (from written corpora, spoken corpora, subtitles, and frequency ratings by 75 participants) predict the same outcomes. These analyses motivate general conclusions about the auditory lexical decision task. The (publicly available) BALDEY database lends itself to many further analyses.