Multiword units (MWUs) are word combinations that sit within the continuum of formulaic language. Many experimental studies have focused on the online processing of MWUs by native and non-native speakers, and on the processing of idioms in particular. However, some studies mix several MWU subtypes, while others define the same subtypes differently. For results from MWU studies to be useful to theories of language processing, storage and access, clearer classifications of MWU subtypes are needed. This study aims to empirically validate MWU categories as described by certain phraseologists in the European tradition. It does so using MWUs from the British National Corpus, drawn from across the continuum from frequent to infrequent occurrence and co-occurrence. In this paper I describe empirical findings, based on corpus-based measures and human ratings, that may validate the classification of MWUs into restricted collocations, idioms, and lexical bundles.