The Implicit Association Test is a paradigm designed to assess individual differences in implicit cognition. The goal of this report was to examine the reasons for discrepant effect magnitudes obtained with two presumably interchangeable versions: the Picture-IAT (P-IAT) and the Word-IAT (W-IAT). We show that this discrepancy is due to the relation between stimuli and referent category, specifically the level of representation (LR) at which a stimulus represents an intended category. Experiment 1 replicates the discrepancies found in previous research. Experiments 2-4 show that increasing the LR of stimuli increases the IAT effect. LR affects the magnitude of the IAT effect even when modality and other features of the stimuli are kept constant. The utility of LR for future investigations examining the IAT paradigm is discussed. Copyright © 2009 John Wiley & Sons, Ltd.
The Implicit Association Test (IAT; Greenwald et al., 1998; see also Devine, 2001) is a reaction time (RT) method developed as an indirect measure of individual differences in the strengths of associations among concepts (e.g., categories and attributes; for reviews, see Hoffman, Gawronski, Gschwendner, Le, & Schmitt, 2005; Nosek, Greenwald, & Banaji, 2006). Critical concerns regarding this paradigm have been partially addressed (e.g., Dasgupta, McGhee, Greenwald, & Banaji, 2000; Greenwald & Nosek, 2001; Nosek et al., 2006), yet the IAT still shows serious limitations that require further elaboration (e.g., Blanton, Jaccard, Gonzales, & Christie, 2006; De Houwer & Moors, 2007; Klauer, Voss, Schmitz, & Teige-Mocigemba, 2007; Wentura & Rothermund, 2007).
The present paper, in line with other research (e.g., Bluemke & Friese, 2006; De Houwer, 2002; Govan & Williams, 2004; Steffens & Jelenec, 2007), focuses on the modulating effect that features of the stimuli have on the IAT effect. In this paper, we take a novel look at the relation between the category labels and the stimuli to be classified.
In the development of IATs, stimulus sets have been selected to avoid the influence of known confounding variables such as social desirability (e.g., Greenwald et al., 1998; Ottaway, Hayden, & Oakes, 2001). Some authors suggest that items should be selected to instantiate the concepts of interest as closely as possible, so that participants think of the label in the desired way (e.g., De Houwer, 2002; Govan & Williams, 2004; Olson & Fazio, 2003). Nevertheless, different versions have been developed to assess the same construct with little work to verify convergent validity across tests, on the implicit assumption that the versions are interchangeable (i.e., that they measure the same underlying construct). For instance, the Ethnic IAT (developed to measure the mental associations individuals hold about White and Black Americans) exists as a picture version (Picture-IAT) and a word version (Word-IAT). Both versions present the same text labels for each category but use different stimulus sets for the relevant category (pictures of individuals and stereotypical first names printed in text, respectively).