In the recent methodological literature, various models have been proposed to account for the phenomenon that reversed items (defined as items for which respondents' scores have to be recoded in order to make the direction of keying consistent across all items) tend to lead to problematic responses. In this paper we propose an integrative conceptualization of three important sources of reversed item method bias (acquiescence, careless responding, and confirmation bias) and specify a multi-sample confirmatory factor analysis model with two method factors to empirically test the hypothesized mechanisms, using explicit measures of acquiescence and carelessness and experimentally manipulated versions of a questionnaire that varies three item arrangements and the keying direction of the first item measuring the focal construct. We explain the mechanisms, review prior attempts to model reversed item bias, present our new model, and apply it to responses to a four-item self-esteem scale (N = 306) and the six-item Revised Life Orientation Test (N = 595). Based on the literature review and the empirical results, we formulate recommendations on how to use reversed items in questionnaires.

Key words: reverse-keyed items, method effects, response styles, survey research, structural equation modeling.

Reverse-keyed items are items for which respondents' scores have to be recoded (i.e., reflected about the midpoint of the rating scale) in order for all the items in a multi-item scale to have the same directional relationship with the underlying construct of interest.
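As a minimal illustration of the recoding described above, a rating can be reflected about the scale midpoint by subtracting it from the sum of the two scale endpoints. The sketch below assumes integer ratings on a bounded scale; the function name, the 1-to-5 default scale, and the example responses are hypothetical and are not taken from the studies reported here.

```python
def reverse_score(score, scale_min=1, scale_max=5):
    """Reflect a rating about the midpoint of the rating scale.

    scale_min and scale_max are assumed scale endpoints (illustrative only).
    """
    if not scale_min <= score <= scale_max:
        raise ValueError("score lies outside the rating scale")
    # Reflection about the midpoint (scale_min + scale_max) / 2
    return scale_min + scale_max - score


# Hypothetical responses to a four-item scale; items 2 and 4 are reverse-keyed.
responses = [5, 4, 2, 1]
is_reversed = [False, True, False, True]

# Recode only the reversed items so that all items share one keying direction.
recoded = [
    reverse_score(r) if rev else r
    for r, rev in zip(responses, is_reversed)
]
```

After recoding, a high score on every item indicates a high standing on the construct, and the scale midpoint is left unchanged.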
The use of reverse-keyed items (also called oppositely-keyed, reversed-polarity, reverse-worded, negatively worded, negatively-keyed, keyed-false, or simply reversed items) is sometimes recommended to disrupt non-substantive responding and to enable the detection and control of aberrant response behavior when it occurs (e.g., Nunnally, 1978, Chapter 15; Paulhus, 1991). However, research has shown that reversed items often lead to problems, particularly poor model fit of factor models (e.g., Marsh, 1986). In some cases, the problem is not simply that the model based on the originally hypothesized substantive factor structure is found to be inadequate, but that the lack of fit stimulates the revision of a more parsimonious conceptualization and the specification of additional substantive factors. For example, a unidimensional model in which high and low self-esteem are considered to be opposite poles of a single underlying continuum might be rejected in favor of a model in which separate, correlated factors are posited for positive and negative self-esteem corresponding to regular and reversed items (cf. Horan, DiStefano, & Motl, 2003; Motl & DiStefano, 2002). A variety of models have been proposed to take into account differences in responding to regular and reverse-keyed items and to avoid the mistaken specification of additional substantive factors, including models with method factors or correlated uniquenesses for either the regular or the reversed items ...