Automated and statistical methods for estimating latent political traits and classes from textual data hold great promise, since virtually every political act involves the production of text. Statistical models of natural language features, however, are heavily laden with unrealistic assumptions about the process that generates these data, including the stochastic process of text generation, the functional link between political variables and observed text, and the nature of the variables (and dimensions) on which observed text should be conditioned. While acknowledging statistical models of latent traits to be "wrong", political scientists nonetheless treat their results as sufficiently valid to be useful. In this paper, we address the issue of substantive validity in the face of potential model failure, in the context of unsupervised scaling methods for latent traits. We critically examine one popular parametric measurement model of latent traits for text and then compare its results to systematic human judgments of the texts as a benchmark for validity.

* This research was supported by the European Research Council grant ERC-2011-StG 283794-QUANTESS. Will Lowe is at MZES, University of Mannheim, and Kenneth Benoit is at the Department of Methodology, London School of Economics, and the Department of Political Science, Trinity College Dublin.
A vast amount of effort in political science focuses on estimating characteristics of political actors (parties, legislators, candidates, voters, and so on) that may be estimated, but never directly observed. Whether we call them "ideal points", policy preferences, topics, or issue emphases, these latent traits and latent classes are not only fundamentally unobservable, but also exist in a dimensional space that is fundamentally unknowable. This has hardly prevented political researchers from attempting to identify and estimate such quantities, however, and a variety of such methods are widely used. Many, such as the analysis of roll call votes, suffer from problems of data censorship and selection that produce biased estimates of the quantities desired. Not all actors vote, co-sponsor bills, or return our questionnaires, but there is one activity that always accompanies political action: speech. This simple fact, coupled with a revolution in the availability of vast quantities of recorded text and speech, has spurred the development of a wide range of methods for analyzing textual data, most of which are surveyed in Grimmer and Stewart (Forthcoming).

Every statistical model applied to data, textual or otherwise, requires assumptions. As Grimmer and Stewart (Forthcoming) point out, such assumptions are always wrong. For textual data, these assumptions concern: the process that generates the observed textual data, including the stochastic process of text generation; the functional model linking political variables of interest and observed text; and the nature of the variables (and dimensions) on which observed text should be conditioned. The reality is that the wh...