Political scientists in general and public law specialists in particular have only recently begun to exploit text classification using machine learning techniques to enable the reliable and detailed content analysis of political and legal documents on a large scale. This article provides an overview and assessment of this methodology. We describe the basics of text classification, suggest applications of the technique to enhance empirical legal research (and political science more broadly), and report results of experiments designed to test the strengths and weaknesses of alternative approaches for classifying the positions and interpreting the content of advocacy briefs submitted to the U.S. Supreme Court. We find that the Wordscores method (introduced by Laver et al. 2003) and various models using a Naïve Bayes classifier perform well at classifying the ideological direction of amicus curiae briefs submitted in the Bakke (1978) and Bollinger (2003) affirmative action cases. We also find that automated feature selection techniques can enable the detection of disparate issue conceptualizations by opposing sides in a single case, and facilitate analysis of relative linguistic “reliance” and “dominance” over time. We conclude by discussing the implications of our results and pointing to areas where technical and infrastructure improvements are most needed.
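To make the Naïve Bayes approach mentioned above concrete, the following is a minimal sketch of a multinomial Naïve Bayes text classifier with add-one (Laplace) smoothing. It is an illustration only and not the authors' implementation: the function names (`tokenize`, `train`, `classify`), the labels `"pro"` and `"con"`, and the four toy training snippets are all hypothetical and are not drawn from the actual briefs.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately crude tokenizer)."""
    return text.lower().split()


def train(docs):
    """Count documents per class and word occurrences per class.

    docs is a list of (text, label) pairs.
    Returns (class_counts, word_counts, vocab).
    """
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in docs:
        class_counts[label] += 1
        for tok in tokenize(text):
            word_counts[label][tok] += 1
            vocab.add(tok)
    return class_counts, word_counts, vocab


def classify(text, model):
    """Return the label with the highest log posterior score."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in class_counts:
        # Log prior: share of training documents carrying this label.
        score = math.log(class_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for tok in tokenize(text):
            if tok not in vocab:
                continue  # ignore words never seen in training
            # Add-one smoothing so unseen class/word pairs keep nonzero mass.
            score += math.log(
                (word_counts[label][tok] + 1) / (total_words + len(vocab))
            )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy data, invented purely for illustration.
TOY_BRIEFS = [
    ("diversity is a compelling interest in admissions", "pro"),
    ("race conscious admissions promote diversity", "pro"),
    ("racial quotas violate equal protection", "con"),
    ("preferences discriminate and violate equal protection", "con"),
]

model = train(TOY_BRIEFS)
print(classify("diversity in admissions", model))
print(classify("quotas violate equal protection", model))
```

A real application would train on full brief texts with a held-out test set; the per-class word counts in such a model are also what make it possible to inspect which terms each side relies on most heavily.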
Using digital text analysis methods, we analyze over 3800 newspaper articles covering U.S. Supreme Court judicial appointments between 1981 and 2009 to measure the level of (in)congruence between the Court's agenda and the issues emphasized by the media. We find that newspapers highlight "culture war" issues at the expense of other important issues addressed much more frequently by the Court. Moreover, abortion in particular receives attention in more articles and in much greater depth than any other issue. With a few minor deviations, these patterns are consistent across nominations. These findings raise normative concerns regarding the didactic function of the print media in American democracy and shed empirical light on positive theories of media behavior.