Automatic content analysis (ACA) is a technique for coding messages with the help of computer algorithms. Unlike computer-aided content analysis, ACA is defined as any method in which the actual coding decision, that is, the assignment of codes to documents or to single textual or audiovisual elements, does not require human judgment and is therefore performed automatically. Since ACA relies on the computing capabilities of machines rather than on human coders, it can be applied to very large documents. Moreover, automatic coding is highly reliable: any analysis can be reproduced exactly given the same material and software, and all errors are deterministic, that is, they stem from misspecification or program errors.
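The defining property described above, a coding decision made without human judgment and reproducible exactly, can be illustrated with a minimal sketch of a dictionary-based coder. This is only one simple form of automatic coding, not a method prescribed by this entry; the category names, the term lists, and the function names `tokenize`, `code_document`, and `assign_code` are hypothetical and chosen purely for illustration.

```python
import re
from collections import Counter

# Hypothetical coding dictionary: category -> indicator terms (illustrative only).
CATEGORY_TERMS = {
    "economy": {"tax", "budget", "inflation"},
    "health": {"hospital", "vaccine", "doctor"},
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def code_document(text):
    """Count, per category, how many tokens match that category's terms."""
    counts = Counter(tokenize(text))
    return {
        category: sum(counts[term] for term in terms)
        for category, terms in CATEGORY_TERMS.items()
    }


def assign_code(text):
    """Return the highest-scoring category, or None if no term matches.

    Ties are broken alphabetically, so the decision is fully deterministic.
    """
    scores = code_document(text)
    best = max(sorted(scores), key=lambda category: scores[category])
    return best if scores[best] > 0 else None


if __name__ == "__main__":
    print(assign_code("The budget raises the tax on fuel."))
```

Because the rule set is fixed, running the coder twice on the same text yields the same code. Any miscoding traces back to the dictionary (a misspecification) or to the program itself, which is the sense in which the errors of automatic coding are deterministic.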
Historical development
The history and development of ACA can be understood through three central themes: (i) the concept of content analysis and its computational implementation, (ii) the development of software for automatic data processing and analysis, and (iii) the provision and analysis of digital or machine-readable documents. The first phase of development of computer-assisted and automatic content analysis, in the late 1950s, was mainly characterized by experiments with the computer. For this, social scientists first had to learn to program the computer, and in most cases this involved the support of the science departments. The first few experiments were almost exclusively limited to producing text statistics, that is, counting words, a technique that had been applied in many disciplines, such as political science or literature, since the 1920s (Holsti, 1969; Stone, Dunphy, Smith, & Ogilvie, 1966). Conceptually, these simple automatic analyses fell well behind the advancements made in conventional content analysis by Lasswell, Lerner, and de Sola Pool (1952) or Osgood (1959). However, early content analysts were increasingly troubled by the costs of manual coding and had high expectations of automatic approaches, the development of which was viewed as central to the success of the method (Stone, 1997). At this time, computer-assisted content analysis was a high-risk research domain, plagued with numerous problems. On the one hand, primitive computing hardware and software allowed only small amounts of text to be analyzed with a few variables (Iker & Harway, 1965). On the other hand, because of the dearth of machine-readable documents, all units of analysis had to be digitized on punch cards in a complex and error-prone process. Thus, in the early 1960s, ACA was neither cost-effective-half an