Preferential trade agreements (PTAs) form an intricate web that connects countries across the globe. In this article, we introduce a PTA text corpus and research tools for its fine-grained, automated analysis. Recent computational advances allow for efficient and effective content analysis by treating text as data. We digitize PTA texts and use textual similarity tools to assess PTA design patterns at the global, national, and chapter levels. Our descriptive analysis reveals, inter alia, that PTAs are more heterogeneous as a group than, for instance, bilateral investment agreements, but that they converge in regional or inter-regional clusters of similarly worded agreements. Following our descriptive account, we provide three concrete, interdisciplinary examples of how text-as-data analysis can advance the study of trade economics, politics, and law. In trade economics, similarity measures can provide more detailed representations of PTA design differences, which allow researchers to capture more meaningful variation when studying the economic impact of PTAs. In trade politics, scholars can use treaty similarity to trace design diffusion more accurately and to test competing explanations for treaty design choices. Finally, in trade law, similarity measures offer new insights into the processes of normative convergence between legal regimes such as trade and investment law.
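To make the idea of textual similarity between agreements concrete, the following is a minimal sketch of one common text-as-data measure: Jaccard similarity over character n-grams. The abstract does not state which similarity measure the authors use, so the choice of measure, the default n-gram length of 5, the function names, and the toy treaty snippets are all assumptions made for illustration only.

```python
from itertools import combinations


def char_ngrams(text: str, n: int = 5) -> set[str]:
    """Return the set of character n-grams of a text.

    The text is lowercased and runs of whitespace are collapsed first,
    so that formatting differences do not affect the comparison.
    """
    normalized = " ".join(text.lower().split())
    if len(normalized) < n:
        # Text shorter than n: use the whole text as a single gram.
        return {normalized} if normalized else set()
    return {normalized[i:i + n] for i in range(len(normalized) - n + 1)}


def jaccard_similarity(a: str, b: str, n: int = 5) -> float:
    """Share of n-grams common to both texts among all n-grams in either."""
    grams_a, grams_b = char_ngrams(a, n), char_ngrams(b, n)
    union = grams_a | grams_b
    if not union:
        return 1.0  # convention chosen here: two empty texts are identical
    return len(grams_a & grams_b) / len(union)


def pairwise_similarity(texts: dict[str, str], n: int = 5) -> dict[tuple[str, str], float]:
    """Compute Jaccard similarity for every unordered pair of named texts."""
    return {
        (name_a, name_b): jaccard_similarity(texts[name_a], texts[name_b], n)
        for name_a, name_b in combinations(sorted(texts), 2)
    }


if __name__ == "__main__":
    # Invented snippets, NOT taken from any real agreement.
    toy_treaties = {
        "treaty_a": "Each Party shall accord national treatment to the goods of the other Party.",
        "treaty_b": "Each Party shall accord national treatment to goods of the other Parties.",
        "treaty_c": "The Parties agree to cooperate on customs procedures and trade facilitation.",
    }
    for pair, score in pairwise_similarity(toy_treaties).items():
        print(pair, round(score, 3))
```

A pairwise table of this kind is the raw material for the clustering and diffusion analyses described above: similarly worded agreements score close to 1, while unrelated texts score close to 0. Character n-grams are used in this sketch because they tolerate small wording and spelling differences better than whole-word matching, but word-level or embedding-based measures would fit the same interface.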
With multilateral negotiations at the World Trade Organization (WTO) in deadlock, rulemaking on international economic governance has shifted to preferential trade agreements (PTAs). To facilitate the scholarly investigation of the fast-growing universe of PTAs, this article introduces the Text of Trade Agreements (ToTA) corpus, a machine-readable and structured full-text corpus of 448 WTO-notified trade agreements stored in a GitHub repository. The article (1) provides a summary analysis of the ToTA corpus, (2) illustrates how text-as-data techniques can be used to investigate PTA design using ToTA, including through an interactive website accompanying this research, and (3) concludes with an overview of research applications involving this PTA text corpus in economics, political science, and law. The current codebook is attached herein as an appendix. The dataset, codebook, and code, as updated, are available at https://github.com/mappingtreaties/tota.