This article describes the Vaccine/Injury Project Corpus, a collection of legal decisions awarding or denying compensation for health injuries allegedly due to vaccinations, together with models of the logical structure of the reasoning of the factfinders in those cases. This unique corpus provides useful data for formal and informal logic theory, for natural-language research in linguistics, and for artificial intelligence research. More importantly, the article discusses lessons learned from developing protocols for manually extracting the logical structure and generating the logic models. It identifies sub-tasks in the extraction process, discusses challenges to automation, and provides insights into possible solutions for automation. In particular, the framework and strategies developed here, together with the corpus data, should allow ''top-down'' and contextual approaches to automation, which can supplement ''bottom-up'' linguistic approaches. Illustrations throughout the article use examples drawn from the Corpus.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.