Digital cultural assets are often thought to exist in separate spheres based on their two principal points of origin: digitized and born digital. Increasingly, advances in digital curation are blurring this dichotomy by introducing so-called “collections as data,” which, regardless of their origin, make cultural assets more amenable to the application of new computational tools and methodologies. This paper brings together archivists, scholars, and technologists to demonstrate computational treatments of digital cultural assets using Artificial Intelligence (AI) and Machine Learning (ML) techniques that can help unlock hard-to-reach archival content. It describes an extended, iterative study applied to digitized and datafied WWII-era records housed at the FDR Presidential Library, rich content that is regrettably under-utilized by scholars examining American responses to the Holocaust. We detail the benefits of interdisciplinary collaboration for evaluating user needs, identifying and applying tools and methodologies (including ML through object detection and AI through Named Entity Recognition, or NER), and reaching the real-world outcome of public access to augmented data. We also discuss issues of digital representation, relational context, and interface design to enable new modes of public and scholarly access. Although this work is based on a case study, we believe it makes a substantial contribution to revealing the strengths and weaknesses of using AI/ML systems in cultural organizations. We give particular care to lessons learned, and we generalize the approach across broad classes of collections, with a focus on responsive iterations, reproducibility, and the relevance of data and its structures to users.