Addressing archaeology's most compelling substantive challenges requires synthetic research that exploits the large and rapidly expanding corpus of systematically collected archaeological data. That, in turn, requires a means of combining datasets that were recorded using different systematics while preserving the semantics of the data. To that end, we have developed a general procedure, which we call query-driven, on-the-fly data integration, that is deployed within the Digital Archaeological Record (tDAR) digital repository. The integration procedure employs ontologies that are mapped to the original datasets. The ontology-based dataset representations are integrated at the time the query is executed, based on the specific content of the query. In this way, the original data are preserved, and data are aggregated only to the extent necessary to obtain semantic comparability. Our presentation draws examples from the largest application to date: an effort by a research community of Southwest US faunal analysts. Using 24 ontologies developed to cover a broad range of observed faunal variables, we integrate faunal data from 33 sites across the late prehistoric northern Southwest, including about 300,000 individually recorded faunal specimens.
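The idea of query-time integration can be illustrated with a minimal sketch. It is not the repository's implementation: the taxon ontology, the dataset names, the local codes, and the functions `resolve` and `integrate` are all hypothetical, chosen only to show how local codes mapped to ontology nodes might be aggregated to whatever level a query selects while the original records stay untouched.

```python
from collections import Counter

# Hypothetical taxon ontology, stored as child -> parent (None marks the root).
ONTOLOGY = {
    "Mammalia": None,
    "Artiodactyla": "Mammalia",
    "Odocoileus": "Artiodactyla",
    "Odocoileus hemionus": "Odocoileus",
    "Lagomorpha": "Mammalia",
    "Lepus": "Lagomorpha",
    "Sylvilagus": "Lagomorpha",
}

# Hypothetical datasets. Each keeps its own coding scheme; only a mapping
# from local codes to ontology nodes is added alongside the records.
DATASETS = {
    "site_a": {
        "records": ["MD", "MD", "JR", "CT"],
        "mapping": {"MD": "Odocoileus hemionus", "JR": "Lepus", "CT": "Sylvilagus"},
    },
    "site_b": {
        "records": ["deer", "rabbit/hare", "rabbit/hare"],
        "mapping": {"deer": "Odocoileus", "rabbit/hare": "Lagomorpha"},
    },
}


def resolve(node, selected):
    """Walk up the ontology from node to the first ancestor in selected."""
    while node is not None:
        if node in selected:
            return node
        node = ONTOLOGY[node]
    return None  # no selected node lies on the path to the root


def integrate(datasets, selected):
    """Count records per dataset at the ontology level chosen by the query."""
    counts = {}
    for name, dataset in datasets.items():
        tally = Counter()
        for code in dataset["records"]:
            node = dataset["mapping"][code]
            tally[resolve(node, selected) or "unmapped"] += 1
        counts[name] = dict(tally)
    return counts


# The query decides the level of aggregation; the stored records are not modified.
result = integrate(DATASETS, {"Artiodactyla", "Lagomorpha"})
```

In this sketch, a different query (for example, one selecting `Lepus` and `Sylvilagus` separately) would reuse the same mappings and simply report `site_b`'s coarser `rabbit/hare` records as `unmapped`, which is one way of aggregating only as far as semantic comparability requires.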
Hundreds of thousands of archaeological investigations conducted in the United States over the last several decades have documented a large portion of the country's recovered archaeological record. However, if we are to use this enormous corpus to achieve richer understandings of the past, it is essential that both CRM and academic archaeologists change how they manage their digital documents and data over the course of a project and how this information is preserved for future use. We explore the nature and scope of the problem and describe how it can be addressed. In particular, we argue that project workflows must ensure that documents and data are fully documented and deposited in a publicly accessible digital repository where they can be discovered, accessed, and reused to enable new insights and build cumulative knowledge.