2023
DOI: 10.1017/aap.2022.44

Creating a Software Methodology to Analyze and Preserve Archaeological Legacy Data

Abstract: Software now allows archaeologists to document excavations in more detail than ever before through rich, born-digital datasets. In comparison, paper documentation of past excavations (a valuable corpus of legacy data) is prohibitively difficult to work with. This pilot study explores creating custom software to digitize paper field notes from the 1970s excavations of the Gulkana site into machine-readable text and maps to be compatible with born-digital data from subsequent excavations in the 1990s. This site,…

citations
Cited by 2 publications
(2 citation statements)
references
References 42 publications
“…The ArchaeoSRP package was able to successfully scan and extract information from 94% of the forms (n = 721, total = 770). Forms that were not successfully extracted by ArchaeoSRP were those that contained handwritten information in multiple fields (training Tesseract to read handwriting is possible but difficult [see Fletcher 2023]) or that were recorded on unique forms with nonstandard formatting. The computation time for this procedure took approximately five hours on a desktop computer, which represents a substantial improvement in time expenditure over digitizing and data entry by hand.…”
Section: Extracting Data From Paper Records Using ArchaeoSRP
mentioning
confidence: 99%
“…However, many of these databases still exist as paper forms or digital copies of paper forms, which limits their potential for data extraction, analysis, and comparison. Making these datasets more accessible for management and research would require a substantial investment in digitizing and making text from paper forms machine readable to enable researchers to extract and synthesize information from these sources (Fletcher 2023). These efforts are currently underway in many federal repositories (e.g., USDA Forest Service Heritage Natural Resource Manager [NRM] Database) and through nonprofit and university-affiliated repositories (e.g., tDAR [McManamon et al 2017], Open Context [Kansa et al 2020], Archaeology Data Service [Wright and Richards 2018]); however, they are primarily focused on data management or preservation rather than dataset synthesis or analysis.…”
mentioning
confidence: 99%