Some information contained in historical topographic maps has yet to be captured digitally, which limits the ability to query such data automatically. For example, the U.S. Geological Survey’s historical topographic map collection (HTMC) displays millions of spot elevations at locations that were carefully chosen to best represent the terrain at the time. Although previous research has attempted to reproduce these data points, existing methods have proven inadequate for automatically detecting and recognizing spot elevations in the HTMC. We propose a deep learning workflow pretrained on large benchmark text datasets. To these datasets we add manually crafted training image/label pairs, and test how many such pairs are required to improve prediction accuracy. We find that the initial model, pretrained solely with benchmark data, fails to predict any HTMC spot elevations correctly, whereas the addition of just 50 custom image/label pairs increases the predictive ability by ∼50%, and the inclusion of 350 data pairs increases it by ∼80%. Data augmentation in the form of rotation, scaling, and translation (offset) expands the size and diversity of the training dataset and raises recognition accuracy to as high as ∼95%. Visualization methods, such as heat map generation and salient feature detection, can be used to better understand why some predictions fail.
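
To make the augmentation step concrete, the following is a minimal sketch of how rotation, scaling, and translation might be applied to a single image/label pair to produce several training variants. It is not the workflow's actual implementation: the function names (`warp`, `augment`), the parameter ranges, the nearest-neighbor sampling, and the representation of an image as a nested list of pixel values are all assumptions chosen for illustration.

```python
import math
import random


def warp(image, angle_deg, scale, tx, ty, fill=0):
    """Rotate and scale an image about its center, then shift it by (tx, ty).

    `image` is a list of rows of pixel values. Each output pixel is filled by
    inverse mapping with nearest-neighbor sampling (an illustrative choice);
    pixels that map outside the source image receive `fill`.
    """
    h, w = len(image), len(image[0])
    cx, cy = (w - 1) / 2, (h - 1) / 2
    a = math.radians(angle_deg)
    cos_a, sin_a = math.cos(a), math.sin(a)
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Undo the translation, then the rotation and scaling.
            dx, dy = x - cx - tx, y - cy - ty
            sx = (cos_a * dx + sin_a * dy) / scale + cx
            sy = (-sin_a * dx + cos_a * dy) / scale + cy
            ix, iy = round(sx), round(sy)
            if 0 <= ix < w and 0 <= iy < h:
                out[y][x] = image[iy][ix]
    return out


def augment(image, label, n, seed=0):
    """Return n randomly transformed copies of one image/label pair.

    The label (the elevation text) is unchanged by the transform.
    The ranges below are hypothetical, not values from the study.
    """
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        angle = rng.uniform(-15.0, 15.0)   # degrees (assumed range)
        scale = rng.uniform(0.9, 1.1)      # relative size (assumed range)
        tx = rng.uniform(-2.0, 2.0)        # horizontal offset in pixels (assumed)
        ty = rng.uniform(-2.0, 2.0)        # vertical offset in pixels (assumed)
        pairs.append((warp(image, angle, scale, tx, ty), label))
    return pairs
```

Under these assumptions, calling `augment(image, "1250", 10)` would turn one hand-labeled pair into ten variants that share the same label, which is how a few hundred manually crafted pairs could be expanded into a much larger and more varied training set.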