2021 ACM/IEEE Joint Conference on Digital Libraries (JCDL) 2021
DOI: 10.1109/jcdl52503.2021.00030
ScanBank: A Benchmark Dataset for Figure Extraction from Scanned Electronic Theses and Dissertations

Cited by 14 publications (13 citation statements) · References 21 publications
“…Through ablation experiments we find the combination of the page and hOCR properties of (grayscale, ascenders, descenders, word confidences, fraction of numbers in a word, fraction of letters in a word, punctuation, word rotation, and spaCy POS) maximizes our model's performance. When compared to other deep learning models popular for document layout analysis (ScanBank [21,47] and detectron2 [45]), we find our model performs better on our dataset, particularly at high IOU thresholds (IOU=0.9) and especially for figure captions. In particular, in line with our extraction goals, our model has relatively low false positive rates, minimizing the extraction of erroneous page objects.…”
Section: Discussion and Future Work
Confidence: 76%
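The statement above evaluates detections at a strict IOU (intersection over union) threshold of 0.9. As a minimal sketch of the standard metric (not code from either cited paper), IOU for two axis-aligned boxes can be computed as:

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    # Overlap rectangle: max of the left/top edges, min of the right/bottom edges.
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

# At a 0.9 threshold, a predicted box must overlap the ground-truth box
# almost exactly to count as a true positive.
print(iou((0, 0, 100, 100), (0, 0, 100, 100)))   # → 1.0
print(round(iou((0, 0, 100, 100), (10, 10, 110, 110)), 3))  # → 0.681
```

This is why performance "at the high IOU thresholds (IOU=0.9)" is a demanding test: even a 10% shift of an otherwise correct box (second call above) scores well below 0.9 and is counted as a miss.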
“…Using the definitions of captions from Table 3 for 201 captions, the results are more promising, with F1 scores of 65.6% (ours) to 65.5% (detectron2). Results for our (ScanBank's) model applied to the ScanBank collection of "gold-standard" ETDs [21,47] are lower overall, with F1 scores of 25.5% (38.4%) for 197 figures and 16.3% (1.4%) for 140 captions. While these results suggest that our model may be more generalizable than other models for figure captions, tests on larger datasets are necessary for a firmer conclusion.…”
Section: Benchmarks at High Levels of Localization (IOU=0.9)
Confidence: 90%
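The F1 scores quoted above are the harmonic mean of precision and recall over matched figures/captions. As a reminder of the metric only (the counts below are hypothetical, not taken from the cited evaluation):

```python
def f1_score(tp, fp, fn):
    """F1 from true-positive, false-positive, and false-negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return (2 * precision * recall / (precision + recall)
            if precision + recall else 0.0)

# Hypothetical example: 8 correct detections, 2 spurious, 4 missed.
print(round(f1_score(8, 2, 4), 3))  # → 0.727
```

Because F1 penalizes both spurious and missed detections, a model with "relatively low false positive rates" can still score poorly here if it misses many figures, which is consistent with the sharp drop reported on the ScanBank gold-standard set.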