Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conferen 2019
DOI: 10.18653/v1/d19-1348

Visual Detection with Context for Document Layout Analysis

Abstract: We present 1) a work-in-progress method to visually segment key regions of scientific articles using an object detection technique augmented with contextual features, and 2) a novel dataset of region-labeled articles. A continuing challenge in scientific literature mining is the difficulty of consistently extracting high-quality text from formatted PDFs. To address this, we adapt the object-detection technique Faster R-CNN for document layout detection, incorporating contextual information that leverages the i…

Cited by 55 publications (39 citation statements)
References 20 publications
“…Table 3 provides a comparison of the DocBank to the previous document layout analysis datasets, including Article Regions (Soto and Yoo, 2019), GROTOAP2 (Tkaczyk et al, 2014), PubLayNet (Zhong et al, 2019), and TableBank (Li et al, 2019)…”
Section: Dataset Statistics (mentioning)
confidence: 99%
“…Nowadays, deep learning methods have become the mainstream for many machine learning problems (Yang et al, 2017; Borges Oliveira and Viana, 2017; Katti et al, 2018; Soto and Yoo, 2019). (Yang et al, 2017) propose a pixel-by-pixel classification to solve the document semantic structure extraction problem.…”
Section: Deep Learning Approaches (mentioning)
confidence: 99%
“…In this way, the model significantly outperforms approaches based on sequential text or document images. In addition, (Soto and Yoo, 2019) incorporate contextual information into the Faster R-CNN model. They involve the inherently localized nature of article contents to improve region detection performance.…”
Section: Deep Learning Approaches (mentioning)
confidence: 99%
“…Early methods apply Markov random fields [7] or conditional random fields [2] to solve the problem. More recent approaches tend to use CNN [14,22,30], GNN [10], and BERT [28] to improve performance. Despite their good performance, these methods require large amounts of training data which is hard to collect due to privacy reasons.…”
Section: Text Field Labeling (mentioning)
confidence: 99%
“…The task aims to assign a label to each text region in a document so that text information could be extracted in structured formats. Learning based methods [2,10,14,22,30] are shown to have good performance for text field labeling. They could automatically adapt to any type of layouts, but they usually require sufficient training data.…”
Section: Introduction (mentioning)
confidence: 99%