2020
DOI: 10.1007/978-3-030-58604-1_39
|View full text |Cite
|
Sign up to set email alerts
|

Document Structure Extraction Using Prior Based High Resolution Hierarchical Semantic Segmentation

Help me understand this report

Search citation statements

Order By: Relevance

Paper Sections

Select...
4
1

Citation Types

0
11
0

Year Published

2021
2021
2023
2023

Publication Types

Select...
4
3
1

Relationship

0
8

Authors

Journals

citations
Cited by 11 publications
(11 citation statements)
references
References 44 publications
0
11
0
Order By: Relevance
“…Following the setting of the paper [15,64,65,66], we evaluate our model using accuracy, precision, recall and F1 as metrics.…”
Section: Methodsmentioning
confidence: 99%
“…Following the setting of the paper [15,64,65,66], we evaluate our model using accuracy, precision, recall and F1 as metrics.…”
Section: Methodsmentioning
confidence: 99%
“…A few studies have been conducted on the identification of tables in documents [1,2,3,4,5]. However, there is significantly less work put into detecting table structures, and the table structure is frequently classified by the rows and columns of a table [6,7,8].…”
Section: Introductionmentioning
confidence: 99%
“…Deep learning has recently achieving state-of-the-art using convolutional neural network (CNN) [9] in many tasks including object detection [10], face recognition [11], sequence to sequence learning [12,13], speech recognition [14], semantic segmentation [15], image classification [16], handwritten recognition [17,18,19], and table detection [1,8,6] is demanding because they need to classify tables among the texts and other figures. The presence of split columns or rows, as well as nested tables or embedded figures, makes the detection of a table even more difficult.…”
Section: Introductionmentioning
confidence: 99%
“…Additionally, [5] was unable to predict text entities that span multiple text lines, a problem solved with our dynamic graph editing. An alternative formulation to solve form understanding visually is to treat it as a pixel labeling problem, as in Sarkar et al [16]. However, it is not clear from [16] how to infer form structure (bounding boxes and relationships) from pixel predictions, and the proposed (even dilated) CNN model could have difficulty modeling relationships between spatially distant elements.…”
Section: Introductionmentioning
confidence: 99%
“…An alternative formulation to solve form understanding visually is to treat it as a pixel labeling problem, as in Sarkar et al [16]. However, it is not clear from [16] how to infer form structure (bounding boxes and relationships) from pixel predictions, and the proposed (even dilated) CNN model could have difficulty modeling relationships between spatially distant elements. Instead we use a GCN that directly predicts the form structure and does not need to rely on limited receptive fields to propagate information spatially.…”
Section: Introductionmentioning
confidence: 99%