2016 12th IAPR Workshop on Document Analysis Systems (DAS) 2016
DOI: 10.1109/das.2016.58

Complete System for Text Line Extraction Using Convolutional Neural Networks and Watershed Transform

Abstract: We present a novel Convolutional Neural Network-based method for the extraction of text lines, which consists of an initial Layout Analysis followed by the estimation of the Main Body Area (i.e., the text area between the baseline and the corpus line) for each text line. Finally, a region-based method using watershed transform is performed on the map of the Main Body Area for extracting the resulting lines. We have evaluated the new system on the IAM-HisDB, a publicly available dataset containing historical doc…
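
The last step described in the abstract, a watershed transform applied to the Main Body Area map, can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the CNN output is available as a 2-D array of per-pixel probabilities in [0, 1], and the function name `extract_lines` and the `seed_threshold` parameter are hypothetical choices made for illustration. It uses `scipy.ndimage.label` to find seeds and `scipy.ndimage.watershed_ift` to grow them.

```python
import numpy as np
from scipy import ndimage as ndi


def extract_lines(mba_map, seed_threshold=0.5):
    """Split a Main Body Area probability map into labelled line regions.

    mba_map: 2-D float array in [0, 1] (assumed to come from a CNN).
    seed_threshold: hypothetical cut-off for confident main-body pixels.
    Returns (labels, n_lines), where labels has the shape of mba_map.
    """
    mba_map = np.clip(np.asarray(mba_map, dtype=float), 0.0, 1.0)

    # Seeds: connected components of confidently detected main-body pixels.
    seeds, n_lines = ndi.label(mba_map > seed_threshold)

    # The watershed floods from low to high cost, so invert the map.
    # watershed_ift requires a uint8 or uint16 input image.
    cost = np.round((1.0 - mba_map) * 255).astype(np.uint8)

    # Grow each seed outwards; pixels with marker 0 are assigned to a seed.
    labels = ndi.watershed_ift(cost, seeds.astype(np.int32))
    return labels, n_lines
```

On a synthetic map with two horizontal high-probability bands, the function should report two lines and keep each band under its own label. A real pipeline would additionally need the layout analysis and CNN stages, which are outside the scope of this sketch.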

Citations: cited by 36 publications (18 citation statements)
References: 21 publications
“…The closest work related to our method are multi-step methods, presented by Pastor et al [14] and Grüning et al [15]. The former employs a multi-stage deep learning approach to detect text regions followed by watershed-transform as post-processing step.…”
Section: Related Work (mentioning)
confidence: 99%
“…More recently, [12] used CNNs for textline extraction also aiming at a robust method explicitly focusing on historical documents. However, textlines are segmented at a different level of detail, returning a surrounding polygon (Main Body Area) at pixel level.…”
Section: Introduction (mentioning)
confidence: 99%
“…Then, the text lines are extracted by superimposing the components with text line pattern mask. Another method is based on convolution neural network with watershed transform, which is proposed in [10] to estimate the text area between the baseline and the corpus line.…”
Section: Introduction (mentioning)
confidence: 99%