2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2022
DOI: 10.1109/cvpr52688.2022.00112

Towards End-to-End Unified Scene Text Detection and Layout Analysis

Abstract: We organize a competition on hierarchical text detection and recognition. The competition aims to promote research into deep learning models and systems that can jointly perform text detection, recognition, and geometric layout analysis. We present details of the proposed competition organization, including tasks, datasets, evaluations, and schedule. During the competition period (from January 2nd 2023 to April 1st 2023), at least 50 submissions from more than 20 teams were made in the 2 proposed tasks. …

Cited by 55 publications (13 citation statements)
References: 100 publications
“…We also notice our heuristic-based grouping and ordering does not work well on curved text. Machine learning based grouping and ordering, such as [5,6,21,22] is the solution.…”
Section: E2e Baseline Results (mentioning)
confidence: 99%
“…For example, for layout analysis, we need to evaluate both grouping and ordering for the downstream applications. Hiertext [5] uses PQ for evaluating grouping exclusively, but not ordering. [6] uses three different metrics to capture different angles of the e2e performance: local accuracy (similar to the GO error in Section 3.2 based on leadership), local continuity (similar to ngram precisions in BLEU), and global accuracy (measuring exact block accuracy).…”
Section: Introduction (mentioning)
confidence: 99%
“…STKM [16] is a text knowledge mining model based on self-attention mechanisms for text detection tasks. Long et al [17] propose an end-to-end model that combines scene text detection and visual layout analysis, enhancing text detection performance.…”
Section: Related Work (mentioning)
confidence: 99%
“…The HA has been proposed and used for many document analysis and recognition tasks, many of them related with full-page, end-to-end training and/or text image recognition [56,33,34,29].…”
Section: Metrics Related With the Hungarian Algorithm (mentioning)
confidence: 99%