2021
DOI: 10.48550/arxiv.2111.08609
Preprint

Document AI: Benchmarks, Models and Applications

Abstract: Document AI, or Document Intelligence, is a relatively new research topic that refers to the techniques for automatically reading, understanding, and analyzing business documents. It is an important research direction for natural language processing and computer vision. In recent years, the popularity of deep learning technology has greatly advanced the development of Document AI, such as document layout analysis, visual information extraction, document visual question answering, document image classification,…

Citations: cited by 8 publications (10 citation statements)
References: 105 publications (77 reference statements)
“…In recent years, pre-training techniques have been making waves in the Document AI community by achieving remarkable progress on document understanding tasks [2,12-14, 16, 25, 28, 29, 36, 37, 44, 45, 47-49]. As shown in Figure 1, a pre-trained Document AI model can parse layout and extract key information for various documents such as scanned forms and academic papers, which is important for industrial applications and academic research [7].…”
Section: Introduction (citation type: mentioning)
confidence: 99%
“…It leads to an important research direction for both Computer Vision (CV) and Natural Language Processing (NLP), and is a fundamental task of Document AI, which aims to automatically read, understand, and analyze documents [1].…”
Section: Introduction (citation type: mentioning)
confidence: 99%
“…Visually-rich Document Understanding (VrDU) is a critical component of document intelligence [6] that aims to understand scanned or digital-born documents. Despite many advances in vision-language understanding, extracting structural information in visually-rich documents remains a major challenge because it involves different types of information, including image, text, and layout.…”
Section: Introduction (citation type: mentioning)
confidence: 99%