2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) 2020
DOI: 10.1109/cvprw50498.2020.00289

Visual and Textual Deep Feature Fusion for Document Image Classification

Cited by 36 publications (33 citation statements)
References 18 publications
“…Subsequently, these are used by other neural network modules or combined with other features, e.g., textual ones. Algorithms using textual information, like [5] for document classification, use word or paragraph embeddings created by deep learning frameworks like BERT [9]. BERT stands for "Bidirectional Encoder Representations from Transformers", which is a transformer-based model used for NLP tasks.…”
Section: Related Work (mentioning)
confidence: 99%
“…The newspapers made available for this competition comprise the titles "Arbeiter Zeitung", "Illustrierte Kronen Zeitung", "Innsbrucker Nachrichten" and "Neue Freie Presse". The data can be downloaded from the competition website 5 .…”
Section: Data (mentioning)
confidence: 99%
“…In [4], an approach is proposed that extracts, analyzes, and merges textual and visual streams for document image classification; the visual stream uses deep CNNs to extract structural features of the images, and accuracy depends on the type of input data. The study [5] proposes a two-stream neural architecture for document image classification that uses a joint feature learning approach combining image features and text parts; this joint learning approach reaches a classification accuracy of up to 97.05%. An advantage of the neural network approach is that it dispenses with templates.…”
Section: Abstract: Document Management Automation Intelligent Document Management Document Classification Convolutional Neural Network Im (unclassified)
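The statement above describes combining image features and text features in a two-stream architecture. As a rough illustration only, the sketch below fuses two feature vectors by L2-normalizing each stream and concatenating them before a linear softmax classifier. Concatenation is an assumed fusion strategy, not one confirmed by the quoted text, and every name here (`fuse`, `classify`, the feature sizes, the random weights) is hypothetical. In a real system the vectors would come from a CNN and a text encoder rather than a random generator.

```python
import math
import random


def l2_normalize(vec):
    """Scale a vector to unit length so neither stream dominates the fusion."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def fuse(visual_features, text_features):
    """Assumed fusion step: normalize each stream, then concatenate."""
    return l2_normalize(visual_features) + l2_normalize(text_features)


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(fused, weights, biases):
    """Linear layer followed by softmax; weights has one row per class."""
    logits = [
        sum(w * x for w, x in zip(row, fused)) + b
        for row, b in zip(weights, biases)
    ]
    return softmax(logits)


# Placeholder features standing in for CNN and text-encoder outputs.
rng = random.Random(0)
visual = [rng.random() for _ in range(8)]
text = [rng.random() for _ in range(4)]

fused = fuse(visual, text)

# Untrained, randomly initialized classifier over three hypothetical classes.
num_classes = 3
weights = [[rng.uniform(-1, 1) for _ in fused] for _ in range(num_classes)]
biases = [0.0] * num_classes

probs = classify(fused, weights, biases)
print(len(fused), [round(p, 3) for p in probs])
```

Normalizing each stream before concatenation is one simple way to keep differently scaled features comparable; a jointly trained model would learn the weights end to end instead of drawing them at random.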
“…The classification of documents into different known classes helps to improve the overall performance of document processing systems [1]. Consequently, many approaches have been proposed for document classification that use either text content [3][4][5] or document structure [6][7][8][9] to categorize documents into different classes, or use both modalities [10][11][12][13]. There has been much advancement in this area, especially using deep learning methods [6,14,15].…”
Section: Introduction (mentioning)
confidence: 99%
“…Therefore, these document images convey high-level structural information with their features, but the low-level features that can disambiguate visually similar images remained uninvestigated for a long time. Various papers, such as [10], [11] and [13], investigate the possibility of involving additional features to improve accuracy. These papers obtained state-of-the-art results.…”
Section: Introduction (mentioning)
confidence: 99%