2021
DOI: 10.1109/access.2021.3072900

Efficient Automated Processing of the Unstructured Documents Using Artificial Intelligence: A Systematic Literature Review and Future Directions

Abstract: The unstructured data impacts 95% of the organizations and costs them millions of dollars annually. If managed well, it can significantly improve business productivity. The traditional information extraction techniques are limited in their functionality, but AI-based techniques can provide a better solution. A thorough investigation of AI-based techniques for automatic information extraction from unstructured documents is missing in the literature. The purpose of this Systematic Literature Review (SLR) is to r…

citations
Cited by 53 publications
(35 citation statements)
references
References 91 publications
“…But older publications do not account for the technological shift of recent 10 years, when the full DIA process from image or video capture to full recognition and results presentation became possible directly on mobile or autonomous devices. At the same time, to the best of our knowledge, publications of recent years considered only separate tasks (such as document image classification [3], extraction of information from poorly structured documents [4], etc.) or advances of particular methods (mostly machine [16] and deep [20,21]…”
Section: Fig 2 Changes In the Number Of Citations Of ICDAR
Citation type: mentioning
confidence: 99%
“…The automatic and efficient key field extraction task is one of the challenging tasks as its solution is spanned across the use of Computer Vision (CV) and Natural Language Processing (NLP) [6]. The unstructured documents such as invoices, claim processing forms usually do not comprise "natural language" as other regular documents or paragraphs.…”
Section: A Challenge In Extracting Information From Unstructured Documents
Citation type: mentioning
confidence: 99%
“…Few of the challenges mentioned above, can be solved using Deep Learning (DL) approaches [6]. Automatic feature extraction and availability of pre-trained Neural Networks (NN) trained on huge unlabeled corpus are the main advantages of using DL approaches in information extraction tasks.…”
Section: Named Entity Recognition (NER)
Citation type: mentioning
confidence: 99%
“…Specifically, in the case of publicly listed private firms, annual reports and financial statements are mandatory disclosures in the public domain. Knowledge extraction from such unstructured data is now possible with the recent developments in computer-aided text mining and Natural Language Processing (NLP) [ 6 8 ]. In this research, the authors explore the efficiency of NLP-based topic modeling algorithms to extract keywords and topics from the publicly available annual reports of construction contracting firms and use the information obtained to analyze the strategies such firms adopt in dealing with emerging sectoral challenges explained in the next section.…”
Section: Introduction
Citation type: mentioning
confidence: 99%