2022
DOI: 10.18287/2412-6179-co-1006

MIDV-2020: a comprehensive benchmark dataset for identity document analysis

Abstract: Identity documents recognition is an important sub-field of document analysis, which deals with tasks of robust document detection, type identification, text fields recognition, as well as identity fraud prevention and document authenticity validation given photos, scans, or video frames of an identity document capture. Significant amount of research has been published on this topic in recent years, however a chief difficulty for such research is scarcity of datasets, due to the subject matter being protected …

Cited by 19 publications (23 citation statements)
References 57 publications
“…The GDPR [7] and other local laws prohibit the creation of datasets with real ID images. Thus, researchers began to use artificially generated ID document images for open dataset creation [8, 9, 10, 11, 12]. As far as we know, printed mock documents are used only in MIDV family datasets, and MIDV-500 was the first [8].…”
Section: Overview
Citation type: mentioning
confidence: 99%
“…The dataset was also supplemented with photos and scanned images of the same document types to represent the typical input for server-side identity document analysis systems. MIDV-2020 [10] was published recently to provide variability in the text fields, faces, and signatures, while retaining the realism of the dataset. The MIDV-2020 dataset consists of 1000 different physical documents (100 documents per type), all with unique, artificially generated faces, signatures, and text field data.…”
Section: Overview
Citation type: mentioning
confidence: 99%
“…location and classification [2], [3]. A high-quality solution to these problems plays a significant role in the document recognition process.…”
Section: Introduction
Citation type: mentioning
confidence: 99%