2019
DOI: 10.1093/comjnl/bxz098

The NoisyOffice Database: A Corpus To Train Supervised Machine Learning Filters For Image Processing

Abstract: This paper presents the ‘NoisyOffice’ database. It consists of images of printed text documents with noise mainly caused by uncleanliness from a generic office, such as coffee stains and footprints on documents or folded and wrinkled sheets with degraded printed text. This corpus is intended to train and evaluate supervised learning methods for cleaning, binarization and enhancement of noisy images of grayscale text documents. As an example, several experiments of image enhancement and binarization are present…

Citations: cited by 5 publications (7 citation statements)
References: 36 publications
“…We measure the detection rates for our algorithm. Results are shown in Table 2 and Table 3, for the NoisyOffice dataset of scanned images [36][37][38] and the MSRCv2 dataset of more generic images [39], respectively. The main measurement of detectability is the Area under the Receiver Operating Curve (AROC).…”
Section: Results (mentioning)
confidence: 99%
“…Table 1 lists the average computation times, both of our algorithm and of [35], for the MSRCv2 [39] and NoisyOffice [36][37][38] datasets. Tests were run on an Intel 7th Generation Core i7 notebook computer.…”
Section: Computation Time (mentioning)
confidence: 99%