A Combined Algorithm for Layout Analysis of Arabic Document Images and Text Lines Extraction

Alshameri, Abdulrahman; Abdou, Sherif M.; Mostafa, Khaled

doi:10.5120/7945-1282

Cited by 10 publications

(8 citation statements)

References 16 publications

“…Elanwar et al [16,20,23] proposed various analyses based on SVM (support vector machine) classifiers for extracting six logical labels from book pages. The same classifier was utilized by Alshameri et al [19] and Hesham et al [13,22] for text and non-text segmentation. Another learning technique used for segmentation and classification is neural network classification (Multilayer Perceptron-Back propagation) [9], whereas Ahmed et al [10] used k-means clustering and Gaussian Mixture Modelling (GMM).…”

Section: Arabic Document Analysis Methods (mentioning)

confidence: 99%

“…Many studies successfully select the best binarization method relative to the benchmarking dataset's category. Several studies [11][12][13][22] have confirmed the efficiency of adaptive thresholding [35], such as Otsu thresholding [26] and Sauvola thresholding [27], whereas other studies [7,12,19] used filters for denoising, such as median filters [31] and Gaussian filters. The noise is presented as marginal noise, background noise, edge noise, rule line noise, pattern noise, and salt and pepper noise that can be created during scanning, transmission, or conversion to digital form, which affects the analysis process.…”

Section: A Preprocessing Phase (mentioning)

confidence: 99%
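
As an illustration of the Otsu thresholding named in the statement above, here is a minimal pure-Python sketch. It is not taken from any of the cited papers: the function names `otsu_threshold` and `binarize`, the 8-bit gray range, and the dark-text-on-light-page convention are all assumptions made for the example.

```python
def otsu_threshold(pixels):
    """Return the gray level t in 0..255 that maximizes between-class variance.

    `pixels` is a flat iterable of integer gray levels (assumed 8-bit);
    class 0 holds every pixel <= t, class 1 holds the rest.
    """
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_score = 0, -1.0
    w0 = 0    # number of pixels in class 0
    sum0 = 0  # sum of gray levels in class 0
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, so the variance is undefined
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        # Proportional to the between-class variance (constant factor dropped).
        score = w0 * w1 * (mean0 - mean1) ** 2
        if score > best_score:
            best_t, best_score = t, score
    return best_t


def binarize(rows, t):
    """Mark pixels <= t as ink (1), assuming dark text on a light page."""
    return [[1 if p <= t else 0 for p in row] for row in rows]


# Hypothetical 2x4 page: gray level 50 is ink, 200 is background.
page = [[200, 200, 50, 200],
        [200, 50, 50, 200]]
t = otsu_threshold(p for row in page for p in row)
ink = binarize(page, t)
```

Otsu's method picks one global threshold for the whole page. Sauvola thresholding, also named above, computes a threshold per local window instead and is not shown here.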

“…• The top-down approach: where the page is considered as a single large zone, and then it will be successively divided into smaller zones until no zone remains for more division, which relies on finding the largest white rectangles possible and computing the horizontal and vertical histograms of the foreground pixels. A small number of studies [7,9,16] used the XY cut algorithm [37], whereas the well-known algorithm projection profile analysis [29] was used by a significant number of studies [7,8,11,12,14,15,17,18,19];…”

Section: B Segmentation Phase (mentioning)

confidence: 99%
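
The projection profile analysis mentioned in the statement above can be sketched as follows. This is a generic illustration rather than the procedure of any cited study: the binary page is assumed to be a list of rows of 0/1 values (1 = foreground), and the names `projection_profile`, `nonzero_runs`, and the `min_gap` parameter are hypothetical.

```python
def projection_profile(ink, axis="rows"):
    """Count foreground pixels per row (axis='rows') or per column (axis='cols')."""
    if axis == "rows":
        return [sum(row) for row in ink]
    return [sum(col) for col in zip(*ink)]


def nonzero_runs(profile, min_gap=1):
    """Return (start, end) pairs, end exclusive, of non-empty stretches of a profile.

    Stretches separated by fewer than `min_gap` empty positions are merged,
    so small white gaps do not split a zone.
    """
    runs = []
    start = None
    for i, count in enumerate(profile):
        if count > 0 and start is None:
            start = i
        elif count == 0 and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(profile)))

    merged = []
    for s, e in runs:
        if merged and s - merged[-1][1] < min_gap:
            merged[-1] = (merged[-1][0], e)
        else:
            merged.append((s, e))
    return merged


# Hypothetical 6x4 page with two text bands separated by two empty rows.
page = [
    [0, 0, 0, 0],
    [1, 1, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [1, 0, 0, 0],
]
bands = nonzero_runs(projection_profile(page, "rows"))
```

A recursive XY cut would apply the same split alternately to the row and column profiles of each resulting zone until no white gap of at least `min_gap` remains.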

A Review of Arabic Document Analysis Methods

Bouressace

2022

2022 4th International Conference on Pattern Analysis and Intelligent Systems (PAIS)

Arabic document analysis is essential in geometrical information extraction from complex structures in Arabic documents, which can either be historical or modern. This information can be an organized tree structure containing all the component levels, such as column, paragraph, word, table, figure, and article. In this paper, we provide an analysis of recent works on this topic from various perspectives, describing the most commonly used models on document physical layout detection and document logical structure representations in printed styles, summarizing the limitations of previous approaches, identifying challenges along this line of research, and providing new research directions for future algorithms.

“…The dataset provided by Bukhari et al [5] contains 25 images from books and newspapers, including multi-script images that contain both English and Arabic script; the Hadjar and Ingold datasets [11,12,13] contain between 50 to 150 pages from three different newspapers (Annahar, AL Hayat, and AL Quds), and the dataset by ElShameri et al [3] contains 200 pages from newspapers. The database by the Environmental Research Institute of Michigan [26] consists of 750 images of pages from machine-printed Arabic books and magazines.…”

Section: Existing Benchmarks for DLA Research Are Small (mentioning)

confidence: 99%

BCE-Arabic-v1 dataset

Saad, Elanwar, Kader, et al.

2016

Proceedings of the 9th ACM International Conference on PErvasive Technologies Related to Assistive Environments

Millions of individuals in the Arab world have significant visual impairments that make it difficult for them to access printed text. Assistive technologies such as scanners and screen readers often fail to turn text into speech because optical character recognition (OCR) software has difficulty interpreting the textual content of Arabic documents. In this paper, we show that the inaccessibility of scanned PDF documents is in large part due to the failure of the OCR engine to understand the layout of an Arabic document. Arabic document layout analysis (DLA) is therefore an urgent research topic, motivated by the goal of providing assistive technology that serves people with visual impairments. We announce the launch of a large annotated dataset of Arabic document images, called BCE-Arabic-v1, to be used as a benchmark for DLA, OCR and text-to-speech research. Our dataset contains 1,833 images of pages scanned from 180 books and represents a variety of page content and layout, in particular, Arabic text in various fonts and sizes, photographs, tables, diagrams, and charts in single or multiple columns. We report the results of a formative study that investigated the performance of state-of-the-art document annotation tools. We found significant differences and limitations in the functionality and labeling speed of these tools, and selected the best-performing tool for annotating our benchmark BCE-Arabic-v1.

“…• Alshameri et al in [4] presented a method for text/non-text segmentation and text line extraction from document images, where they used RLSA, CCs for text segmentation and an SVM for figure detection, by applying the ANDing and ORing operations to set the correct bounding-box for each category (text/figure). This technique gave interesting results, but the application of their RLSA is efficient only in certain special cases, where specific thresholds have to be applied, and specific vertical/horizontal projections are used to distinguish between CCs with a special spatial structure.…”

Section: Smartphone-captured Arabic Newspaper Analysis (mentioning)

confidence: 99%
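
For readers unfamiliar with RLSA (run-length smoothing algorithm), the smearing step that the statement above refers to can be sketched as below. This is the generic technique, not the specific variant or thresholds used by Alshameri et al.: the 0/1 list-of-rows image format, the names `smear_row` and `rlsa`, and the choice to fill only gaps that lie between two ink pixels are assumptions made for the example.

```python
def smear_row(row, threshold):
    """Fill background gaps of length <= threshold that lie between two ink pixels."""
    out = list(row)
    last_ink = None  # index of the most recent ink pixel seen
    for i, value in enumerate(row):
        if value:
            if last_ink is not None and 0 < i - last_ink - 1 <= threshold:
                for j in range(last_ink + 1, i):
                    out[j] = 1
            last_ink = i
    return out


def rlsa(ink, threshold, direction="horizontal"):
    """Apply run-length smearing along rows or, for 'vertical', along columns."""
    if direction == "horizontal":
        return [smear_row(row, threshold) for row in ink]
    # Vertical smearing: smear each column, then transpose back to rows.
    columns = [smear_row(list(col), threshold) for col in zip(*ink)]
    return [list(row) for row in zip(*columns)]


# Hypothetical row: the 2-pixel gap is filled, the 4-pixel gap is kept.
line = [1, 0, 0, 1, 0, 0, 0, 0, 1]
smeared = smear_row(line, threshold=2)
```

The dependence on `threshold` is the limitation the statement points out: a value that merges characters into words or lines on one layout can merge neighbouring columns or figures on another.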

The application of new methods for offline recognition in printed Arabic documents

Bouressace
