Towards Searchable Digital Urdu Libraries - A Word Spotting Based Retrieval Approach

Abidi, Ali Imam; Siddiqi, Imran; Khurshid, Khurram

doi:10.1109/icdar.2011.270

Cited by 22 publications

(11 citation statements)

References 12 publications

“…Then, the location quality γ_n is computed for each retrieved area A_{Ret,n} in the order of their rank n that is provided by the word spotting system. Each time a relevant area is hit according to (3), this relevant area is deleted and cannot be hit again. Therefore, the best match according to the keyword spotter wins and subsequently retrieved areas cannot score if they do not overlap with another relevant area.…”

Section: Proposed Evaluation Measures (mentioning)

confidence: 99%

“…Thus, α_IA punishes the missing inner area (IA), i.e., the relevant area from the ground truth that is not retrieved, while α_OA penalizes the outer area (OA), i.e., the area that is outside the corresponding relevant area from the ground truth. The parameter c > 0 determines the maximum size of the outer area for which the retrieved area is still considered to be a hit according to (3).…”

Section: Proposed Evaluation Measures (mentioning)

confidence: 99%

“…Segmentation-based approaches operate, for example, on character, character n-gram [2], or other word part levels. Methods have also been proposed that apply an implicit word part segmentation by connected component analysis, e.g., for Urdu [3] and Arabic [4]. Other approaches rely on segmented words and use holistic features [5], i.e., features describing whole words, or use sliding windows [6].…”

Section: Introduction (mentioning)

confidence: 99%

On Evaluation of Segmentation-Free Word Spotting Approaches without Hard Decisions

Pantke

Märgner

Fingscheidt

2013

2013 12th International Conference on Document Analysis and Recognition

Word spotting systems are intended to retrieve occurrences of a given keyword in document images without actually recognizing the full document content. As there is a trend towards segmentation-free word spotting methods, we propose a methodology to evaluate these methods by employing measures that take the quality of the retrieved word locations into account without making hard decisions. We derive a desired evaluation behavior with the help of synthetic examples and show discrepancies of existing evaluation methods. New measures following this behavior are introduced and their differences exemplarily described. The proposed evaluation method is applied to a state-of-the-art word spotting approach.

“…The connected components are extracted in the binarized image of printed Urdu text by [10] to segment it into ligatures or partial words to which a set of two scalar and four vector features stored in the database represent.…”

Section: Segmentation (mentioning)

confidence: 99%

Offline Recognition of Handwritten Urdu Characters using B Spline Curves: A Survey

Jameel

Kumar

2017

IJCA

Handwritten Character Recognition has been an active area of research in the field of pattern recognition and image processing for the last two decades, as there is an urgent need for a successful Script Recognition System to convert handwritten documents into a computer-understandable form that is applicable for various purposes. Several research studies have been carried out for recognition of other scripts like Chinese, Japanese, English, Devanagari, etc., but the research regarding Urdu Script is still immature due to the cursive and variable nature of Urdu characters. The requirement of offline Urdu HCR systems is increasing because of the expansion of technology and the convenience for users. In this paper, a detailed survey of Urdu HCR techniques with respect to feature extraction developed so far, along with their efficiency and accuracy, has been presented. The paper also presents a new proposed B-Spline Curve approximation approach for feature extraction of offline isolated Urdu handwritten characters.

“…Despite the presence of some recent developments in layout analysis systems for Arabic and Urdu documents [9], the non-existence of commercial or opensource OCR techniques for these scripts make it difficult to navigate efficiently through scanned documents. Even the OCR-free technique presented by [7] can not be applied to these scripts due to highly non-uniform distribution of intra and inter word distances [10]. Moreover, lack of knowledge about location of the digits would make it impossible to differentiate between a ToC page and a page whose structure is similar to a ToC page.…”

mentioning

confidence: 99%

OCR-Free Table of Contents Detection in Urdu Books

Ul-Hasan

Bukhari

Breuel

2012

2012 10th IAPR International Workshop on Document Analysis Systems
