2014
DOI: 10.1007/s10044-014-0420-8

Cross-document word matching for segmentation and retrieval of Ottoman divans

Abstract: Motivated by the need for the automatic indexing and analysis of the huge number of documents in Ottoman divan poetry, and for discovering new knowledge to preserve this heritage and keep it alive, in this study we propose a novel method for segmenting and retrieving words in Ottoman divans. Documents in Ottoman are difficult to segment into words without prior knowledge of the word. In this study, using the idea that divans have multiple copies (versi…

Cited by 8 publications (7 citation statements)
References: 51 publications
“…This project was used by scholars working with the Ottoman archives. As an extension of the previous work in Reference 14, a method was proposed for segmenting and retrieving words in Ottoman divans (books of poetry), motivated by the need for the automatic indexing and analysis of Ottoman divan poetry in Reference 15. Since most diwans have multiple copies with minor differences, a cross document word matching approach was employed for querying and retrieving the text from documents.…”
Section: Related Work (mentioning)
confidence: 99%
“…There is quite a limited number of studies dealing with Ottoman documents in the literature. Most of them are focusing on retrieval of Ottoman documents using traditional computer vision techniques used to map word shapes [10,11,13,18,56]. Some others constitute the early works on Ottoman character recognition using conventional machine learning algorithms with quite limited proprietary datasets [32,33,42,67].…”
Section: Arabic Text Recognition (mentioning)
confidence: 99%
“…A very limited number of studies on text recognition in Ottoman Turkish have been identified in the literature. Most of them are dated to pre-deep learning era and use traditional machine learning techniques [12,17,18,26]. In a study that used deep learning techniques on Ottoman documents for the first time, Aydemir et al trained an RNN system by manually extracting features from a dataset containing 169,148 discrete handwritten word images obtained from population registration documents [13].…”
Section: Related Work (mentioning)
confidence: 99%
“…Actually, Ottoman document recognition is a problem that has been attempted for many years, without a sufficiently successful solution. Most of the previous works are on document retrieval tasks using traditional machine learning methods [12,17,26]. Their modest success rates can be attributed to limited sizes of the datasets they use.…”
Section: Introduction (mentioning)
confidence: 99%