2014 IEEE 8th International Conference on Application of Information and Communication Technologies (AICT) 2014
DOI: 10.1109/icaict.2014.7035953

On certain aspects of Kazakh part-of-speech tagging

Cited by 7 publications (3 citation statements)
References 17 publications
“…Kazcorpus Kazakh language corpus exceeds 135 million words [23,24] and it contains more than 400.000 documents classified into five major genres:…”
Section: Word2vec Algorithm For Parts Of Speech Determination (mentioning)
confidence: 99%
“…Although several statistical models have been proposed for Kazakh MD, such as HMM- (Makazhanov et al, 2014; Makhambetov et al, 2015; Assylbekov et al, 2016), voted perceptron- (Tolegen et al, 2016) and transformation-based (Kessikbayeva and Cicekli, 2016) taggers, to our knowledge ours is the first deep learning-based approach to the problem that is also purely language independent.…”
Section: Related Work (mentioning)
confidence: 99%
“…From technical perspective, there is another challenge that concerns mostly Kazakh in its lack of resources for our particular purposes. By and large the language is being actively studied, and there exist monolingual corpora [6,7], and ongoing research on morphological processing [8][9][10][11][12][13] and syntactic parsing [14][15][16]. However, except for a rather small and noisy OPUS corpus [17] there are no Russian-Kazakh parallel corpora 4 and the only tool for automatic morphological disambiguation of Kazakh available to us 5 was reported to have accuracy of 86%, which we considered to be low enough to question the results of experiments with segmentation: would possible misalignments be shortcomings of a chosen segmentation scheme or results of incorrect morphological analysis and disambiguation.…”
Section: Introduction (mentioning)
confidence: 99%