2015
DOI: 10.5121/ijnlc.2015.4204

Kamba Part of Speech Tagger Using Memory Based Approach

Abstract: Part of speech tagging is an important first step towards machine translation and text manipulation. Though much has been done in this regard for the Indo-European and Asiatic languages, part of speech tagging tools for African languages are still lacking. As a result, these languages are classified as under-resourced. This paper presents data-driven part of speech tagging tools for Kikamba, an under-resourced language spoken mostly in Machakos, Makueni and Kitui. The tool …

Cited by 3 publications (3 citation statements)
References 1 publication
“…Annotators also need some training in the use of these tools. In such scenarios and as an alternative, POS annotation can be done using spreadsheets as was done for some low resource languages of Kenya such as Kikamba (Kituku et al, 2015; Pauw et al, 2006). In this research, we adopted the use of spreadsheets as the project duration was short and it would have taken longer to train the linguistic annotators on the use of a toolkit such as GATE (Cunningham, 2002).…”
Section: Creation Of Annotation Datasets
Type: mentioning
confidence: 99%
“…POS annotations require tagsets. Different languages tend to have different tags e.g., the Kiswahili tagset (Hurskainen, 2016), Kamba tagset (Kituku et al, 2015). This realization therefore leads to the need for some middle ground tagset, such as the universal target set proposed by Petrov et al (2011).…”
Section: Creation Of Annotation Datasets
Type: mentioning
confidence: 99%
“…The concord for possessive pronouns, morph phonological changes in adjectives and verbs, the morphology of compound Nouns and adjectives are yet to be done. With respect to language resource tools, there are only two language tools for this language to the best of our knowledge; these are a Part of Speech tagger and a named entity recognizer [10] [11].…”
Section: Introduction
Type: mentioning
confidence: 99%