JDHASA 2021
DOI: 10.55492/dhasa.v3i03.3818

Canonical Segmentation and Syntactic Morpheme Tagging of Four Resource-scarce Nguni Languages

Abstract: Morphological analysis involves investigating the syntactic class of a word but can also extend to the decomposition and syntactic analysis of its underlying morpheme composition. This is especially relevant to languages with an agglutinative writing system where multiple linguistic words are expressed as a single orthographic word. In this paper, we propose a memory-based approach to canonical segmentation using a windowing approach to recover the uncondensed morphemes that differ from the surface form of a w…

Citations: cited by 2 publications (3 citation statements)
References: 3 publications
“…Stemming or lemmatizing and parts of speech (PoS) tagging are NLP approaches used to solve this challenge in well-resourced languages. However, these tools are still being developed for isiXhosa (although advances are being made (Mzamo et al., 2015; Puttkammer and Toit, 2021)) and other under-resourced languages. This paper applies one approach which can bypass the need for these NLP tools.…”
Section: Analysing Collective Concept Formation With Frequency Analysis (mentioning)
confidence: 99%
“…word embeddings or topic models), which themselves are not built to deal with the range of variations created by the prefix, infix and suffix structure of agglutinative languages. Such tools are being developed by isiXhosa computational linguists (Mzamo et al., 2015; Puttkammer and Toit, 2021), but are not yet sufficiently advanced to be used for social science or humanities inquiry.…”
Section: Introduction (mentioning)
confidence: 99%