2020
DOI: 10.1007/s41870-020-00480-2

Using machine learning to build POS tagger for under-resourced language: the case of Somali

Cited by 14 publications
(9 citation statements)
References 6 publications
“…Finally, it is decided that every remaining unknown (UNK) word is annotated as noun (N). This result confirms the findings in [4,11,21], which were proposed for similiar language family like afaan Oromo and Somali language.…”
Section: B Algorithm For Implementing Hmm
supporting
confidence: 92%
“…Although, it was proposed to develop POS tagging model for resource rich languages like English and French in many research works [1–20], much less attention was given to under-resource languages like Shekkinoono. And several methodologies were explored to develop part of speech tagging for Shekki'noono language [5, 11–15]. All of the sources which have been explored are to support as input for the Shekki'noono language tagger.…”
Section: Related Work
mentioning
confidence: 99%
“…The current study extends these recent acoustic inquiries to examining applications of machine learning algorithms to describing the acoustic characteristics of allophonic systems operating within an under-resourced, endangered language spoken in Central America. Although machine learning technologies have been applied to data from endangered languages primarily for purposes of automatic speech recognition in the past (Besacier, Barnard, Karpov, and Schultz 2014;Rey and Nagy 2018;Mohammed 2020), the current study aims to use a machine learning algorithm to generate clusters of acoustic similarity for vocalic productions to explore patterns of phonologically-conditioned allophonic splits in Tol.…”
mentioning
confidence: 99%