Proceedings of the 19th SIGMORPHON Workshop on Computational Research in Phonetics, Phonology, and Morphology 2022
DOI: 10.18653/v1/2022.sigmorphon-1.11

The SIGMORPHON 2022 Shared Task on Morpheme Segmentation

Abstract: The SIGMORPHON 2022 shared task on morpheme segmentation challenged systems to decompose a word into a sequence of morphemes and covered most types of morphology: compounds, derivations, and inflections. Subtask 1, word-level morpheme segmentation, covered 5 million words in 9 languages (Czech, English, Spanish, Hungarian, French, Italian, Russian, Latin, Mongolian) and received 13 system submissions from 7 teams. The best system averaged 97.29% F1 score across all languages, ranging from English (93.84%) to Lat…

Citations: cited by 11 publications (17 citation statements)
References: 38 publications
“…We demonstrate that subword-level modeling does work for morpheme segmentation through our submissions to the SIGMORPHON 2022 Shared Task on Morpheme Segmentation (Batsuren et al., 2022). Our subword-level model, an entmax transformer with sampled ULM tokenizations, outperforms our character-level submissions and wins the word-level subtask.…”
Section: Introduction (mentioning)
confidence: 76%
“…With the goal of advancing research in this direction, we present a morpheme segmentation shared task and provide large-scale datasets over nine languages, evaluation metrics, and morphological annotations of five million word formations. In this, we rely on the latest release of UniMorph (Batsuren et al., 2022) which has introduced morpheme segmentations and derivational data from MorphyNet (Batsuren et al., 2021b). The resulting shared task is a follow-up to past morphological segmentation shared tasks such as "MorphoChallenge" (Kurimo et al., 2007, 2008, 2009) or "Multilingual parsing" (Zeman et al., 2017, where lemmatization as segmentation is a subtask).…”
Section: Discussion (mentioning)
confidence: 99%