Proceedings of the 38th Annual Meeting on Association for Computational Linguistics - ACL '00 2000
DOI: 10.3115/1075218.1075245

Minimally supervised morphological analysis by multimodal alignment

Abstract: This paper presents a corpus-based algorithm capable of inducing inflectional morphological analyses of both regular and highly irregular forms (such as brought→bring) from distributional patterns in large monolingual text with no direct supervision. The algorithm combines four original alignment models based on relative corpus frequency, contextual similarity, weighted string similarity and incrementally retrained inflectional transduction probabilities. Starting with no paired examples for …

Cited by 117 publications (113 citation statements)
References 13 publications
“…Various similarity measures are often used to identify such pairs, including orthographic and context similarity, or a combination of both (Yarowsky and Wicentowski, 2000; Baroni et al, 2002; Kirschenbaum, 2013). The first approach explicitly mentioning Whole Word Morphology as linguistic foundation has been presented by (Neuvel and Fulop, 2002), followed by a graph clustering method (Janicki, 2013) and a graph-based probabilistic model (Janicki, 2015).…”
Section: Related Work (mentioning)
confidence: 99%
“…Except these unsupervised methods, there have been other approaches requiring additional information or selective input. Yarowsky and Wicentowski (2000) proposed to use labeled corpus to train a supervised method for transforming past tense in English. Rogati et al (2003) introduced a stemming model based on statistical machine translation for Arabic.…”
Section: Related Work (mentioning)
confidence: 99%
“…We currently use only orthographic features. They are used in a similar manner in [10], but our model needs less supervision and allows concatenative morphology, rather than only stem-suffix pairs. Maybe the closest work to ours is presented in [4].…”
Section: Introduction (mentioning)
confidence: 99%
“…We currently ignore allomorphic variation in suffixes. Information sources used in literature are orthographic similarity, word frequencies [10] and similar word contexts [9,1]. We currently use only orthographic features.…”
Section: Introduction (mentioning)
confidence: 99%