ParaMor: Finding Paradigms across Morphology

Monson, Christian; Carbonell, Jaime G.; Lavie, Alon; Levin, Lori

doi:10.1007/978-3-540-85760-0_115

Cited by 17 publications

(22 citation statements)

References 9 publications


“…Such languages have a complex morphology, which cannot be covered by hand-made lexical resources. Some studies aim at solving this problem by learning inflectional paradigms from raw text corpora by clustering word forms in the corpus and analyzing the resulting clusters ([9,8,3]). Other unsupervised methods applied to morphology induction are that of [15], [6] and [5], the latter using morphemes to encode a corpus by grouping morphemes into structures, called signatures, representing inflectional paradigms.…”

Section: Related Work (mentioning)

confidence: 99%

Making Morphologies the “Easy” Way

Novák

2015

Computational Linguistics and Intelligent Text Processing


1 MTA-PPKE Hungarian Language Technology Research Group, 2 Faculty of Information Technology and Bionics, Pázmány Péter Catholic University, 50/a Práter street, 1083 Budapest, Hungary, novak.attila@itk.ppke.hu

Abstract. Computational morphologies often consist of a lexicon and some rule component, the creation of which requires various competences and considerable effort. Such a description, on the other hand, makes an easy extension of the morphology with new lexical items possible. Most freely available morphological resources, however, contain no rule component. They are usually based on just a morphological lexicon, containing base forms and some information (often just a paradigm ID) identifying the inflectional paradigm of the word, possibly augmented with some other morphosyntactic features. The aim of the research presented in this paper was to create an algorithm that makes the integration of new words into such resources similarly easy to the way a rule-based morphology can be extended. This is achieved by predicting the correct paradigm for words not present in the lexicon. The supervised machine learning algorithm described in this paper is based on longest matching suffixes and lexical frequency data, and is demonstrated and evaluated for Russian.

Keywords: morphology, paradigm prediction, Russian
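The abstract above describes predicting a paradigm from longest matching suffixes. A minimal sketch of that general idea follows; it is not the paper's algorithm. The function names, the paradigm labels, the transliterated toy lexicon, and the `max_suffix_len` cutoff are all hypothetical, and the corpus frequency data the abstract mentions is reduced here to simple counts within the training lexicon.

```python
from collections import Counter, defaultdict

# Hypothetical toy lexicon: (base form, paradigm ID). Labels are illustrative only.
TOY_LEXICON = [
    ("kniga", "N_fem_a"),
    ("ruka", "N_fem_a"),
    ("stol", "N_masc"),
    ("dom", "N_masc"),
    ("okno", "N_neut_o"),
]

def train_suffix_model(lexicon, max_suffix_len=5):
    """Count how often each word-final suffix co-occurs with each paradigm ID."""
    model = defaultdict(Counter)
    for lemma, paradigm in lexicon:
        for k in range(1, min(max_suffix_len, len(lemma)) + 1):
            model[lemma[-k:]][paradigm] += 1
    return model

def predict_paradigm(word, model, max_suffix_len=5):
    """Return the most frequent paradigm of the longest suffix seen in training.

    Returns None when no suffix of the word occurs in the model.
    """
    for k in range(min(max_suffix_len, len(word)), 0, -1):
        counts = model.get(word[-k:])
        if counts:
            return counts.most_common(1)[0][0]
    return None

model = train_suffix_model(TOY_LEXICON)
```

Trying longer suffixes first means a specific ending overrides a more general one, which is the design choice the phrase "longest matching suffixes" suggests.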



“…The baseline ParaMor algorithm which we extend here competed in the English and German tracks of Morpho Challenge 2007 (Monson et al, 2007b). The peer operated competitions of the Morpho Challenge series standardize the evaluation of unsupervised morphology induction algorithms (Kurimo et al, 2007a;.…”

Section: Unsupervised Morphology Induction (mentioning)

confidence: 99%

“…Although Goldsmith (2001) and Goldsmith and Hu (2004) discuss ideas for segmenting individual words into more than two morphemes, the implemented Linguistica algorithm, as presented in Goldsmith (2006), permits at most a single morpheme boundary in each word. Second, ParaMor decouples the task of paradigm identification from that of word segmentation (Monson et al, 2007b). In contrast, morphology models in Linguistica inherently encode both a belief about paradigm structure on individual words as well as a segmentation of those words.…”

Section: Related Work (mentioning)

confidence: 99%

“…Indeed, in our random sample of 100 schemes, 51 of the 59 schemes with morpheme boundary errors incorrectly hypothesized a boundary stem-internally. For this reason, the baseline ParaMor algorithm already discarded schemes that likely misplace a boundary stem-internally (Monson et al, 2007b). Although there are fewer schemes that misplace a morpheme boundary suffix-internally, suffix-internal error schemes contain short suffixes that can generalize to segment a large number of word forms.…”

Section: Correcting Morpheme Boundary Errors (mentioning)

confidence: 99%


Evaluating an agglutinative segmentation model for ParaMor

Monson

Lavie

Carbonell

et al. 2008

Proceedings of the Tenth Meeting of ACL Special Interest Group on Computational Morphology and Phonology - SigMorPhon '08

Self Cite


We study a transfer learning framework where source and target datasets are heterogeneous in both feature and label spaces. Specifically, we do not assume explicit relations between source and target tasks a priori, and thus it is crucial to determine what and what not to transfer from source knowledge. Towards this goal, we define a new heterogeneous transfer learning approach that (1) selects and attends to an optimized subset of source samples to transfer knowledge from, and (2) builds a unified transfer network that learns from both source and target knowledge. This method, termed "Attentional Heterogeneous Transfer", along with a newly proposed unsupervised transfer loss, improve upon the previous state-of-the-art approaches on extensive simulations as well as a challenging hetero-lingual text classification task.


“…In addition, there are 7 new papers on this topic by Bernhard (2007), Bordag (2007), Chan (2007), McNamee (2007), Monson et al (2007), Pitler and Keshava (2007), and Tepper (2007). The approach described in this paper is a direct extension of Zeman (2007) and we will frequently refer to him.…”

Section: Introduction (mentioning)

confidence: 99%

Using Unsupervised Paradigm Acquisition for Prefixes

Zeman

2009

Lecture Notes in Computer Science


We describe a simple method of unsupervised morpheme segmentation of words in an unknown language. All that is needed is a raw text corpus (or a list of words) in the given language. The algorithm identifies word parts occurring in many words and interprets them as morpheme candidates (prefixes, stems and suffixes). New treatment of prefixes is the main innovation over Zeman (2007). After filtering out spurious hypotheses, the list of morphemes is applied to segment input words. Official Morpho Challenge 2008 evaluation is given along with some additional experiments evaluated unofficially. We also analyze and discuss errors with respect to the evaluation method.
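The pipeline this abstract outlines (collect word parts that recur across many words, filter them, then segment) can be illustrated with a deliberately simplified sketch. It is an assumption-laden toy, not Zeman's method: it handles suffixes only, replaces the filtering step with a bare count threshold, and the function names and the `min_stem_len` and `min_count` parameters are hypothetical.

```python
from collections import Counter

def candidate_suffixes(words, min_stem_len=4, min_count=2):
    """Collect word endings that occur in at least min_count distinct words.

    Only splits that leave a stem of at least min_stem_len characters are
    considered. The count threshold is a crude stand-in for real filtering.
    """
    counts = Counter()
    for w in set(words):
        for i in range(min_stem_len, len(w)):
            counts[w[i:]] += 1
    return {s for s, c in counts.items() if c >= min_count}

def segment(word, suffixes, min_stem_len=4):
    """Split off the longest known suffix; return (word, "") if none applies."""
    # Scanning split points left to right tries the longest suffix first.
    for i in range(min_stem_len, len(word)):
        if word[i:] in suffixes:
            return word[:i], word[i:]
    return word, ""

WORDS = ["walked", "talked", "walking", "talking", "jumped"]
SUFFIXES = candidate_suffixes(WORDS)
```

With a shorter minimum stem length the same word list would also admit endings such as "ked" and "king", which shows why a filtering step for spurious hypotheses is needed.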

