2004
DOI: 10.1007/978-3-540-30463-0_54

Detecting Inflection Patterns in Natural Language by Minimization of Morphological Model

Abstract: One of the most important steps in text processing and information retrieval is stemming: reducing words to stems that express their base meaning, e.g., bake, baked, bakes, baking → bak-. We suggest an unsupervised method for recognizing such inflection patterns automatically, with no a priori information on the given language, based exclusively on a list of words extracted from a large text. For a given word list V we construct two sets of strings: stems S and endings E, such that each word from V …
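
The idea of splitting every word of V into a stem from S and an ending from E can be illustrated with a small sketch. The abstract is truncated before it states the exact criterion, so the objective used here (the number of distinct stems plus the number of distinct endings) is an assumption suggested by the title, and the exhaustive search is a stand-in for whatever optimization the authors actually use. The names `splits`, `model_size`, and `best_segmentation` are hypothetical.

```python
from itertools import product


def splits(word):
    """All (stem, ending) splits with a non-empty stem; the ending may be empty."""
    return [(word[:i], word[i:]) for i in range(1, len(word) + 1)]


def model_size(segmentation):
    """Assumed objective: number of distinct stems plus number of distinct endings."""
    pairs = list(segmentation)
    stems = {stem for stem, _ in pairs}
    endings = {ending for _, ending in pairs}
    return len(stems) + len(endings)


def best_segmentation(words):
    """Exhaustive search over all joint splits; feasible only for tiny word lists."""
    candidates = product(*(splits(w) for w in words))
    best = min(candidates, key=model_size)
    return dict(zip(words, best))


words = ["bake", "baked", "bakes", "baking"]
result = best_segmentation(words)
print(model_size(result.values()))
```

On this four-word list several segmentations tie at the minimum size, so the sketch alone does not single out bak- as the stem. The number of joint splits grows exponentially with the size of the word list, so a realistic vocabulary would need a heuristic search rather than enumeration.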

Cited by 21 publications (14 citation statements)
References 3 publications
“…There have been several works applying GAs [9] to different aspects of information retrieval, and also to the stemming problem [10]. Proposals devoted to the query expansion problem with GAs can be classified into relevance feedback techniques and Inductive Query by Example (IQBE) algorithms.…”
Section: Model
Citation type: mentioning
confidence: 99%
“…This allows the authors to use generation-oriented morphological models instead of developing special analysis models. In [17], an unsupervised algorithm for stemming inflectional languages is presented. According to the authors, the algorithm could be applied to agglutinative languages, with suitable modifications.…”
Section: Stemming and Lemmatization Algorithms
Citation type: unclassified
“…For the problem of word segmentation, EM is typically applied by first extracting a set of candidate multi-grams from a given training corpus [8], initializing a probability distribution over this set, and then using the standard iteration to adjust the probabilities of the multi-grams to increase the posterior probability of the training data. Somewhat similar tasks of segmenting words into morphemes, where methods use minimal length description were shown to give good results [13].…”
Section: Of Tokens T H E M O S T F A V O U R I T E M U S I C O F A L
Citation type: mentioning
confidence: 99%