19th IEEE International Conference on Tools with Artificial Intelligence (ICTAI 2007), 2007
DOI: 10.1109/ictai.2007.41

Using the Levenshtein Edit Distance for Automatic Lemmatization: A Case Study for Modern Greek and English

Citations: cited by 14 publications (9 citation statements)
References: 48 publications
“…This can be decided with the help of the Levenshtein distance. 89 This method from computational linguistics measures the similarity between two words and is often used in dialectology to study phonetic, and thus dialectal, variation. 90 It basically counts the number of deletions, insertions or substitutions needed to transform one string into another: e.g.…”
Section: The Future Of Onomastic Research (mentioning)
confidence: 99%
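The counting described in the statement above can be sketched as follows. This is a minimal dynamic-programming version of the standard Levenshtein distance, not code from the cited paper; the function name `levenshtein` and the example words are illustrative assumptions, and they are not the example elided in the quoted statement.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character deletions, insertions, or
    substitutions needed to transform string a into string b."""
    # Keep the shorter string in the inner loop so the rows stay small.
    if len(a) < len(b):
        a, b = b, a
    # previous[j] holds the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty prefix of b
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution (or match, cost 0)
            ))
        previous = current
    return previous[-1]


# Illustrative usage with hypothetical words:
print(levenshtein("kitten", "sitting"))  # 3
```

Only two rows of the table are kept at a time, so memory grows with the shorter string while running time stays proportional to the product of the two lengths.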
“…The textual analysis phase consisted of three activities: (a) removal of stop words (i.e. articles, special characters, etc), (b) lemmatization of words using a Levenshtein distance based Greek lemmatizer [14], (c) removal of terms appearing less than 30 times within the complete article corpus and taking the 150 most frequent of them. Upon completion of the aforementioned phases, we kindly asked a domain expert (financial journalist) to annotate terms according to their genre.…”
Section: Experimental Design and Evaluation (mentioning)
confidence: 99%
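The three textual-analysis activities in the statement above could be sketched roughly as below. This is a hedged illustration under stated assumptions, not the citing authors' code: `select_terms`, `nearest_lemma`, and `edit_distance` are hypothetical names, and mapping each word to the closest entry of a lemma list is only one plausible reading of a "Levenshtein distance based" lemmatizer. The thresholds 30 and 150 come from the quoted text.

```python
from collections import Counter


def edit_distance(a: str, b: str) -> int:
    """Standard Levenshtein distance (deletions, insertions, substitutions)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,
                               current[j - 1] + 1,
                               previous[j - 1] + cost))
        previous = current
    return previous[-1]


def nearest_lemma(word, lemmas):
    """Assumption: pick the lemma with the smallest edit distance to the word
    (ties broken alphabetically so the result is deterministic)."""
    return min(lemmas, key=lambda lemma: (edit_distance(word, lemma), lemma))


def select_terms(tokens, stop_words, lemmas, min_count=30, top_k=150):
    """(a) drop stop words, (b) lemmatize, (c) drop terms seen fewer than
    min_count times and keep the top_k most frequent of the rest."""
    lemmatized = [nearest_lemma(t, lemmas) for t in tokens if t not in stop_words]
    counts = Counter(lemmatized)
    frequent = [term for term, n in counts.most_common() if n >= min_count]
    return frequent[:top_k]


# Illustrative usage on a toy corpus with lowered thresholds:
tokens = ["the", "markets", "market", "banks", "bank", "loans"]
print(select_terms(tokens, {"the"}, ["market", "bank", "loan"], min_count=2, top_k=2))
```

A real system would restrict the candidate lemmas before comparing, since scanning the whole lemma list for every token is slow on a large lexicon.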
“…In [21] and [22] the reader can find a very thorough survey of the state of the art on this topic. Examples of this type of analysis include graph comparison [20], the use of n-grams [16,34], the search for analogies [32], shallow rule-based models [38,31], probabilistic models [12], segmentation by optimization [11,19], unsupervised learning of morphological families through agglomerative hierarchical clustering [7], lemmatization using Levenshtein distances [14], and suffix identification by means of entropy [42]. These methods differ in the type of results obtained, whether the identification of lemmas, stems, or suffixes.…”
Section: Stemming and Lemmatization Algorithms (unclassified)
“…Our algorithm, which can process lists independently of the language, ranks better (95.8 and 96.8 in French and English, respectively), comparable to the Levenshtein distance algorithm presented in [14] (which achieves 96% precision in English). Studies such as [35] show that the grouping of morphological variants, similar to ours in that it requires neither prior knowledge of the language nor external resources, is of greater value than stemming or lemmatization in specific IR tasks such as query expansion.…”
Section: Evaluation (unclassified)