2000
DOI: 10.1007/3-540-45715-1_40

Using Syntactic Dependency-Pairs Conflation to Improve Retrieval Performance in Spanish

Abstract: This article presents two new approaches for term indexing which are particularly appropriate for languages with a rich lexis and morphology, such as Spanish, and need few resources to be applied. At word level, productive derivational morphology is used to conflate semantically related words. At sentence level, an approximate grammar is used to conflate syntactic and morphosyntactic variants of a given multi-word term into a common base form. Experimental results show remarkable improvements with re…

Citations: cited by 13 publications (6 citation statements)
References: 7 publications
“…The tool segments the text, lemmatizes it, and assigns morphological tags to words and punctuation marks, with an accuracy of around 95% [8]. We chose lemmatization as the form of linguistic normalization because, for languages with more complex inflectional morphology such as Portuguese, simple stemming algorithms are often insufficient, besides having a higher computational cost [16].…”
Section: Seleção Dos Termos (Term Selection)
Citation type: unclassified
“…The kernel of the grammar used by the parser is inferred from the basic trees corresponding to noun phrases and their syntactic and morpho-syntactic variants [6,10]:…”
Section: The Shallow Parser
Citation type: mentioning
confidence: 99%
“…Once the basic trees of noun phrases and their variants have been established, they are compiled into a set of regular expressions, which are matched against the tagged texts in order to extract the dependency pairs, which are used as index terms, as is described in [10]. In this way, we can identify dependency pairs through simple pattern matching over the output of the tagger/lemmatizer, dealing with the problem by means of finite-state techniques, leading to a considerable reduction of the running cost.…”
Section: The Shallow Parser
Citation type: mentioning
confidence: 99%
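
The statement above describes compiling noun-phrase patterns into regular expressions and matching them against tagger/lemmatizer output to extract dependency pairs. A minimal sketch of that general idea follows; it is not the cited authors' implementation. The one-letter tagset (`N`, `A`, `P`, `D`), the two patterns, and the function name `extract_pairs` are all assumptions made for illustration.

```python
import re

# Assumed one-character tagset (hypothetical, not from the cited papers):
# N = noun, A = adjective, P = preposition, D = determiner.
# Each pattern is matched against the string of tags; the lambda maps a
# match to the token positions of (head, modifier).
PATTERNS = [
    # noun + adjective: the adjective modifies the noun
    (re.compile(r"NA"), lambda m: (m.start(), m.start() + 1)),
    # noun (+ adjective) + preposition (+ determiner) + noun:
    # the second noun modifies the first
    (re.compile(r"NA?PD?N"), lambda m: (m.start(), m.end() - 1)),
]


def extract_pairs(tagged):
    """Return (head lemma, modifier lemma) pairs from (lemma, tag) tokens."""
    # One character per token, so match offsets are also token indices.
    tags = "".join(tag for _, tag in tagged)
    lemmas = [lemma for lemma, _ in tagged]
    pairs = []
    for pattern, positions in PATTERNS:
        for match in pattern.finditer(tags):
            head, modifier = positions(match)
            pairs.append((lemmas[head], lemmas[modifier]))
    return pairs


# Two surface variants, already lemmatized, that share one dependency pair.
variant_a = [("caída", "N"), ("de", "P"), ("el", "D"), ("precio", "N")]
variant_b = [("caída", "N"), ("fuerte", "A"), ("de", "P"), ("precio", "N")]
print(extract_pairs(variant_a))
print(extract_pairs(variant_b))
```

Because every tag is a single character, a match offset in the tag string doubles as a token index, which keeps the extraction a plain finite-state pass over the tagged text. Both variants yield the pair `("caída", "precio")`, which is the kind of conflation into a common index term that the abstract describes.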
“…Given a stream of tagged words, the parser module, described in [11], tries to obtain the head-modifier pairs [Footnote 1: In Spanish, Javier is a traditional first name, Pérez is a traditional family name, del is the result of contracting the preposition de (of) and the definite article el (the), and Río is the common noun river. The use of common nouns as part of a family name (in this case Pérez del Río) is a typical phenomenon in Spanish.]…”
Section: The Shallow Parser
Citation type: mentioning
confidence: 99%