2007
DOI: 10.1075/term.13.2.06viv

Evaluation of terms and term extraction systems

Abstract: Term extraction may be defined as a text mining activity whose main purpose is to obtain all the terms included in a text of a given domain. Since the eighties, and mainly due to the rapid scientific advances as well as the evolution of the communication systems, there has been a growing interest in obtaining the terms found in written documents. A number of techniques and strategies have been proposed for satisfying this requirement. At present it seems that term extraction has reached a maturity stage. Never…

citations
Cited by 54 publications
(31 citation statements)
references
References 9 publications
“…In the extraction of medicine terms and correlated areas, for instance, it is common to identify morphemes, whether radical or affix morphemes, with a Greek or Latin origin, such as stated by Vivaldi and Rodriguez [15] in 'artri/o-' ('arthr(o)-'), from the Greek 'arthros', in 'artrite' ('arthritis'). In ATE, it is possible to identify candidate terms from the domains of the nanoscience and nanotechnology based on the identification of morphemes such as nano-, as this term composes several simple terms (for example:…”
Section: The Linguistic Approach (mentioning)
confidence: 99%
“…Concerning ATE based on texts in Brazilian Portuguese, the best results also reach an 80% precision [9], when using the hybrid extractor EχATOLP [19]. The hybrid extractor YATE [15], proposed for the extraction of candidates from the Spanish corpora on the medicine domain, is one of the few in the literature that reaches a 98% precision rate. This extractor is characterized by the use of varied linguistic knowledge, such as morphological, syntactic, and semantic, together with statistical measures, considerably improving the extraction results.…”
Section: Introduction (mentioning)
confidence: 99%
“…There are also studies that consider different fixed values of candidates [2,12–14]. Other studies explore a variation of result combinations (precision and recall, usually) [5,15]. There are studies that compare some measures used to extract terms [14,16,17].…”
Section: Related Work (mentioning)
confidence: 99%
“…These methods exploit data-centric, data-sensitive techniques for mining and organizing terms. Evaluation of these methods, as described in Vivaldi and Rodríguez (2007) and Nazarenko and Zargayouna (2009), is inherently a difficult task. Regardless of the employed metric and method for the performance comparison of CT algorithms, however, choosing a shared dataset consisting of a fixed set of documents, which can be accessed freely and easily, is a major step towards alleviating a number of obstacles in the evaluation process.…”
Section: Introduction (mentioning)
confidence: 99%