Comparing Resources for Spanish Lexical Simplification

Saggion, Horacio; Bott, Stefan; Rello, Luz

doi:10.1007/978-3-642-39593-2_21

Cited by 6 publications

(3 citation statements)

References 18 publications

“…It is an online lexical database containing terms and definitions, as well as inter-word semantic relations such as hypernyms, hyponyms, synonyms, and antonyms [19]. WordNet provides 128,391 word-sense definitions in English and is also available in Spanish albeit in a less complete form [20]. Since WordNet is not a medical resource, many of its explanations are not optimal for medical text simplification and when several senses are provided for a word, it is not always clear which best suits the medical sense.…”

Section: Resources For Finding Explanations For Difficulty Terms (mentioning)

confidence: 99%

Improving Consumer Understanding of Medical Text: Development and Validation of a New SubSimplify Algorithm to Automatically Generate Term Explanations in English and Spanish (Preprint)

Kloehn, Leroy, Kauchak, et al. 2018

Preprint

Results: SubSimplify resulted in quality scores of 1.64 for English (P<.001) and 1.49 for Spanish (P<.001), which are lower than those of existing resources (Consumer Health Vocabulary [CHV]=2.81). However, in coverage SubSimplify outperforms all existing written resources, increasing the coverage from 53.0% to 80.5% in English and from 20.8% to 90.8% in Spanish (P<.001). This result means that the usefulness score of SubSimplify (1.32; P<.001) is greater than that of most existing resources (CHV=0.169).

Conclusions: Our approach is intended as an additional resource to existing, manually created resources. It greatly increases the number of difficult terms for which an easier alternative can be made available, resulting in greater actual usefulness.

“…The closest algorithm to ours is LexSiS (Bott et al, 2012; Saggion et al, 2013), that uses the Spanish OpenThesaurus and a corpus that contains 6,595 words of original and 3,912 words of manually simplified news articles. To the best of our knowledge this is the first and only lexical simplification algorithm for Spanish.…”

Section: Related Work (mentioning)

confidence: 99%

CASSA: A Context-Aware Synonym Simplification Algorithm

Baeza-Yates, Rello, Dembowski 2015

Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Langua

Self Cite

We present a new context-aware method for lexical simplification that uses two free language resources and real web frequencies. We compare it with the state-of-the-art method for lexical simplification in Spanish and the established simplification baseline, that is, the most frequent synonym. Our method improves upon the other methods in the detection of complex words, in meaning preservation, and in simplicity. Although we use Spanish, the method can be extended to other languages since it does not require alignment of parallel corpora.
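The "most frequent synonym" baseline named in this abstract can be sketched as follows. This is only an illustration of the general idea, not the CASSA or LexSiS implementation: the function name, the tiny thesaurus, and the frequency counts are all hypothetical, and treating the original word as a candidate is an assumption of this sketch.

```python
from typing import Dict, List


def most_frequent_synonym(
    word: str,
    synonyms: Dict[str, List[str]],
    freq: Dict[str, int],
) -> str:
    """Return the candidate with the highest corpus frequency.

    Assumption: the original word competes with its synonyms, so a word
    that is already the most frequent option is left unchanged.
    """
    candidates = [word, *synonyms.get(word, [])]
    # Words missing from the frequency table count as frequency 0.
    return max(candidates, key=lambda w: freq.get(w, 0))


# Hypothetical thesaurus entries and frequency counts, for illustration only.
synonyms = {"vivienda": ["casa", "morada"]}
freq = {"vivienda": 300, "casa": 900, "morada": 20}

print(most_frequent_synonym("vivienda", synonyms, freq))
print(most_frequent_synonym("casa", synonyms, freq))
```

Because the choice depends only on frequency, this baseline ignores the sentence in which the word occurs, which is the limitation a context-aware method is meant to address.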

“…It is an online lexical database containing terms and definitions, as well as interword semantic relations such as hypernyms, hyponyms, synonyms, and antonyms [19]. WordNet provides 128,391 word-sense definitions in English and is also available in Spanish, albeit in a less complete form [20]. Since WordNet is not a medical resource, many of its explanations are not optimal for medical text simplification, and when several senses are provided for a word, it is not always clear which best suits the medical sense.…”

Section: Introduction (mentioning)

confidence: 99%

Improving Consumer Understanding of Medical Text: Development and Validation of a New SubSimplify Algorithm to Automatically Generate Term Explanations in English and Spanish

Kloehn, Leroy, Kauchak, et al. 2018

J Med Internet Res

Background: While health literacy is important for people to maintain good health and manage diseases, medical educational texts are often written beyond the reading level of the average individual. To mitigate this disconnect, text simplification research provides methods to increase readability and, therefore, comprehension. One method of text simplification is to isolate particularly difficult terms within a document and replace them with easier synonyms (lexical simplification) or an explanation in plain language (semantic simplification). Unfortunately, existing dictionaries are seldom complete, and consequently, resources for many difficult terms are unavailable. This is the case for English and Spanish resources.

Objective: Our objective was to automatically generate explanations for difficult terms in both English and Spanish when they are not covered by existing resources. The system we present combines existing resources for explanation generation using a novel algorithm (SubSimplify) to create additional explanations.

Methods: SubSimplify uses word-level parsing techniques and specialized medical affix dictionaries to identify the morphological units of a term and then source their definitions. While the underlying resources are different, SubSimplify applies the same principles in both languages. To evaluate our approach, we used term familiarity to identify difficult terms in English and Spanish and then generated explanations for them. For each language, we extracted 400 difficult terms from two different article types (General and Medical topics) balanced for frequency. For English terms, we compared SubSimplify’s explanation with the explanations from the Consumer Health Vocabulary, WordNet Synonyms and Summaries, as well as Word Embedding Vector (WEV) synonyms. For Spanish terms, we compared the explanation to WordNet Summaries and WEV Embedding synonyms. We evaluated quality, coverage, and usefulness for the simplification provided for each term. Quality is the average score from two subject experts on a 1-4 Likert scale (two per language) for the synonyms or explanations provided by the source. Coverage is the number of terms for which a source could provide an explanation. Usefulness is the same expert score, however, with a 0 assigned when no explanations or synonyms were available for a term.

Results: SubSimplify resulted in quality scores of 1.64 for English (P<.001) and 1.49 for Spanish (P<.001), which were lower than those of existing resources (Consumer Health Vocabulary [CHV]=2.81). However, in coverage, SubSimplify outperforms all existing written resources, increasing the coverage from 53.0% to 80.5% in English and from 20.8% to 90.8% in Spanish (P<.001). This result means that the usefulness score of SubSimplify (1.32; P<.001) is greater than that of most existing resources (eg, CHV=0.169).

Conclusions: Our approach is intended as an additional resource to existing, manually created resources. It greatly increases the number of difficult terms for which an easier alternative can be made available,...
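The three evaluation measures defined in the abstract's Methods can be illustrated with a minimal sketch. It assumes each term already has one expert score (the mean of the two raters) or `None` when the source offers no explanation, and it expresses coverage as a fraction of terms. The function names and the sample scores are hypothetical and are not taken from the paper.

```python
from statistics import mean
from typing import Optional, Sequence

Score = Optional[float]  # None means the source had no explanation for the term


def quality(scores: Sequence[Score]) -> float:
    """Mean expert score over the terms the source actually covers."""
    covered = [s for s in scores if s is not None]
    return mean(covered) if covered else 0.0


def coverage(scores: Sequence[Score]) -> float:
    """Fraction of terms for which the source provides an explanation."""
    return sum(s is not None for s in scores) / len(scores)


def usefulness(scores: Sequence[Score]) -> float:
    """Mean expert score over all terms, counting uncovered terms as 0."""
    return mean(0.0 if s is None else s for s in scores)


# Hypothetical per-term scores on the 1-4 scale; one term is not covered.
scores = [2.0, None, 1.0, 3.0]

print(quality(scores), coverage(scores), usefulness(scores))
```

Under these definitions usefulness equals quality multiplied by coverage, since uncovered terms add nothing to the sum. The English figures in the abstract are consistent with that reading: 1.64 × 0.805 ≈ 1.32.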
