2022
DOI: 10.1007/978-3-030-99739-7_46

Automatic Simplification of Scientific Texts: SimpleText Lab at CLEF-2022

Abstract: The Web and social media have become the main source of information for citizens, with the risk that users rely on shallow information in sources prioritizing commercial or political incentives rather than the correctness and informational value. Non-experts tend to avoid scientific literature due to its complex language or their lack of prior background knowledge. Text simplification promises to remove some of these barriers. The CLEF 2022 SimpleText track addresses the challenges of text simplification appro…

Cited by 8 publications (7 citation statements)
References: 34 publications
“…Formal processes of text simplification are varied (Siddharthan, 2014; François, 2018; Garbacea et al, 2021; Ermakova et al, 2022). In this regard, Garbacea et al (2021) emphasize the persistent ambiguity of this phrase, which can refer to different linguistic levels, that is, lexical, syntactic, and semantic, and the way it is conducted, whether manual or automatic, etc.…”
Section: Discussion: Text Simplification and ... (mentioning)
confidence: 99%
“…However, researchers have studied text rewriting in IR for personalization and text simplification. While text simplification has been shown to improve readability and understanding in medical [24] and scientific texts [12], it is usually done by swapping relatively unfamiliar words with more common alternative words [24] or leveraging large-scale language models for complete rewriting of the text [34]. In this setting, a certain degree of information distortion is acceptable, as the text rewritten with such methods might differ from the original due to word substitutions.…”
Section: Conversational Information Seeking (mentioning)
confidence: 99%
“…Train dataset. For this task, data is two-fold: Medicine and Computer Science, as these two domains are the most popular on forums like ELI5 [12,29]. As in 2021, for Computer Science, we use scientific abstracts from the Citation Network Dataset: DBLP+Citation, ACM Citation network (12th version) [10].…”
Section: Evaluation Framework (mentioning)
confidence: 99%
“…Train dataset As for Task 2: What is unclear?, we provided a parallel corpus of simplified sentences from two domains: Medicine and Computer Science (see Section 4.1). As previously, we use scientific abstracts from the DBLP Citation Network Dataset for Computer Science and Google Scholar and PubMed articles on muscle hypertrophy and health Medicine [10,12].…”
Section: Evaluation Framework (mentioning)
confidence: 99%