2021
DOI: 10.1007/s13369-020-05258-z

An Arabic Multi-source News Corpus: Experimenting on Single-document Extractive Summarization

Cited by 19 publications (6 citation statements)
References 35 publications
“…(7) Text Summarisation. This task includes five publicly available datasets, including both Arabic and multilingual data: MassiveSum (Varab and Schluter, 2021), XLSum Hasan et al (2021), Cross-Sum (Bhattacharjee et al, 2021), ANT (Chouigui et al, 2021), and MarSum (Gaanoun et al, 2022).…”
Section: Unlabeled Data (mentioning)
confidence: 99%
“…Text summarization research has been around since 1958 [17], where the problem was approached considering the frequency of terms in a document. As tools and techniques developed, recent studies applied various methods, from word and phrase counts to deep learning architectures and graph-based methods.…”
Section: Related Work (mentioning)
confidence: 99%
“…Chouigui et al [16] created an Arabic News Texts (ANT) dataset for text summarization consisting of news articles collected from five sources: Al-Arabiya, BBC, CNN, France24, and SkyNews. The study applied some of the techniques used for ATS, including LexRank, TextRank, Luhn keyword-based method [17], and Latent Semantic Analysis (LSA) to the ANT dataset. The results show that the LexRank technique provides the best results, with a BLEU score of 0.690 and a ROUGE score reaching 0.972.…”
Section: Related Work (mentioning)
confidence: 99%
“…Concerted efforts have been made to overcome those challenges by building various Arabic datasets for the task such that EASC (El-Haj et al, 2010), Kalimat (El-Haj and Koulali, 2013), TAC2011 (El-Ghannam and El-Shishtawy, 2014), ANT (Chouigui et al, 2021), and XL-Sum (Hasan et al, 2021), but those datasets have limitations in terms of diversity or size. Therefore, the demand for a diverse and large-scale dataset is crucial to advance the ATS field.…”
Section: Introduction (mentioning)
confidence: 99%
“…Arabic News Texts Corpus (ANT) and XL-Sum. ANT (Chouigui et al, 2021), and XL-Sum (Hasan et al, 2021) are the most recent works. ANT collected 31,798 documents paired with summaries using RSS feeds from 5 Arab news sources: AlArabiya, BBC, CNN, France24, and SkyNews, while XL-Sum collected 40,327 only from BBC.…”
Section: Introduction (mentioning)
confidence: 99%