2022
DOI: 10.1007/978-3-031-14841-5_37

Ukrainian News Corpus as Text Classification Benchmark

Cited by 4 publications (5 citation statements)
References 7 publications
“…There are many different corpora for training text classification models that have become an important resource for research and applications in the field of natural language processing [17,18]. These corpora represent different types of text data from different sources and cover a wide range of topics.…”
Section: Methods (mentioning)
confidence: 99%
“…The corpus has been scraped from seven Ukrainian news websites: BBC News Ukraine, NV (New Voice Ukraine), Ukrainian Pravda, Economic Pravda, European Pravda, Life Pravda, and Unian. The corpus was developed by Ukrainian computer scientists [23]. The researchers provide a detailed overview of the data preprocessing steps and the number of texts in the train/test split (57789/24765, respectively).…”
Section: Methods (mentioning)
confidence: 99%
“…The Kaggle platform offers two training splits drawn from the existing sample: large (57460 texts) and small (9299 texts). In their paper, the authors report their models' performance scores on these two training splits [23]. We rely on the training splits because the original train/test split is not publicly available.…”
Section: Methods (mentioning)
confidence: 99%
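The statement above describes substituting the publicly available Kaggle training splits for the unavailable original train/test split. Below is a minimal sketch of how such splits might be loaded and re-divided for evaluation; the file names, the "text"/"label" column names, and the use of pandas and scikit-learn are assumptions for illustration, not details taken from the cited papers.

```python
# Minimal sketch: load hypothetical CSV exports of the two Kaggle training
# splits and carve a held-out evaluation set out of the large split, since
# the original test split is not public. File and column names are assumed.
import pandas as pd
from sklearn.model_selection import train_test_split

large_split = pd.read_csv("train_large.csv")  # ~57460 texts (assumed file name)
small_split = pd.read_csv("train_small.csv")  # ~9299 texts (assumed file name)

# Stratified hold-out from the large split to stand in for a test set.
train_df, eval_df = train_test_split(
    large_split,
    test_size=0.2,
    stratify=large_split["label"],  # assumed label column
    random_state=42,
)

print(f"train: {len(train_df)}, eval: {len(eval_df)}, small split: {len(small_split)}")
```

A stratified split keeps the class distribution of the news categories comparable between the training and evaluation portions, which matters when category sizes are unbalanced.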