Findings of the Association for Computational Linguistics: ACL 2022, 2022
DOI: 10.18653/v1/2022.findings-acl.69

Mukayese: Turkish NLP Strikes Back

Abstract: Having sufficient resources for language X lifts it from the under-resourced languages class, but not necessarily from the under-researched class. In this paper, we address the problem of the absence of organized benchmarks in the Turkish language. We demonstrate that languages such as Turkish are left behind the state-of-the-art in NLP applications. As a solution, we present MUKAYESE, a set of NLP benchmarks for the Turkish language that contains several NLP tasks. We work on one or more datasets for each benc…

Cited by 12 publications
(5 citation statements)
References 23 publications
“…Despite relatively lagging progress in active Natural Language Processing (NLP) endeavors focused on Turkish, substantial advancements have been made in recent years. Mukayese stands out as a noteworthy effort, providing a comprehensive benchmarking study for various NLP tasks pertaining to the Turkish language [36]. In [37], an elaborate finite automaton model encompassing Turkish grammar rules alongside developing tools for stemming, morphological labeling, and verb negation was devised in Turkish.…”
Section: Natural Language Processing (NLP)
Citation type: mentioning
confidence: 99%
“…Since the platform has been designed for Turkish, it includes tools that are specific to Turkish such as diacritic restorer. A recently-developed platform is Mukayese (Safaya et al, 2022), which is a benchmarking platform that provides a set of datasets and benchmarks for seven different types of Turkish NLP tasks. The Mukayese platform is also a part of the Turkish Data Depository (TDD) project for building a repository of Turkish NLP resources.…”
Section: Related Work
Citation type: mentioning
confidence: 99%
“…We then test both the out-of-domain and in-domain procedural summarization models. Similarly, we experiment with both language-specific decoder models such as TR-BART (Safaya et al, 2022), and multilingual decoder models such as mBART and mT5 (Xue et al, 2021), described in Appendix A.3. We use the standard ROUGE metrics for evaluation.…”
Section: Summarization
Citation type: mentioning
confidence: 99%