2022
DOI: 10.20944/preprints202203.0303.v1
Preprint

TWIENG: A Multi-Domain Twi-English Parallel Corpus for Machine Translation of Twi, a Low-Resource African Language

Abstract: A Twi-English parallel corpus is certainly an important resource for Machine Translation of Twi (ISO 639-3), a Low-Resource African Language (LRAL) spoken mainly in Ghana and Ivory Coast. Currently, a large-scale multi-domain Twi-English parallel corpus is still unavailable, partly due to the difficulty and arduous effort required in its design. In this paper, we present TWIENG: a large-scale multi-domain Twi-English parallel corpus. We crawled the sentences from the web using web crawlers, transla…

Citations: cited by 3 publications (2 citation statements)
References: 9 publications
“…Students have thus been known to complete high-quality assessments without putting any significant effort - in this regard, language lecturers sometimes witness work submissions that contain errors from AI-software-generated tools (Rudolph et al., 2023; Sharma et al., 2022). Moreover, while these AI tools have improved significantly in recent years, they are not always accurate and can produce awkward or nonsensical translations (Afram et al., 2022). In these cases, language students in developing contexts sometimes use AI to cheat by using machine translation tools to translate their assignments from their native language to the target language (Klimova et al., 2023; Shiri, 2023; Straume & Anson, 2022).…”
Section: Academic Dishonesty On Steroids
Citation type: mentioning
confidence: 99%
“…The authors of [39] present the Twieng corpus, a small English-Twi parallel corpus with 5,419 sentences. The corpus is based on online news portals, Twi literature, the Ghanaian Parliamentary Hansard, the Twi-English Bible, and Social Media crowdsourcing and has a focus on socio-cultural, educational and legal issues.…”
Section: Parallel Corpora For Twi
Citation type: mentioning
confidence: 99%