2022
DOI: 10.48550/arxiv.2202.09509
Preprint

PETCI: A Parallel English Translation Dataset of Chinese Idioms

Abstract: Idioms are an important language phenomenon in Chinese, but idiom translation is notoriously hard. Current machine translation models perform poorly on idiom translation, while idioms are sparse in many translation datasets. We present PETCI, a parallel English translation dataset of Chinese idioms, aiming to improve idiom translation by both human and machine. The dataset is built by leveraging human and machine effort. Baseline generation models show unsatisfactory abilities to improve translation, but struc…

Cited by 2 publications (2 citation statements)
References 26 publications
“…In the field of idiomaticity, prior works have focused on detecting idioms (Tayyar Madabushi et al, 2021; Tan and Jiang, 2021;, paraphrasing idiomatic sentences to literal paraphrases (Zhou et al, 2022), cloze task such as fill-in-the-blank language comprehension (Zheng et al, 2019), classifying idiomatic and literal expressions (Peng et al, 2015), translating idiomatic language (Tang, 2022), and generating continuations for idiomatic contexts (Chakrabarty et al, 2022).…”
Section: Introduction (mentioning)
confidence: 99%
“…In the field of idiomaticity, prior works have focused on detecting idioms (Tayyar Madabushi et al, 2021; Tan and Jiang, 2021;, paraphrasing idiomatic sentences to literal paraphrases, cloze task such as fill-in-the-blank language comprehension (Zheng et al, 2019), classifying idiomatic and literal expressions (Peng et al, 2015), translating idiomatic language (Tang, 2022), and generating continuations for idiomatic contexts (Chakrabarty et al, 2022).…”
Section: Introduction (mentioning)
confidence: 99%