2022
DOI: 10.1371/journal.pone.0265633

Entropy-based discrimination between translated Chinese and original Chinese using data mining techniques

Abstract: The present research reports on the use of data mining techniques to differentiate translated Chinese from non-translated (original) Chinese, based on monolingual comparable corpora. We operationalized seven entropy-based metrics (character entropy; wordform unigram, bigram and trigram entropy; and part-of-speech (POS) unigram, bigram and trigram entropy) from two balanced Chinese comparable corpora (translated vs. non-translated) for data mining and analysis. We then applied four data mining techn…
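The seven metrics named in the abstract can be illustrated with a minimal sketch. It assumes that each metric is the Shannon entropy (in bits) of the relative-frequency distribution of n-grams over characters, wordforms, or POS tags; the truncated abstract does not confirm the exact formula, and the function name `ngram_entropy`, the toy sentence, and the tag labels are hypothetical.

```python
import math
from collections import Counter


def ngram_entropy(tokens, n=1):
    """Shannon entropy (bits) of the n-gram distribution of a token sequence.

    Assumption: entropy is computed from raw relative frequencies,
    H = sum(p * log2(1 / p)); the paper may use a different estimator.
    """
    ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    if not ngrams:
        return 0.0
    counts = Counter(ngrams)
    total = len(ngrams)
    return sum((c / total) * math.log2(total / c) for c in counts.values())


# Toy, pre-segmented and pre-tagged input (hypothetical data, not from the corpora).
words = ["我", "喜欢", "读", "书", "我", "喜欢", "写", "字"]
pos_tags = ["PN", "VV", "VV", "NN", "PN", "VV", "VV", "NN"]
chars = list("".join(words))

# One feature vector per text: seven entropy values, as listed in the abstract.
features = {"char": ngram_entropy(chars, 1)}
for n, label in [(1, "unigram"), (2, "bigram"), (3, "trigram")]:
    features[f"word_{label}"] = ngram_entropy(words, n)
    features[f"pos_{label}"] = ngram_entropy(pos_tags, n)

for name, value in features.items():
    print(f"{name}: {value:.3f}")
```

Under this reading, each text in the translated and non-translated corpora would yield one such seven-value feature vector, which a classifier could then use as input.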

Citations: cited by 14 publications (1 citation statement)
References: 60 publications

“…Future research can use self-designed MDA by including more indices of semantic and syntactic categories. Furthermore, future research may explore other comprehensive information-theoretic metrics, such as entropy (Liu et al., 2022a, 2022b), to validate existing findings and enhance our understanding of linguistic differences between translated and non-translated chairman's statements.…”
Section: Discussion (citation type: mentioning)
confidence: 97%