2022
DOI: 10.1016/j.jksuci.2020.04.009

Identifying cross-lingual plagiarism using rich semantic features and deep neural networks: A study on Arabic-English plagiarism cases

Cited by 14 publications (9 citation statements)
References 76 publications
“…[115]. Plagiarism detection is the process of identifying all forms of cheating in scientific papers in the academic field, whether intentionally or unintentionally [61]. The taxonomy of plagiarism detection is divided into two, namely crosslingual and mono-lingual, where the characteristic of crosslingual is the presence of language differences between the source text and the suspicious text.…”
Section: Text Plagiarism (mentioning)
confidence: 99%
“…Verbs in English source texts and suspicious Arabic texts can be identified using SRL. Apart from that, BabelNet can also provide a list of verbs that are similar to Arabic and English, which then calculates the similarity vector between two predicates from different languages using the wup similarity metric [61].…”
Section: Text Plagiarism (mentioning)
confidence: 99%
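
The "wup" metric named in the statement above is the Wu-Palmer similarity, which scores two concepts by the depth of their deepest shared ancestor relative to their own depths: 2 * depth(lcs) / (depth(a) + depth(b)). The following is a minimal sketch of that formula only, assuming both predicates have already been mapped to nodes in one shared taxonomy. The `HYPERNYM` tree and the function names are hypothetical and stand in for a real resource such as BabelNet; this is not the cited paper's implementation.

```python
# Toy hypernym tree (child -> parent). Purely illustrative, NOT BabelNet data.
HYPERNYM = {
    "act": None,            # root
    "move": "act",
    "communicate": "act",
    "walk": "move",
    "run": "move",
    "say": "communicate",
}


def path_to_root(node):
    """Return the chain [node, parent, ..., root]."""
    path = []
    while node is not None:
        path.append(node)
        node = HYPERNYM[node]
    return path


def depth(node):
    """Depth counted in nodes, so the root has depth 1."""
    return len(path_to_root(node))


def wup_similarity(a, b):
    """Wu-Palmer similarity: 2 * depth(lcs) / (depth(a) + depth(b))."""
    ancestors_a = set(path_to_root(a))
    # Walking up from b, the first node that also subsumes a is the
    # deepest common subsumer (lcs) of the two concepts.
    lcs = next(n for n in path_to_root(b) if n in ancestors_a)
    return 2 * depth(lcs) / (depth(a) + depth(b))


if __name__ == "__main__":
    print(wup_similarity("walk", "run"))   # siblings under "move"
    print(wup_similarity("walk", "say"))   # share only the root
```

In this toy tree, siblings such as "walk" and "run" score higher than "walk" and "say", which meet only at the root. In a cross-lingual setting the same computation would presumably run over language-independent concept identifiers rather than English strings.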
“…It consists of 547 aligned passages from 58,911 pairs from the United Nations Parallel Corpora [32], the OPUS collection of translated texts from the web [33] and King Saud University corpus [34]. We used another corpus prepared by [35] which has roughly 2085 of paraphrased translated pairs which will be used when evaluating only paraphrasing cases.…”
Section: Construction and Properties Of The Corpus (mentioning)
confidence: 99%
“…Deep learning models, such as neural machine translation (NMT) and transformer-based architectures, have played a significant role in enhancing translation quality by capturing complex linguistic patterns and nuances [2]. Moreover, the integration of large-scale multilingual datasets and the application of techniques like transfer learning have further improved the performance of cross-language translation systems [3]. Additionally, the emergence of pre-trained language models, such as BERT (Bidirectional Encoder Representations from Transformers) and GPT (Generative Pre-trained Transformer), has paved the way for more contextually aware and semantically accurate translations [4].…”
Section: Introduction (mentioning)
confidence: 99%