The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval 2018
DOI: 10.1145/3209978.3210177

HyPlag

Abstract: Current plagiarism detection systems reliably find instances of copied and moderately altered text, but often fail to detect strong paraphrases, translations, and the reuse of non-textual content and ideas. To improve upon the detection capabilities for such concealed content reuse in academic publications, we make four contributions: i) We present the first plagiarism detection approach that combines the analysis of mathematical expressions, images, citations and text. ii) We describe the implementation of th…

Cited by 39 publications (6 citation statements)
References 20 publications
“…Approaches analyzing nontextual content features, such as academic citations [16,28], images [25,11], and mathematical content [26,29], complement the text analysis approaches to improve the detection of concealed plagiarism.…”
Section: Related Work (mentioning)
confidence: 99%
“…However, only a few studies have addressed the detection of plagiarism in digital mathematical libraries [19,21,22] regardless of the detection approach. We briefly describe the main findings of these studies hereafter.…”
Section: Background and Related Work (mentioning)
confidence: 99%
“…In a follow-up study, Meuschke et al (2019) introduced similarity measures that consider the order of mathematical identifiers and presented a two-stage retrieval process consisting of a candidate retrieval and a detailed analysis stage that replaced the exclusive use of pairwise document comparisons [22]. They implemented the process in the HyPlag prototype that also offers a user interface to investigate the identified similarities [21]. The candidate retrieval stage employs efficient index-based retrieval methods based on mathematical features.…”
Section: Background and Related Work (mentioning)
confidence: 99%
“…The exact matches are usually identified based on the features of character-based or word-based n-gram. The same character n-grams [5][6][7] or word n-grams [8][9][10][11][12] can be viewed as a pair of exact plagiarism match. The creating matches mainly rely on the features of text similarity.…”
Section: Related Work (mentioning)
confidence: 99%
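
The last statement above describes exact matches as n-grams shared between two documents. A minimal sketch of that idea for word n-grams follows. It assumes whitespace tokenization, lowercasing, and n = 3, none of which is specified in the quoted text, and the function names `word_ngrams` and `exact_matches` are hypothetical, not taken from HyPlag or the citing paper.

```python
def word_ngrams(text, n=3):
    """Return the set of lowercase word n-grams in text (whitespace tokens; an assumption)."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def exact_matches(suspicious, source, n=3):
    """Return the word n-grams that occur in both documents."""
    return word_ngrams(suspicious, n) & word_ngrams(source, n)


# Illustrative inputs only; not data from any cited study.
source = "the quick brown fox jumps over the lazy dog"
suspicious = "a quick brown fox jumps across the road"
shared = exact_matches(suspicious, source)
```

Here `shared` holds the two trigrams common to both sentences, ("quick", "brown", "fox") and ("brown", "fox", "jumps"). A character n-gram variant would slide the same window over the characters of the text instead of over its words.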