2020
DOI: 10.14710/jtsiskom.8.2.2020.140-149

Retrieval of source documents in a text reuse system

Abstract: The architecture of the text-reuse detection system consists of three main modules: source retrieval, text analysis, and knowledge-based postprocessing. Each module plays an important role in the accuracy of the detection output. This research therefore focuses on developing a source retrieval system for cases where the source documents have been obfuscated at different levels. Two steps of term weighting were applied to retrieve such documents. The first was the local-word weighting, which has been a…

Cited by 3 publications (2 citation statements)
References: 7 publications
“…The meaningless words are commonly referred to as stopwords. Some examples of stopwords are "juga," "dan," "untuk," and "adalah" [21]. It is necessary to delete these stopwords because if conjunctions frequently appear in a sentence, the text similarity percentage is very high, and it interferes with the accuracy of the text similarity method [22].…”
Section: Preprocessing; citation type: mentioning
confidence: 99%
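
The effect described in the citation statement above can be illustrated with a minimal sketch. It assumes a simple token-set Jaccard similarity, which is a stand-in chosen for illustration and not necessarily the similarity method used in the citing paper. The stopword list contains only the four examples quoted in the statement, and the two sample sentences are hypothetical.

```python
import re

# Illustrative stopword list: only the four examples quoted in the statement.
# A real Indonesian stopword list would be much longer.
STOPWORDS = {"juga", "dan", "untuk", "adalah"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def remove_stopwords(tokens):
    """Drop every token that appears in the stopword list."""
    return [t for t in tokens if t not in STOPWORDS]


def jaccard(tokens_a, tokens_b):
    """Jaccard similarity between two token lists, treated as sets."""
    set_a, set_b = set(tokens_a), set(tokens_b)
    if not set_a and not set_b:
        return 0.0
    return len(set_a & set_b) / len(set_a | set_b)


# Hypothetical sentences on unrelated topics that share only stopwords.
doc_a = "kucing dan anjing adalah hewan untuk rumah"
doc_b = "mobil dan motor adalah kendaraan untuk jalan"

tokens_a, tokens_b = tokenize(doc_a), tokenize(doc_b)

raw_score = jaccard(tokens_a, tokens_b)
filtered_score = jaccard(remove_stopwords(tokens_a), remove_stopwords(tokens_b))

print(f"with stopwords:    {raw_score:.3f}")
print(f"without stopwords: {filtered_score:.3f}")
```

In this sketch the two sentences overlap only in "dan", "adalah", and "untuk", so the raw score is nonzero even though the topics differ; after stopword removal the score drops to zero.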
“…The work process of IPD system is only based on the imitation of human expertise in recognizing parts of the text that experience a change in writing style as a sign of copy or paste text without comparing with other text [3]. EPD system process compares each document inputted with each document contained in the corpus to compare similarity [4]. Corpus must have several documents that have the same topic with the source of plagiarism to know the test of document similarity level.…”
Section: Introduction; citation type: mentioning
confidence: 99%