Employing paraphrasing tools to conceal plagiarized text is a severe
threat to academic integrity. To enable the detection of
machine-paraphrased text, we evaluate the effectiveness of five
pre-trained word embedding models combined with machine learning
classifiers and state-of-the-art neural language models. We analyze
preprints of research papers, graduation theses, and Wikipedia
articles, which we paraphrased using different configurations of the
tools SpinBot and SpinnerChief. The best-performing technique,
Longformer, achieved an average F1 score of 80.99% (F1=99.68% for
SpinBot and F1=71.64% for SpinnerChief cases), while human evaluators
achieved F1=78.4% for SpinBot and F1=65.6% for SpinnerChief cases. We
show that the automated classification alleviates shortcomings of
widely used text-matching systems, such as Turnitin and PlagScan. To
facilitate future research, all data, code, and two web
applications showcasing our contributions are openly available.