Proceedings of the 28th International Conference on Computational Linguistics 2020
DOI: 10.18653/v1/2020.coling-main.208
Automatic Detection of Machine Generated Text: A Critical Survey

Abstract: Text generative models (TGMs) excel in producing text that matches the style of human language reasonably well. Such TGMs can be misused by adversaries, e.g., by automatically generating fake news and fake product reviews that can look authentic and fool humans. Detectors that can distinguish text generated by TGM from human written text play a vital role in mitigating such misuse of TGMs. Recently, there has been a flurry of works from both natural language processing (NLP) and machine learning (ML) communiti…

Cited by 73 publications (49 citation statements)
References 45 publications
“…Our work is also related to detecting machine-generated and manipulated text (Jawahar et al., 2020; Nagoudi et al., 2020).…”
Section: Related Work
confidence: 99%
“…Arabic tweets, which in turn use both Modern Standard Arabic and Dialectal Arabic (Abdul-Mageed et al., 2020).…”
Section: MARBERT, Which Is Trained On One Billion
confidence: 99%
“…Automatic Detection of Generated Text: Given the potential malicious applications of text generation (Solaiman et al., 2019), it is vital to build detectors that distinguish text generated by machines from text written by humans (Gehrmann et al., 2019; Jawahar et al., 2020; Varshney et al., 2020; Çano and Bojar, 2020). Most current work focuses on fake news detection (Rashkin et al., 2017; Zhou et al., 2019; Bhat and Parthasarathy, 2020; Zhong et al., 2020; Schuster et al., 2020; Ippolito et al., 2020).…”
Section: Related Work
confidence: 99%
“…The baselines and models are trained on texts from the GPT-2 small model and then used to detect texts generated by unseen GPT-style models under pure sampling: GPT-2 Medium (345M), GPT-2 Large (762M), and GPT-2 XL (1542M). Note that this setting is the most challenging, as it requires transfer from the smallest model to models with far larger parameter counts (Jawahar et al., 2020).…”
Section: Robustness Towards Unseen Models
confidence: 99%
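The transfer setting in the statement above — train a detector on one generator's output, test it on text from unseen, larger generators — can be illustrated with a minimal sketch. Everything here is invented for illustration: the toy corpora, the function names (`train_unigram`, `detect`), and the unigram likelihood-ratio test, which stands in for the much stronger neural detectors studied in the cited work.

```python
from collections import Counter
import math

def train_unigram(texts):
    """Fit a unigram language model (word -> probability) with add-one smoothing."""
    counts = Counter(w for t in texts for w in t.split())
    total = sum(counts.values())
    vocab = len(counts) + 1  # +1 reserves mass for unseen words
    return lambda w: (counts[w] + 1) / (total + vocab)

def detect(text, machine_lm, human_lm, threshold=0.0):
    """Flag text as machine-generated if the log-likelihood ratio favors machine_lm."""
    score = sum(math.log(machine_lm(w)) - math.log(human_lm(w)) for w in text.split())
    return score > threshold

# Toy corpora standing in for "source generator" samples and human text (hypothetical data).
machine_texts = ["the model generates fluent text", "the model writes fluent text"]
human_texts = ["I walked my dog in the park", "we had coffee with friends"]

machine_lm = train_unigram(machine_texts)
human_lm = train_unigram(human_texts)

# "Unseen model" sample: the wording differs from the training samples but stays
# distributionally close to the machine corpus, so the detector still flags it.
print(detect("the model generates fluent prose", machine_lm, human_lm))  # prints True
```

The transfer question the cited experiment asks is whether such a decision rule, fit on the smallest generator, still separates the classes when the test text comes from a larger generator whose output drifts toward the human distribution.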