The issue of authorship attribution has long been considered and continues to be a popular topic. Because of advances in digital computers, this field has experienced rapid developments in the last decade. In this article, a survey of recent advances in authorship attribution in text mining is presented. This survey focuses on authorship attribution methods that are statistically or computationally supported as opposed to traditional literary approaches. The main aspects covered include the changes in research topics over time, basic feature metrics, machine learning techniques, and the advantages and disadvantages of each approach. Moreover, the corpus size, number of candidates, data imbalance, and result description, all of which pose challenges in authorship attribution, are discussed to inform future work.This article is categorized under:
Statistical Learning and Exploratory Methods of the Data Sciences > Text Mining