This paper compares two nonparametric methods of authorship identification for English-language literature. Both methods are tested with and without clustering. The paper also proposes a procedure for selecting the n-grams that best serve as markers of an author. More than 800 texts by 16 authors were used for testing. The method based on the density of the distribution is suitable for identifying the authors of both large texts (50,000+ characters) and small ones (10,000+ characters). The method based on p-statistics is suitable only for large texts.
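As a rough illustration of the raw material such methods work with, the sketch below computes relative n-gram frequencies for a single text. It is not the paper's procedure: the abstract does not say whether the n-grams are character-level or word-level, so character n-grams are an assumption here, and the function name `ngram_frequencies`, the lowercasing step, and the default `n=2` are all hypothetical choices made for the example.

```python
from collections import Counter


def ngram_frequencies(text: str, n: int = 2) -> dict[str, float]:
    """Return the relative frequency of each character n-gram in `text`.

    Assumption: character-level n-grams over lowercased text. The source
    abstract does not specify the n-gram type or any preprocessing.
    """
    text = text.lower()
    # Slide a window of width n over the text, one character at a time.
    grams = [text[i:i + n] for i in range(len(text) - n + 1)]
    total = len(grams)
    if total == 0:
        # Text shorter than n characters: no n-grams to count.
        return {}
    counts = Counter(grams)
    return {gram: count / total for gram, count in counts.items()}


# Toy input; real use would pass a full text of 10,000+ characters.
freqs = ngram_frequencies("the theory", n=2)
```

Per-text frequency profiles of this kind are what a comparison across authors would operate on; how the paper selects marker n-grams or applies the density and p-statistics methods is not described in the abstract and is not reproduced here.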