Authorship Attribution of Social Media and Literary Russian-Language Texts Using Machine Learning Methods and Feature Selection

Fedotova, Anastasia; Romanov, Aleksandr; Kurtukova, Anna; Shelupanov, Alexander

doi:10.3390/fi14010004

Cited by 14 publications

(7 citation statements)

References 39 publications

“…Some scholars emphasize that the core of public opinion guidance lies in the competition for the right to speak, think that the main elements of the right to speak are the right to speak, the right to spread, and the right to guide, and put forward a new path of public opinion guidance. Some scholars pointed out that news and public opinion are an important carrier for a political party to master the right to speak [ 9 ]. The discourse power of news public opinion should obey certain political needs and guide the trend of news public opinion.…”

Section: Literature Review (mentioning)

confidence: 99%

[Retracted] Analysis of Sino‐Russian Media Cooperation and the Construction of International Online Public Opinion Discourse under the Dual Influence of Ecological and Online Environments

Ju-xi

2022

Journal of Environmental and Public Health

In traditional media, outlets can dominate the topics and composition of public opinion. With the convergence of online media, however, that dominance has gradually shifted from a single source toward the coexistence of multiple voices, especially at the international level. The comprehensive strategic partnership between China and Russia in the new era has given bilateral relations and cooperation a new orientation and connotation. In the face of an increasingly complex international public opinion environment, media cooperation between China and Russia is crucial. Therefore, based on an in-depth analysis of how discourse power in international public opinion is constructed under the dual influence of the ecological environment and the network environment, the paper discusses that construction in terms of discourse objects, the main body, and the enrichment of international communication content.

“…• saliency maps: elements in the input that have the largest influence in the prediction are identified (e.g., LIME); • feature attribution: attributing the classification to a small number of numeric/semantic features [47,48]; • metric learning [49]: mapping out data structures by deriving a metric from a classifier (explicit Siamese networks are very popular); • activation maximization: methods that are based on GAN.…”

Section: State of the Art in Explainable AI (mentioning)

confidence: 99%

Privacy-Preserving and Explainable AI in Industrial Applications

et al. 2022

The industrial environment has gone through the fourth revolution, also called “Industry 4.0”, where the main aspect is digitalization. Each device employed in an industrial process is connected to a network called the industrial Internet of things (IIOT). With IIOT manufacturers being capable of tracking every device, it has become easier to prevent or quickly solve failures. Specifically, the large amount of available data has allowed the use of artificial intelligence (AI) algorithms to improve industrial applications in many ways (e.g., failure detection, process optimization, and abnormality detection). Although data are abundant, their access has raised problems due to privacy concerns of manufacturers. Censoring sensitive information is not a desired approach because it negatively impacts the AI performance. To increase trust, there is also the need to understand how AI algorithms make choices, i.e., to no longer regard them as black boxes. This paper focuses on recent advancements related to the challenges mentioned above, discusses the industrial impact of proposed solutions, and identifies challenges for future research. It also presents examples related to privacy-preserving and explainable AI solutions, and comments on the interaction between the identified challenges in the conclusions.

“…An important part of the literature consists of studies on English language [4,5,6,7,8]. There are also many studies done in many different languages including Japanese [9], Mongolian [10], Persian [11], Albanian [12], Indian [13,14], Brazilian [15], Russian [16,17], German [18], and Arabic [19]. When the existing studies were examined, it was seen that different types of data sets were used for author identification tasks.…”

Section: Literature Review (mentioning)

confidence: 99%

“…Some studies have been carried out on newspaper articles [4,15,18,19], while others were carried out on poems [13], novels [11,12,16], email content [20], song lyrics [21], source codes [22], or tweets, blog posts, and forums [8,9,23]. In some cases, different types of data sources were combined or compared [17,25]. Early studies in author identification focused on different stylometric techniques. These techniques are based on identification of style markers including lexical and character features or syntactic and semantic features that quantify writing style [9,26].…”

Section: Literature Review (mentioning)

confidence: 99%

Author Identification with Machine Learning Algorithms

Yülüce, Dalkılıç

2022

IJMSIT

Author identification is one of the application areas of text mining. It deals with automatically predicting the likely author of an electronic text from a set of predefined candidates by using author-specific writing styles. In this study, we conducted an experiment to identify the author of a Turkish-language text using classical machine learning methods, including Support Vector Machines (SVM), Gaussian Naive Bayes (GaussianNB), Multi-Layer Perceptron (MLP), Logistic Regression (LR), and Stochastic Gradient Descent (SGD), and ensemble learning methods, including Extremely Randomized Trees (ExtraTrees) and eXtreme Gradient Boosting (XGBoost). The proposed method was applied to author groups of three different sizes (10, 15, and 20 authors) drawn from a new dataset of newspaper articles. Term frequency-inverse document frequency (TF-IDF) vectors were created from word 1-gram and 2-gram tokens. Our results show that SGD is the most successful method with word unigrams, with a classification accuracy of 0.976, and LR is the most successful with word bigrams, with a classification accuracy of 0.935.
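As a rough illustration of the feature step described in this abstract, the sketch below builds TF-IDF vectors from word unigrams or bigrams using only the Python standard library. It is not the authors' pipeline: the toy English sentences, the author labels, and every function name are hypothetical, and a simple nearest-neighbour rule over cosine similarity stands in for the SGD and LR classifiers named in the abstract. The smoothed IDF formula is one common variant, chosen here as an assumption.

```python
import math
from collections import Counter

def word_ngrams(text, n):
    """Split lowercase text on whitespace and return its word n-grams."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def fit_idf(docs, n):
    """Smoothed inverse document frequency for each n-gram seen in docs."""
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(set(word_ngrams(doc, n)))
    total = len(docs)
    return {t: math.log((1 + total) / (1 + df)) + 1 for t, df in doc_freq.items()}

def tfidf_vector(doc, idf, n):
    """L2-normalised sparse TF-IDF vector (a dict) for one document."""
    counts = Counter(t for t in word_ngrams(doc, n) if t in idf)
    vec = {t: c * idf[t] for t, c in counts.items()}
    norm = math.sqrt(sum(v * v for v in vec.values()))
    return {t: v / norm for t, v in vec.items()} if norm else {}

def cosine(a, b):
    """Dot product of two L2-normalised sparse vectors."""
    return sum(w * b.get(t, 0.0) for t, w in a.items())

def predict_author(text, docs, authors, n=1):
    """Return the author of the training document most similar to text.

    Nearest-neighbour matching is a stand-in (an assumption of this sketch)
    for the trained classifiers mentioned in the abstract.
    """
    idf = fit_idf(docs, n)
    query = tfidf_vector(text, idf, n)
    scores = [cosine(query, tfidf_vector(d, idf, n)) for d in docs]
    best = max(range(len(docs)), key=scores.__getitem__)
    return authors[best]

# Hypothetical toy corpus: two documents per invented author.
train_docs = [
    "the market fell sharply as investors sold shares",
    "investors sold shares after the market report",
    "the team won the match with a late goal",
    "a late goal gave the team the win",
]
train_authors = ["author_a", "author_a", "author_b", "author_b"]

unigram_guess = predict_author("shares fell as investors sold", train_docs, train_authors, n=1)
bigram_guess = predict_author("shares fell as investors sold", train_docs, train_authors, n=2)
```

Passing `n=1` or `n=2` switches between the unigram and bigram settings that the abstract compares; a real experiment would replace the nearest-neighbour rule with a trained classifier and evaluate on held-out articles.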
