In this paper, we present a study on identifying the national variety of English that authors use in social media texts. In data from Facebook and Twitter, information about each author's social profile is annotated, and the national English variety (US, UK, AUS, CAN, NNS) that each author uses is attributed. We tested four feature types: formal linguistic features, POS features, lexicon-based features related to the different varieties, and data-based features from each English variety. We used various machine learning algorithms for the classification experiments and implemented a feature selection process. With the 31 highest-ranked features, classification accuracy reached up to 77.32%. The experimental results are evaluated, and the efficacy of the ranked features is discussed.
In this paper, we present a study on identifying stance-related features in text data from social media. Building on our previous work on stance and our findings on stance patterns, we detected stance-related characteristics in a data set from Twitter and Facebook. We extracted various corpus-based, quantitative, and computational features that proved to be significant for six stance categories (contrariety, hypotheticality, necessity, prediction, source of knowledge, and uncertainty), and we tested them in our data set. The results of a preliminary clustering method are presented and discussed as a starting point for future contributions in the field. Our experiments showed a strong correlation between different characteristics and stance constructions, which can lead to a methodology for automatic stance annotation of these data.