This paper describes the system used by TeamX in SemEval-2014 Task 9 Subtask B. The system is a sentiment analyzer based on a supervised text categorization approach and is designed around the following two concepts. Firstly, since lexicon features were shown to be effective in SemEval-2013 Task 2, various lexicons and pre-processors for them are introduced to enhance lexical information. Secondly, since the distribution of sentiment in tweets is known to be unbalanced, a weighting scheme is introduced to bias the output of the machine learner. For the test run, the system was tuned towards Twitter texts and achieved high scores on Twitter data: an average F1 of 70.96 on Twitter2014 and an average F1 of 56.50 on Twitter2014Sarcasm.
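The abstract does not specify the weighting scheme, so the following is only a minimal sketch of one common way to bias a classifier's output towards under-represented classes: each class score is multiplied by a per-class weight before the highest-scoring label is chosen. The function name `biased_predict`, the label names, and all numeric values are hypothetical and are not taken from the TeamX system.

```python
from typing import Dict


def biased_predict(scores: Dict[str, float], weights: Dict[str, float]) -> str:
    """Return the label with the highest weighted score.

    scores  -- per-class scores from any classifier (assumed non-negative)
    weights -- per-class multipliers; labels missing from this dict get 1.0
    """
    return max(scores, key=lambda label: scores[label] * weights.get(label, 1.0))


# Hypothetical classifier output for one tweet.
scores = {"positive": 0.40, "negative": 0.35, "neutral": 0.25}

# Hypothetical weights that boost the rarer "negative" class.
weights = {"positive": 1.0, "negative": 1.5, "neutral": 1.0}

unweighted_label = biased_predict(scores, {})      # plain argmax
weighted_label = biased_predict(scores, weights)   # argmax after re-weighting
```

With these illustrative numbers, the plain argmax picks "positive", while the re-weighted decision picks "negative" because 0.35 × 1.5 exceeds 0.40. In practice such weights would be tuned on development data.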
Profile inference of SNS users is valuable for marketing, targeted advertising, and opinion polls. Several studies of profile inference have been reported to date. Although SNS data include information of various types, most such studies use only text information. Incorporating information of other types into text classifiers is expected to provide more accurate profile inference. In this paper, we propose a method that combines text processing and image processing to improve gender inference accuracy. Applying a simple formula to combine the two results derived from a text processor and an image processor yielded a significant increase in accuracy.
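The abstract does not state the combining formula. As an illustration only, the sketch below assumes a linear interpolation of two scores, each taken to be the estimated probability of the same class from one processor. The names `combine_scores` and `infer_gender`, the parameter `alpha`, the threshold, and the label strings are all hypothetical.

```python
def combine_scores(text_score: float, image_score: float, alpha: float = 0.5) -> float:
    """Linearly interpolate two scores in [0, 1].

    alpha is the weight given to the text processor; (1 - alpha) goes to
    the image processor. This formula is an assumption, not the paper's.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * text_score + (1.0 - alpha) * image_score


def infer_gender(text_score: float, image_score: float,
                 alpha: float = 0.5, threshold: float = 0.5) -> str:
    """Map the combined score to a label (hypothetical label names)."""
    combined = combine_scores(text_score, image_score, alpha)
    return "female" if combined >= threshold else "male"


# Hypothetical scores: the text processor leans one way, the image
# processor leans the other; the combined score decides.
label_combined = infer_gender(text_score=0.4, image_score=0.8)
label_text_only = infer_gender(text_score=0.4, image_score=0.8, alpha=1.0)
```

Setting `alpha` to 1.0 reduces the sketch to a text-only classifier, which makes it easy to compare the combined decision against the text-only baseline on the same inputs.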