Emojis are ideograms that are naturally combined with plain text to visually complement or condense the meaning of a message. Despite being widely used in social media, their underlying semantics have received little attention from a Natural Language Processing standpoint. In this paper, we investigate the relation between words and emojis, studying the novel task of predicting which emojis are evoked by text-based tweet messages. We train several models based on Long Short-Term Memory networks (LSTMs) for this task. Our experimental results show that our neural model outperforms two baselines as well as humans solving the same task, suggesting that computational models are able to better capture the underlying semantics of emojis.
Automatic detection of figurative language is a challenging task in computational linguistics. Recognising both literal and figurative meaning is not trivial for a machine, and in some cases it is hard even for humans. For this reason, novel and accurate systems able to recognise figurative language are necessary. In this paper, we present a novel computational model capable of detecting sarcasm on the social network Twitter (a popular microblogging service that allows users to post short messages). Our model is easy to implement and, unlike previous systems, it does not include patterns of words as features. Our seven sets of lexical features aim to detect sarcasm by its inner structure (for example, unexpectedness, intensity of the terms, or imbalance between registers), abstracting from the use of specific terms.
Around 10% of the population has dyslexia, a reading disability that negatively affects a person's ability to read and comprehend texts. Previous work has studied how to optimize the text layout, but adapting the text content has not received much attention. In this paper, we present an eye-tracking study that investigates whether people with dyslexia would benefit from content simplification. In an experiment with 46 people, 23 with dyslexia and 23 as a control group, we compare texts where words were substituted by shorter/longer and more/less frequent synonyms. Using more frequent words caused the participants with dyslexia to read significantly faster, while the use of shorter words caused them to understand the text better. Amongst the control group, no significant effects were found. These results provide evidence that people with dyslexia may benefit from interactive tools that perform lexical simplification.
We present a large-scale meta-evaluation of eight evaluation measures for both single-document and multi-document summarizers. To this end, we built a corpus consisting of (a) 100 million automatic summaries using six summarizers and baselines at ten summary lengths in both English and Chinese, (b) more than 10,000 manual abstracts and extracts, and (c) 200 million automatic document and summary retrievals using 20 queries. We present both qualitative and quantitative results showing the strengths and drawbacks of all evaluation methods and how they rank the different summarizers.