Sarcasm production and comprehension have traditionally been described in terms of pragmatic factors. Lexical cues have received less attention, but they may be important indicators of sarcasm. A major obstacle to examining such features is determining sarcastic intent. One solution is to analyze statements explicitly marked as sarcastic. This study examined Twitter postings marked with #sarcasm as well as dialog from Google Books containing the phrase "said sarcastically." We used word counting and part-of-speech tagging to compare specific lexical features of the explicitly marked sarcastic statements to statements by the same author that were not marked as sarcastic. Our results broadly support the Lexical Cues Hypothesis: certain word-level cues, such as interjections and positive affect terms, are stereotypic of sarcasm. A model incorporating these features performed comparably to human raters in making sarcastic versus nonsarcastic judgments. This finding shows promise for future work toward automatically identifying sarcasm in text.
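The word-counting comparison described above might look roughly like the following minimal sketch. It is not the study's implementation: the word lists, the example texts, and the names `tokenize` and `feature_rates` are all hypothetical and chosen only for illustration, and the part-of-speech tagging step is omitted.

```python
import re
from collections import Counter

# Hypothetical, illustrative word lists; the study's actual lexicons are not given here.
INTERJECTIONS = {"oh", "wow", "yay", "gee", "ah", "ugh"}
POSITIVE_AFFECT = {"love", "great", "awesome", "best", "wonderful", "thrilled"}


def tokenize(text):
    """Lowercase the text, drop hashtags such as #sarcasm, and return word tokens."""
    text = re.sub(r"#\w+", " ", text.lower())
    return re.findall(r"[a-z']+", text)


def feature_rates(texts):
    """Return each feature's share of all tokens across the given texts."""
    counts = Counter()
    total = 0
    for text in texts:
        for tok in tokenize(text):
            total += 1
            if tok in INTERJECTIONS:
                counts["interjection"] += 1
            if tok in POSITIVE_AFFECT:
                counts["positive_affect"] += 1
    if total == 0:
        return {"interjection": 0.0, "positive_affect": 0.0}
    return {name: counts[name] / total for name in ("interjection", "positive_affect")}


# Invented example texts standing in for marked and unmarked posts by one author.
marked = ["Oh great, another Monday #sarcasm", "Wow, I love waiting in line #sarcasm"]
unmarked = ["Heading to the office now", "Waiting in line at the bank"]

marked_rates = feature_rates(marked)
unmarked_rates = feature_rates(unmarked)
print("marked:", marked_rates)
print("unmarked:", unmarked_rates)
```

Normalizing by token count rather than comparing raw counts keeps the two sets comparable when the marked and unmarked statements differ in length. The hashtag is stripped before counting so that the explicit marker itself does not register as a lexical cue.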