Argument mining of online interactions is in its infancy. One reason is the lack of annotated corpora in this genre. To make progress, we need to develop a principled and scalable way of determining which portions of texts are argumentative and what the nature of the argumentation is. We propose a two-tiered approach to achieve this goal and report on several initial studies to assess its potential.
Identifying the occurrences of proper names in text and the entities they refer to can be a difficult task because of the many-to-many mapping between names and their referents. We analyze the types of ambiguity (structural and semantic) that make the discovery of proper names in text difficult, and describe the heuristics used to disambiguate names in Nominator, a fully implemented module for proper name recognition developed at the IBM T.J. Watson Research Center.
With the rapid development of social media, spontaneous user-generated content such as tweets and forum posts has become important material for tracking people's opinions and sentiments online. A major hurdle for current state-of-the-art automatic methods for sentiment analysis is that human communication often involves sarcasm or irony, where the author means the opposite of what she/he says. Sarcasm transforms the polarity of an apparently positive or negative utterance into its opposite. The lack of naturally occurring utterances labeled for sarcasm is one of the key problems for the development of machine-learning methods for sarcasm detection. We report on a method for constructing a corpus of sarcastic Twitter messages in which the sarcasm of each message has been determined by its author. We use this reliable corpus to compare sarcastic utterances in Twitter to utterances that express positive or negative attitudes without sarcasm. We investigate the impact of lexical and pragmatic factors on machine-learning effectiveness for identifying sarcastic utterances, and we compare the performance of machine-learning techniques and human judges on this task.