This paper presents the second release of ARRAU, a multigenre corpus of anaphoric information created over 10 years to provide data for the next generation of coreference/anaphora resolution systems combining different types of linguistic and world knowledge with advanced discourse modeling supporting rich linguistic annotations. The distinguishing features of ARRAU include the following: treating all NPs as markables, including non-referring NPs, and annotating their (non-)referentiality status; distinguishing between several categories of non-referentiality and annotating non-anaphoric mentions; thorough annotation of markable boundaries (minimal/maximal spans, discontinuous markables); annotating a variety of mention attributes, ranging from morphosyntactic parameters to semantic category; annotating the genericity status of mentions; annotating a wide range of anaphoric relations, including bridging relations and discourse deixis; and, finally, annotating anaphoric ambiguity. The current version of the dataset contains 350K tokens and is publicly available from LDC. In this paper, we discuss in detail all the distinguishing features of the corpus, so far only partially presented in a number of conference and workshop papers, and we also discuss the development between the first release of ARRAU in 2008 and this second one.
Emotions are signaled by complex arrays of face and body actions. The main point of contention in contemporary treatments is whether these arrays are discrete, holistic constellations reflecting emotion categories, or whether they are compositional, that is, composed of smaller components, each of which contributes some aspect of emotion to the complex whole. We address this question by investigating spontaneous face and body displays of athletes and place it in the wider context of human communicative signals and, in particular, of language. A defining property of human language is compositionality: the ability to combine and recombine a relatively small number of elements to create a vast number of complex meaningful expressions, and to interpret them. We ask whether this property of language can be discerned in a more ancient communicative system: intense emotional displays. In an experiment, participants interpreted a range of emotions and their strengths in pictures of athletes who had just won or lost a competition. By matching participants' judgements with minutely coded features of face and body, we find evidence for compositionality. The distribution of participants' responses indicates that most of the athletes' face and body features contribute to displays of dominance or submission. More particular emotional components related, for example, to positive valence (e.g., happy) or goal obstruction (e.g., frustrated) were also found to correlate significantly with certain face and body features. We propose that the combination of features linked to broader components (i.e., dominant or submissive) and to more particular emotions (e.g., happy or frustrated) reflects more complex emotional states. In sum, we find that the corporeal expression of intense, unfiltered emotion has compositional properties, potentially providing an ancient scaffolding upon which, millions of years later, the abstract and constrained compositional system of human language could build.
Irony has been studied by famous scholars across centuries, as well as more recently in cognitive and pragmatic research. Its prosodic and visual signals have also been studied. Irony is a communicative act in which the Sender's literal goal is to communicate a meaning x, but through this meaning the Sender has the goal of communicating another meaning, y, which contrasts with, and is sometimes even opposite to, meaning x. When y is the opposite of x, we have antiphrastic irony. An ironic act is thus an indirect speech act, in that its true meaning, the one really intended by the Sender, is not the one communicated by the literal meaning of the communicative act: it must be understood by the Addressee through inference. The ironic statement may concern an event, an object or a person; in the last case, the person may be the Addressee, a third person, or even the Sender (self-irony). In this paper we define irony in terms of a goal and belief view of communication, present an annotation scheme, the Anvil-Score, and illustrate aspects of its expressive power by applying it to a particular case: ironic communication in a judicial debate.