“…Reference resolution (RR), which is the task of resolving referring expressions (REs) to what they are intended to refer to, has been well-studied in various fields such as psychology (Isaacs and Clark, 1987; Tanenhaus and Spivey-Knowlton, 1995), linguistics (Pineda and Garza, 2000), as well as human/human (Iida et al., 2010) and human/machine interaction (Prasov and Chai, 2010; Siebert and Schlangen, 2008). In recent years, multi-modal corpora have emerged which provide RR with important contextual information: collecting dialogue between two humans (Spanger et al., 2012), between a human and a (simulated) dialogue system (Liu et al., 2013), with gaze, information about the shared environment, and in some cases deixis.…”