Thanks in part to the latest developments in deep neural network architectures and contextual word embeddings (e.g., ELMo (Peters et al., 2018) and BERT (Devlin et al., 2019)), the performance of models for single-antecedent anaphora resolution has greatly improved (Wiseman et al., 2016; Clark and Manning, 2016b; Lee et al., 2017; Kantor and Globerson, 2019; Joshi et al., 2020). More recently, attention has turned to more complex cases of anaphora, such as anaphora requiring some form of commonsense knowledge, as in the Winograd Schema Challenge (Rahman and Ng, 2012; Peng et al., 2015; Liu et al., 2017; Sakaguchi et al., 2020); pronominal anaphors that cannot be resolved purely using gender (Webster et al., 2018); bridging reference (Hou, 2020; Yu and Poesio, 2020); discourse deixis (Kolhatkar and Hirst, 2014; Marasović et al., 2017); and, finally, split-antecedent anaphora (Zhou and Choi, 2018; Yu et al., 2020a): plural anaphoric reference in which the antecedents are not part of a single noun phrase.