“…One of the most commonly studied tasks in NLP is extracting semantic information from unstructured text in the form of entities: detecting entity mentions within a single document, in particular where each mention is located (its span) and its corresponding semantic type, such as person (PER), location (LOC), or organization (ORG). Entity recognition has long been studied and applied to higher-level tasks such as question answering (Abney et al., 2000), coreference resolution (Fragkou, 2017), relation extraction (Mintz et al., 2009; Miwa and Bansal, 2016; Liu et al., 2017), entity linking (Gupta et al., 2017; Guo and Barbosa, 2014) and event extraction (Feng et al., 2016). Most existing work in Named Entity Recognition and Classification either focuses on flat mentions, usually corresponding to the longest outer mention (Ling and Weld, 2012; Marcinczuk, 2015; Leaman and Lu, 2016), or uses nested mentions, which can capture overlapping mentions at different nesting levels (Finkel and Manning, 2009; Lu and Roth, 2015; Wang et al., 2018; Ju et al., 2018).…”