High-quality annotated named entity corpora are essential language resources, but fully manual annotation is time-consuming. Interactive annotation offers an efficient alternative in which humans and machines collaborate. Mentions of a named entity tend to share the same label when they co-occur in the same document and have similar surface forms. Once an instance in one sentence has been selected and manually annotated, its label can therefore be propagated to instances in other sentences. This kind of document-level label propagation can reduce human effort and improve annotation quality in interactive annotation. However, most existing work assumes that instances in different sentences are independent and ignores document-level label propagation. This paper proposes a reinforcement learning-based approach that learns to propagate labels among the instances within a document for interactive named entity annotation. Our approach also learns which instances to select for manual annotation. We train a deep Q-network to optimize an objective that trades off human effort against annotation quality. To reach the same annotation quality (an F1 of 0.95, averaged over three datasets), our approach requires more than 42% less human effort than baseline approaches.
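To make the idea of document-level label propagation concrete, the following is a minimal sketch under stated assumptions, not the learned policy described above. It replaces the deep Q-network with a fixed rule: a manually assigned label is copied to every unlabeled mention in the same document whose normalized surface form matches exactly. The function names `normalize` and `propagate_label` and the mention dictionary layout (`doc_id`, `surface`, `label`) are hypothetical and chosen only for illustration.

```python
from typing import Dict, List


def normalize(surface: str) -> str:
    """Lowercase and collapse whitespace so near-identical surface forms match."""
    return " ".join(surface.lower().split())


def propagate_label(mentions: List[Dict], selected_index: int, label: str) -> List[int]:
    """Assign `label` to the selected mention and copy it to unlabeled mentions
    in the same document with the same normalized surface form.

    Assumption: exact matching after normalization stands in for the learned
    propagation decision; the paper's approach learns this instead.
    Returns the indices of all mentions whose label was set.
    """
    selected = mentions[selected_index]
    key = (selected["doc_id"], normalize(selected["surface"]))
    updated = []
    for i, mention in enumerate(mentions):
        same_entity = (mention["doc_id"], normalize(mention["surface"])) == key
        # Never overwrite a label that some other mention already carries.
        if same_entity and (i == selected_index or mention["label"] is None):
            mention["label"] = label
            updated.append(i)
    return updated
```

In this sketch, one manual annotation can label several mentions at once, which is the source of the reduction in human effort. A fixed rule like this one cannot tell when propagation would be wrong, for example when the same surface form refers to different entity types within one document, which is the kind of decision a learned policy is meant to handle.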