In this paper we present a novel, customizable IE paradigm that takes advantage of predicate-argument structures. We also introduce a new way of automatically identifying predicate-argument structures, which is central to our IE paradigm. It is based on: (1) an extended set of features; and (2) inductive decision tree learning. The experimental results support our claim that accurate predicate-argument structures enable high-quality IE results.
This paper presents two systems for textual entailment, both employing decision trees as a supervised learning algorithm. The first is based primarily on lexical overlap, using a bag-of-words similarity measure to map terms in the hypothesis to the source text. The second performs lexico-semantic matching between the text and the hypothesis: it attempts to align chunks in the hypothesis with chunks in the text, and represents the text and the hypothesis as two dependency graphs. The performance of the two systems is compared, and their strengths and weaknesses are analyzed.
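As a rough illustration of the lexical overlap idea behind the first system, the sketch below computes the fraction of hypothesis terms that also appear in the text. The abstract does not specify the actual measure, so the tokenizer, the use of distinct terms, and the names `tokenize` and `lexical_overlap` are all assumptions made for this example, not the authors' implementation.

```python
import re


def tokenize(sentence):
    """Lowercase and keep alphanumeric runs (an assumed, simplistic tokenizer)."""
    return re.findall(r"[a-z0-9]+", sentence.lower())


def lexical_overlap(text, hypothesis):
    """Fraction of distinct hypothesis terms that also occur in the text.

    This is one possible bag-of-words overlap score, chosen for illustration;
    the measure used in the paper may differ.
    """
    text_terms = set(tokenize(text))
    hyp_terms = set(tokenize(hypothesis))
    if not hyp_terms:
        return 0.0  # an empty hypothesis has nothing to map
    matched = hyp_terms & text_terms
    return len(matched) / len(hyp_terms)


text = "The cat sat on the mat near the door."
hypothesis = "A cat sat on a mat."
score = lexical_overlap(text, hypothesis)
print(score)
```

A score like this could serve as one input feature to a decision tree classifier that labels each text-hypothesis pair as entailment or non-entailment; which features the authors actually use is not stated in the abstract.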
Textual Question Answering (QA) identifies the answer to a question in large collections of on-line documents. By providing a small set of exact answers to questions, QA takes a step closer to information retrieval rather than document retrieval. A QA system comprises three modules: a question-processing module, a document-processing module, and an answer extraction and formulation module. Questions may be asked about any topic, in contrast with Information Extraction (IE), which identifies textual information relevant only to a predefined set of events and entities. The natural language processing (NLP) techniques used in open-domain QA systems may range from simple lexical and semantic disambiguation of question stems to complex processing that combines syntactic and semantic features of the questions with pragmatic information derived from the context of candidate answers. This article reviews current research in integrating knowledge-based NLP methods with shallow processing techniques for QA.