As part of the ongoing Project Halo, our goal is to build a system capable of answering questions posed by novice users to a formal knowledge base. In our current context, the knowledge base covers selected topics in physics, chemistry, and biology, and our question set consists of AP (advanced high-school) level examination questions. The task is challenging because the questions are linguistically complex and often incomplete (they assume unstated knowledge), and because the users have no prior knowledge of the system's contents. Our solution has two parts: a controlled language interface, in which users reformulate the original natural language questions in a simplified version of English, and a novel problem solver that can elaborate initially inadequate logical interpretations of a question by selecting relevant pieces of knowledge from the knowledge base. An evaluation of the work in 2006 showed that this approach is feasible and that complex, multisentence questions can be posed and answered. It thus illustrates novel ways of dealing with the knowledge capture impedance between users and a formal knowledge base, while also revealing challenges that remain.
Semantic relationships among words and phrases are often marked by explicit syntactic or lexical clues that help recognize such relationships in texts. Within complex nominals, however, few overt clues are available. Systems that analyze such nominals must compensate for the lack of surface clues with other information. One way is to load the system with lexical semantics for nouns or adjectives. This merely shifts the problem elsewhere: how do we define the lexical semantics and build large semantic lexicons? Another way is to find constructions similar to a given complex nominal, for which the relationships are already known. This is the way we chose, but it too has drawbacks. Similarity is not easily assessed, similar analyzed constructions may not exist, and if they do exist, their analysis may not be appropriate for the current nominal. We present a semi-automatic system that identifies semantic relationships in noun phrases without using precoded noun or adjective semantics. Instead, partial matching on previously analyzed noun phrases leads to a tentative interpretation of a new input. Processing can start without prior analyses, but the early stage requires user interaction. As more noun phrases are analyzed, the system learns to find better interpretations and reduces its reliance on the user. In experiments on English technical texts the system correctly identified 60-70% of relationships automatically.
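The partial-matching loop described above can be illustrated with a minimal sketch. It is not the authors' implementation: the class name `NounPhraseAnalyzer`, the vote weights, the modifier/head representation, and the relation label `material` are all hypothetical, and the sketch simplifies user interaction to "ask only when no stored analysis matches". It assumes each noun phrase has already been reduced to a (modifier, head) pair.

```python
from collections import Counter


class NounPhraseAnalyzer:
    """Toy analyzer that remembers (modifier, head, relation) triples.

    Hypothetical illustration only; weights and labels are assumptions.
    """

    def __init__(self):
        # Previously analyzed noun phrases; starts empty (no precoded semantics).
        self.memory = []

    def suggest(self, modifier, head):
        """Return a tentative relation from stored analyses, or None."""
        votes = Counter()
        for mod, hd, rel in self.memory:
            if mod == modifier and hd == head:
                votes[rel] += 3  # exact match on both words (assumed weight)
            elif mod == modifier or hd == head:
                votes[rel] += 1  # partial match on one word (assumed weight)
        if not votes:
            return None
        return votes.most_common(1)[0][0]

    def analyze(self, modifier, head, ask_user):
        """Interpret a noun phrase, asking the user only if nothing matches."""
        relation = self.suggest(modifier, head)
        if relation is None:
            # Early stage: no similar analysis is stored, so the user decides.
            relation = ask_user(modifier, head)
        # Store the result so later phrases can match against it.
        self.memory.append((modifier, head, relation))
        return relation


if __name__ == "__main__":
    analyzer = NounPhraseAnalyzer()
    # First phrase: memory is empty, so the (simulated) user supplies the label.
    print(analyzer.analyze("steel", "pipe", lambda m, h: "material"))
    # Second phrase shares the modifier, so the stored analysis is reused.
    print(analyzer.analyze("steel", "beam", lambda m, h: "unknown"))
```

In this sketch the second call never reaches the user callback, which mirrors the reduced reliance on the user as more noun phrases are analyzed. A real system would need a richer similarity measure and a way for the user to reject a tentative interpretation.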