Existing question answering (QA) datasets fail to train QA systems to perform complex reasoning and provide explanations for answers. We introduce HOTPOTQA, a new dataset with 113k Wikipedia-based question-answer pairs with four key features: (1) the questions require finding and reasoning over multiple supporting documents to answer; (2) the questions are diverse and not constrained to any pre-existing knowledge bases or knowledge schemas; (3) we provide the sentence-level supporting facts required for reasoning, allowing QA systems to reason with strong supervision and to explain their predictions; (4) we offer a new type of factoid comparison question to test QA systems' ability to extract relevant facts and perform the necessary comparisons. We show that HOTPOTQA is challenging for the latest QA systems, and that the supporting facts enable models to improve performance and make explainable predictions.
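To make feature (3) concrete, the sketch below shows the shape of one such example as a Python dictionary. The field names follow the released HOTPOTQA JSON format, but the particular sentences and indices here are illustrative rather than quoted from the dataset.

```python
# Illustrative shape of a single HotpotQA-style comparison example.
# Field names follow the released JSON format; the content is paraphrased.
example = {
    "question": "Which magazine was started first, Arthur's Magazine or First for Women?",
    "answer": "Arthur's Magazine",
    "type": "comparison",  # question type: "comparison" or "bridge"
    # Context: (title, sentences) pairs, only some of which are relevant.
    "context": [
        ["Arthur's Magazine",
         ["Arthur's Magazine (1844-1846) was an American literary periodical.",
          "It was published in Philadelphia."]],
        ["First for Women",
         ["First for Women is a woman's magazine published by Bauer Media Group.",
          "The magazine was started in 1989."]],
    ],
    # Sentence-level supporting facts: (paragraph title, sentence index) pairs
    # that together justify the answer -- the strong-supervision signal.
    "supporting_facts": [
        ["Arthur's Magazine", 0],
        ["First for Women", 1],
    ],
}
```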
There are many applications in which it is desirable to order rather than classify instances. Here we consider the problem of learning how to order instances given feedback in the form of preference judgments, i.e., statements to the effect that one instance should be ranked ahead of another. We outline a two-stage approach in which one first learns, by conventional means, a binary preference function indicating whether it is advisable to rank one instance before another. For this first stage, we consider an on-line algorithm for learning preference functions that is based on Freund and Schapire's 'Hedge' algorithm. In the second stage, new instances are ordered so as to maximize agreement with the learned preference function. We show that the problem of finding the ordering that agrees best with a learned preference function is NP-complete. Nevertheless, we describe simple greedy algorithms that are guaranteed to find a good approximation. Finally, we show how metasearch can be formulated as an ordering problem, and present experimental results on learning a combination of 'search experts', each of which is a domain-specific query expansion strategy for a web search engine.
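As a concrete illustration of the second stage, here is a minimal sketch of a greedy ordering algorithm in the spirit the abstract describes: repeatedly emit the item whose net preference over the remaining items is largest. The function names and the [0, 1] convention for preference values are our own; the formal approximation guarantee is the one established in the paper, not re-derived here.

```python
from typing import Callable, Hashable, List, Sequence

def greedy_order(items: Sequence[Hashable],
                 pref: Callable[[Hashable, Hashable], float]) -> List[Hashable]:
    """Greedily build a total order that approximately maximizes agreement
    with a learned preference function.

    pref(u, v) is assumed to return a value in [0, 1] indicating how
    strongly u should be ranked ahead of v.
    """
    remaining = set(items)
    # potential[v] = sum over remaining u of pref(v, u) - pref(u, v)
    potential = {
        v: sum(pref(v, u) - pref(u, v) for u in remaining if u != v)
        for v in remaining
    }
    ordering: List[Hashable] = []
    while remaining:
        best = max(remaining, key=potential.__getitem__)
        ordering.append(best)
        remaining.remove(best)
        # Removing `best` changes every remaining item's potential.
        for v in remaining:
            potential[v] += pref(best, v) - pref(v, best)
    return ordering

if __name__ == "__main__":
    # Toy preference: rank shorter strings first.
    items = ["bbb", "a", "cc"]
    pref = lambda u, v: 1.0 if len(u) < len(v) else 0.0
    print(greedy_order(items, pref))  # -> ['a', 'cc', 'bbb']
```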
Scientific literature with rich metadata can be represented as a labeled directed graph. This graph representation enables a number of scientific tasks, such as ad hoc retrieval or named entity recognition (NER), to be formulated as typed proximity queries in the graph. One popular proximity measure is called Random Walk with Restart (RWR), and much work has been done on the supervised learning of RWR measures by associating each edge label with a parameter. In this paper, we describe a novel learnable proximity measure that instead uses one weight per edge label sequence: proximity is defined by a weighted combination of simple "path experts", each corresponding to following a particular sequence of labeled edges. Experiments on eight tasks in two subdomains of biology show that the new learning method significantly outperforms the RWR model (both trained and untrained). We also extend the method to support two additional types of experts to model intrinsic properties of entities: query-independent experts, which generalize the PageRank measure, and popular entity experts, which allow rankings to be adjusted for particular entities that are especially important.
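To make the notion of "path experts" concrete, below is a minimal sketch of how a proximity score of this general form might be computed on a toy labeled graph. The graph, the expert paths, and the weights are all invented for illustration, and the supervised learning of the weights (the core contribution of the paper) is omitted.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

# A labeled directed graph: node -> edge label -> list of neighbor nodes.
Graph = Dict[str, Dict[str, List[str]]]

def path_expert(graph: Graph, source: str,
                labels: Sequence[str]) -> Dict[str, float]:
    """Distribution over nodes reached from `source` by a random walk that
    follows the given sequence of edge labels (one "path expert")."""
    dist = {source: 1.0}
    for label in labels:
        nxt: Dict[str, float] = defaultdict(float)
        for node, p in dist.items():
            neighbors = graph.get(node, {}).get(label, [])
            if neighbors:
                share = p / len(neighbors)  # uniform step over matching edges
                for m in neighbors:
                    nxt[m] += share
        dist = dict(nxt)
    return dist

def proximity(graph: Graph, source: str,
              experts: List[Tuple[Sequence[str], float]]) -> Dict[str, float]:
    """Weighted combination of path experts: score(t) = sum_P w_P * P(source -> t).
    In the paper the weights are learned; here they are simply given."""
    scores: Dict[str, float] = defaultdict(float)
    for labels, weight in experts:
        for node, p in path_expert(graph, source, labels).items():
            scores[node] += weight * p
    return dict(scores)

if __name__ == "__main__":
    g: Graph = {
        "geneA": {"binds": ["protA"], "expressed_in": ["liver"]},
        "protA": {"interacts": ["protB"]},
    }
    print(proximity(g, "geneA", [(("binds", "interacts"), 0.7),
                                 (("expressed_in",), 0.3)]))
    # -> {'protB': 0.7, 'liver': 0.3}
```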
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations: citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.