We present a new kind of question answering dataset, OpenBookQA, modeled after open book exams for assessing human understanding of a subject. The open book that comes with our questions is a set of 1326 elementary level science facts. Roughly 6000 questions probe an understanding of these facts and their application to novel situations. This requires combining an open book fact (e.g., metals conduct electricity) with broad common knowledge (e.g., a suit of armor is made of metal) obtained from other sources. While existing QA datasets over documents or knowledge bases, being generally self-contained, focus on linguistic understanding, OpenBookQA probes a deeper understanding of both the topic-in the context of common knowledge-and the language it is expressed in. Human performance on OpenBookQA is close to 92%, but many state-of-the-art pre-trained QA methods perform surprisingly poorly, worse than several simple neural baselines we develop. Our oracle experiments designed to circumvent the knowledge retrieval bottleneck demonstrate the value of both the open book and additional facts. We leave it as a challenge to solve the retrieval problem in this multi-hop setting and to close the large gap to human performance. 1 The dataset and the code for the models are available at
Large language models, which are often trained for hundreds of thousands of compute days, have shown remarkable capabilities for zero-and few-shot learning. Given their computational cost, these models are difficult to replicate without significant capital. For the few that are available through APIs, no access is granted to the full model weights, making them difficult to study. We present Open Pre-trained Transformers (OPT), a suite of decoder-only pre-trained transformers ranging from 125M to 175B parameters, which we aim to fully and responsibly share with interested researchers. We show that OPT-175B is comparable to GPT-3, 1 while requiring only 1/7th the carbon footprint to develop. We are also releasing our logbook detailing the infrastructure challenges we faced, along with code for experimenting with all of the released models. * Equal contribution. † Work done while at Meta AI. 1 Following Brown et al. (2020), we use GPT-3 to refer to both the 175B model and the smaller scale models as well.2 Exceptions include work by EleutherAI, who released dense models up to 20B in size (Black et al., 2022), Salesforce (Nijkamp et al., 2022), and Meta AI, who released dense models up to 13B and sparse models up to 1. 1T (Artetxe et al., 2021). There is also ongoing work from the BigScience workshop (https://bigscience. huggingface.co/), which aims to open source very large multilingual language models and datasets.
We introduce a neural reading comprehension model that integrates external commonsense knowledge, encoded as a keyvalue memory, in a cloze-style setting. Instead of relying only on document-toquestion interaction or discrete features as in prior work, our model attends to relevant external knowledge and combines this knowledge with the context representation before inferring the answer. This allows the model to attract and imply knowledge from an external knowledge source that is not explicitly stated in the text, but that is relevant for inferring the answer. Our model improves results over a very strong baseline on a hard Common Nouns dataset, making it a strong competitor of much more complex models. By including knowledge explicitly, our model can also provide evidence about the background knowledge used in the RC process.
The emergence of user forums in electronic news media has given rise to the proliferation of opinion manipulation trolls. Finding such trolls automatically is a hard task, as there is no easy way to recognize or even to define what they are; this also makes it hard to get training and testing data. We solve this issue pragmatically: we assume that a user who is called a troll by several people is likely to be one. We experiment with different variations of this definition, and in each case we show that we can train a classifier to distinguish a likely troll from a non-troll with very high accuracy, 82-95%, thanks to our rich feature set.
There are different definitions of what a troll is. Certainly, a troll can be somebody who teases people to make them angry, or somebody who offends people, or somebody who wants to dominate any single discussion, or somebody who tries to manipulate people's opinion (sometimes for money), etc. The last definition is the one that dominates the public discourse in Bulgaria and Eastern Europe, and this is our focus in this paper.In our work, we examine two types of opinion manipulation trolls: paid trolls that have been revealed from leaked "reputation management contracts" and "mentioned trolls" that have been called such by several different people. We show that these definitions are sensible: we build two classifiers that can distinguish a post by such a paid troll from one by a non-troll with 81-82% accuracy; the same classifier achieves 81-82% accuracy on so called mentioned troll vs. non-troll posts. 1 The independent Bulgarian media Bivol published a leaked contract described the following services in favor of the government:"Monthly posting online of 250 comments by virtual users with varied, typical and evolving profiles from different (non-recurring) IP addresses to inform, promote, balance or counteract. The intensity of the provided online presence will be adequately distributed and will correspond to the political situation in the country." See
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.