Recent work by Clark et al. (2020) shows that transformers can act as "soft theorem provers" by answering questions over explicitly provided knowledge in natural language. In our work, we take a step closer to emulating formal theorem provers by proposing PROVER, an interpretable transformer-based model that jointly answers binary questions over rule-bases and generates the corresponding proofs. Our model learns to predict nodes and edges corresponding to proof graphs in an efficient constrained training paradigm. During inference, a valid proof satisfying a set of global constraints is generated. We conduct experiments on synthetic, hand-authored, and human-paraphrased rule-bases to show promising results for QA and proof generation, with strong generalization performance. First, PROVER generates proofs with an accuracy of 87% while retaining or improving performance on the QA task compared to RuleTakers (up to 6% improvement on zero-shot evaluation). Second, when trained on questions requiring lower depths of reasoning, it generalizes significantly better to higher depths (up to 15% improvement). Third, PROVER obtains near-perfect QA accuracy of 98% using only 40% of the training data. However, generating proofs for questions requiring higher depths of reasoning remains challenging: accuracy drops to 65% at depth 5, indicating significant scope for future work.
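A minimal sketch of the kind of constrained proof decoding the abstract describes, assuming per-node and per-edge scores from a transformer head (the names, threshold, and filtering heuristic below are illustrative assumptions; the paper enforces global validity constraints with constrained inference, not this exact rule):

```python
# Illustrative sketch only: PROVER-style proof decoding from node/edge scores.
# `node_scores` / `edge_scores` stand in for outputs of a transformer head.

def decode_proof(node_scores, edge_scores, threshold=0.5):
    """Select proof nodes by score, then keep only edges whose endpoints
    were both selected (a simple global-consistency constraint)."""
    nodes = {n for n, s in node_scores.items() if s >= threshold}
    edges = {(u, v) for (u, v), s in edge_scores.items()
             if s >= threshold and u in nodes and v in nodes}
    return nodes, edges

# Toy rule-base: one fact and two rules.
node_scores = {"fact1": 0.9, "rule1": 0.8, "rule2": 0.2}
edge_scores = {("fact1", "rule1"): 0.85, ("rule1", "rule2"): 0.6}
print(decode_proof(node_scores, edge_scores))
# -> nodes {'fact1', 'rule1'} and the single edge ('fact1', 'rule1');
#    the edge into rule2 is dropped because rule2 was not selected.
```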
Natural language constitutes a predominant medium for much of human learning and pedagogy. We consider the problem of learning a concept from natural language explanations and a small number of labeled examples of the concept. For example, in learning the concept of a phishing email, one might say 'this is a phishing email because it asks for your bank account number'. Solving this problem involves both learning to interpret open-ended natural language statements and learning the concept itself. We present a joint model for (1) language interpretation (semantic parsing) and (2) concept learning (classification) that does not require labeling statements with logical forms. Instead, the model uses the discriminativeness of a statement's interpretation, in the context of observable features of the data, as a weak signal for parsing. On a dataset of email-related concepts, this approach yields across-the-board improvements in classification performance, with a 30% relative improvement in F1 score over competitive classification methods in the low-data regime.
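As a toy illustration of using discriminativeness as a weak parsing signal: among several candidate interpretations of a statement, prefer the one whose denoted feature best separates the few labeled examples (the candidate parses and scoring heuristic here are assumptions for illustration, not the paper's model):

```python
# Illustrative sketch: rank candidate logical forms by how well the boolean
# feature each one denotes agrees with the labeled examples.

def discriminativeness(feature_fn, examples):
    """Fraction of labeled examples on which the feature matches the label."""
    agree = sum(1 for x, y in examples if feature_fn(x) == y)
    return agree / len(examples)

# Two candidate interpretations of "asks for your bank account number":
parses = {
    "contains('bank')":    lambda email: "bank" in email,
    "contains('account')": lambda email: "account" in email,
}
examples = [("send your bank account number", True),
            ("lunch at noon?", False),
            ("your account statement is ready", False)]

best = max(parses, key=lambda p: discriminativeness(parses[p], examples))
print(best)  # -> contains('bank'): its denotation matches all three labels
```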
Software-defined networking emerges as a promising paradigm shift that decouples the control plane from the data plane, enabling centralized monitoring and control of the network through softwarization, i.e., a controller. A single controller is insufficient to handle large network traffic, making multiple controllers a necessity for current software-defined networking in wide-area networks. Placing multiple controllers optimally, i.e., controller placement, is a vibrant research problem. The controller placement problem (CPP) is twofold: determining the minimum number of controllers to place in a network, and the locations of those controllers. Numerous researchers over the last five years (2012 to November 2017) have proposed solutions for the CPP, which is an NP-hard problem. In general, solutions are based on objective functions and their optimization, considering various factors (such as propagation latency between switches and controllers, and between controllers) and constraints (such as the capacity of controllers and switches). To the best of our knowledge, this is the first state-of-the-art review of the CPP. This paper classifies the CPP, critically analyzes the existing solutions, and identifies their limitations and future scope, which will help potential researchers in this area develop new solutions for the CPP.
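For concreteness, one common CPP objective in the surveyed literature is minimizing worst-case switch-to-controller propagation latency for a fixed number of controllers k (a k-center formulation). A brute-force sketch, with a made-up latency matrix purely for illustration:

```python
# Illustrative sketch of a k-center CPP objective (not from the survey's code).
from itertools import combinations

def place_controllers(latency, k):
    """latency[i][j] = propagation delay between nodes i and j; controllers
    may sit at any node, and each switch attaches to its nearest controller.
    Returns the placement minimizing the worst-case attachment latency."""
    n = len(latency)
    best, best_cost = None, float("inf")
    for sites in combinations(range(n), k):
        cost = max(min(latency[s][c] for c in sites) for s in range(n))
        if cost < best_cost:
            best, best_cost = sites, cost
    return best, best_cost

latency = [[0, 2, 9, 4],
           [2, 0, 6, 3],
           [9, 6, 0, 5],
           [4, 3, 5, 0]]
print(place_controllers(latency, k=2))  # -> ((1, 2), 3)
```

Exhaustive search is exponential in k, which is why the surveyed work turns to heuristics and metaheuristics for realistic topologies.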
Humans can efficiently learn new concepts using language. We present a framework through which a set of explanations of a concept can be used to learn a classifier without access to any labeled examples. We use semantic parsing to map explanations to probabilistic assertions grounded in latent class labels and observed attributes of unlabeled data, and leverage the differential semantics of linguistic quantifiers (e.g., 'usually' vs 'always') to drive model training. Experiments on three domains show that the learned classifiers outperform previous approaches for learning with limited data, and are comparable with fully supervised classifiers trained from a small number of labeled examples.
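A hedged sketch of how differential quantifier semantics can drive training: each quantifier maps to a target probability, and the gap between the model's average prediction on matching unlabeled examples and that target becomes a penalty (the specific probability values and squared loss are illustrative assumptions, not the paper's exact model):

```python
# Illustrative sketch: quantifiers as soft constraints on unlabeled data.
# An explanation like "phishing emails usually ask for an account number"
# becomes: among unlabeled emails with that attribute, about 90% should
# be predicted phishing.

QUANTIFIER_PROB = {"always": 0.95, "usually": 0.90, "often": 0.70,
                   "sometimes": 0.50, "rarely": 0.10, "never": 0.05}

def constraint_loss(pred_probs, quantifier):
    """Penalize the gap between the model's average predicted probability
    over matching unlabeled examples and the quantifier's target."""
    target = QUANTIFIER_PROB[quantifier]
    avg = sum(pred_probs) / len(pred_probs)
    return (avg - target) ** 2

# Model currently predicts these phishing probabilities on matching emails:
print(constraint_loss([0.6, 0.8, 0.7], "usually"))  # ~0.04: push predictions up
```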
Recently, pre-trained language models (LMs) have achieved strong performance when fine-tuned on difficult benchmarks like SuperGLUE. However, performance can suffer when there are very few labeled examples available for fine-tuning. Pattern-Exploiting Training (PET) is a recent approach that leverages patterns for few-shot learning. However, PET uses task-specific unlabeled data. In this paper, we focus on few-shot learning without any unlabeled data and introduce ADAPET, which modifies PET's objective to provide denser supervision during fine-tuning. As a result, ADAPET outperforms PET on SuperGLUE without any task-specific unlabeled data. Our code can be found at https://github.com/rrmenon10/ADAPET.
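A simplified sketch of ADAPET's decoupled-label objective as described in the paper: where PET softmaxes over only the verbalizer tokens at the [MASK] position, ADAPET applies binary cross-entropy over the full vocabulary, pushing the gold verbalizer token up and all other tokens down (tensor names and sizes below are illustrative):

```python
# Illustrative sketch of a decoupled-label loss in the spirit of ADAPET.
import torch
import torch.nn.functional as F

def decoupled_label_loss(mask_logits, correct_token_id):
    """mask_logits: (vocab_size,) logits at the [MASK] position.
    BCE over every vocabulary token, with only the gold verbalizer positive,
    gives denser supervision than a softmax restricted to label tokens."""
    targets = torch.zeros_like(mask_logits)
    targets[correct_token_id] = 1.0
    return F.binary_cross_entropy_with_logits(mask_logits, targets)

vocab_size = 10  # toy vocabulary for illustration
logits = torch.randn(vocab_size)
print(decoupled_label_loss(logits, correct_token_id=3))
```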