Language modelling provides a step towards intelligent communication systems by harnessing large repositories of written human knowledge to better predict and understand the world. In this paper, we present an analysis of Transformer-based language model performance across a wide range of model scales, from models with tens of millions of parameters up to a 280-billion-parameter model called Gopher. These models are evaluated on 152 diverse tasks, achieving state-of-the-art performance across the majority. Gains from scale are largest in areas such as reading comprehension, fact-checking, and the identification of toxic language, whereas logical and mathematical reasoning see less benefit. We provide a holistic analysis of the training dataset and the model's behaviour, covering the intersection of model scale with bias and toxicity. Finally, we discuss the application of language models to AI safety and the mitigation of downstream harms.
Large language models (LMs) generate remarkably fluent text and can be efficiently adapted across NLP tasks. Measuring and guaranteeing the quality of generated text in terms of safety is imperative for deploying LMs in the real world; to this end, prior work often relies on automatic evaluation of LM toxicity. We critically discuss this approach, evaluate several toxicity mitigation strategies with respect to both automatic and human evaluation, and analyze consequences of toxicity mitigation in terms of model bias and LM quality. We demonstrate that while basic intervention strategies can effectively optimize previously established automatic metrics on the RealToxicityPrompts dataset, this comes at the cost of reduced LM coverage for both texts about, and dialects of, marginalized groups. Additionally, we find that human raters often disagree with high automatic toxicity scores after strong toxicity reduction interventions, highlighting further the nuances involved in careful evaluation of LM toxicity. * Denotes equal contribution.
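For context on what those automatic metrics measure: work building on RealToxicityPrompts typically samples several continuations per prompt, scores each continuation with an external toxicity classifier (commonly the Perspective API), and reports expected maximum toxicity together with the empirical probability of producing at least one toxic continuation. The sketch below shows only this aggregation step over already-computed scores; the function names and sample values are illustrative and not taken from the paper.

```python
import numpy as np

def expected_max_toxicity(scores_per_prompt):
    """Mean over prompts of the maximum toxicity among that prompt's sampled continuations."""
    return float(np.mean([max(scores) for scores in scores_per_prompt]))

def toxicity_probability(scores_per_prompt, threshold=0.5):
    """Fraction of prompts with at least one continuation scored at or above the threshold."""
    return float(np.mean([max(scores) >= threshold for scores in scores_per_prompt]))

# Hypothetical input: 3 prompts, 4 sampled continuations each, classifier scores in [0, 1].
scores = [
    [0.10, 0.20, 0.05, 0.15],
    [0.60, 0.30, 0.40, 0.55],
    [0.45, 0.25, 0.35, 0.30],
]
print(expected_max_toxicity(scores))   # (0.20 + 0.60 + 0.45) / 3 ≈ 0.42
print(toxicity_probability(scores))    # 1 of 3 prompts reaches 0.5 → ≈ 0.33
```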
This tutorial describes TuLiP, the Temporal Logic Planning toolbox, a collection of tools for designing controllers for hybrid systems from specifications in temporal logic. The tools support a workflow that starts from a description of the desired behavior and of the system to be controlled. The system can have discrete state, or be a hybrid dynamical system with a mixed discrete and continuous state space. The desired behavior can be represented with temporal logic and discrete transition systems. The system description can include uncontrollable variables that take discrete or continuous values and represent disturbances and other environmental factors that affect the dynamics, as well as communication signals that affect controller decisions.

A control design problem is solved in phases that involve abstraction, discrete synthesis, and continuous feedback control. Abstraction yields a discrete description of the system dynamics in logic. For piecewise affine dynamical systems, this abstraction is constructed automatically, guided by the geometry of the dynamics and under logical constraints from the specification. The resulting logic formulae describe admissible discrete behaviors that capture both controlled and environment variables. The discrete description resulting from abstraction is then conjoined with the desired logic specification. To find a controller, the toolbox solves a game of infinite duration. Existence of a discrete (winning) strategy for the controlled variables in this game is a proof certificate for the existence of a controller for the original problem, which guarantees satisfaction of the specification. This discrete strategy, concretized by using continuous controllers, yields a feedback controller for the original hybrid system. The toolbox frontend is written in Python, with backends in C, Python, and Cython.

The tutorial starts with an overview of the theory behind TuLiP and of its software architecture, organized into specification frontends and backends that implement algorithms for abstraction, game solving, and interfaces to other tools. Then, the main elements for writing a specification as input to TuLiP are introduced. These include logic formulae, discrete transition systems annotated with predicates, and hybrid dynamical systems with linear or piecewise affine continuous dynamics. The working principles of the algorithms for predicate abstraction and discrete game solving using nested fixpoints are explained by following the input specification through the various transformations that compile it to a symbolic representation that scales well to solving large games. The tutorial concludes with several design examples that demonstrate the toolbox's capabilities.
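As a concrete illustration of the workflow described above, the sketch below uses TuLiP's Python frontend to state a small, purely discrete GR(1) specification (a request/grant protocol) and asks the toolbox to synthesize a winning strategy. It follows the pattern of TuLiP's documented GR(1) examples rather than any example from this tutorial; variable names are illustrative, and details such as solver selection and initial-condition semantics vary across TuLiP versions.

```python
from tulip import spec, synth

# Uncontrolled (environment) variable: a Boolean request signal.
env_vars = {'request'}
env_init = set()
env_safe = set()
env_prog = {'request'}            # assumption: requests arrive infinitely often

# Controlled (system) variable: a Boolean grant signal.
sys_vars = {'grant'}
sys_init = set()
sys_safe = {'request -> X (grant)'}   # every request is granted at the next step
sys_prog = {'grant'}                  # guarantee: grants are issued infinitely often

# Assemble the GR(1) specification: environment assumptions on one side,
# controller guarantees on the other.
specs = spec.GRSpec(env_vars=env_vars, sys_vars=sys_vars,
                    env_init=env_init, sys_init=sys_init,
                    env_safety=env_safe, sys_safety=sys_safe,
                    env_prog=env_prog, sys_prog=sys_prog)
specs.qinit = r'\E \A'  # initial-condition interpretation used in TuLiP's documented examples (version-dependent)

# Solve the infinite-duration game; a returned Mealy machine is both the proof
# certificate that a controller exists and the controller itself.
ctrl = synth.synthesize(specs)
if ctrl is None:
    print('Specification is unrealizable.')
else:
    print(ctrl)
```

For a hybrid system, the same kind of specification would additionally be paired with a transition-system abstraction of the continuous dynamics, produced by the toolbox's abstraction routines and passed to the synthesis step alongside the discrete specification.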
There has been considerable progress towards conversational models that generate coherent and fluent responses; however, this often involves training large language models on large dialogue datasets, such as Reddit. These large conversational models provide little control over the generated responses, and this control is further limited in the absence of annotated conversational datasets for attribute-specific generation that can be used for fine-tuning the model. In this paper, we first propose and evaluate plug-and-play methods for controllable response generation, which do not require dialogue-specific datasets and do not rely on fine-tuning a large model. While effective, the decoding procedure induces considerable computational overhead, rendering the conversational model unsuitable for interactive usage. To overcome this, we introduce an approach that requires no further computation at decoding time and no fine-tuning of a large language model. We demonstrate, through extensive automatic and human evaluation, a high degree of control over the generated conversational responses with regard to multiple desired attributes, while remaining fluent. * Equal contribution. † Work done primarily at Caltech.