2022
DOI: 10.48550/arxiv.2206.05862
Preprint

X-Risk Analysis for AI Research

Abstract: Artificial intelligence (AI) has the potential to greatly improve society, but as with any powerful technology, it comes with heightened risks and responsibilities. Current AI research lacks a systematic discussion of how to manage long-tail risks from AI systems, including speculative long-term risks. Keeping in mind the potential benefits of AI, there is some concern that building ever more intelligent and powerful AI systems could eventually result in systems that are more powerful than us; some say this is…

Cited by 5 publications (5 citation statements)
References 22 publications (29 reference statements)
“…Ethical considerations. An ethical concern one might have about our work is that revealing BNSL might differentially (Hendrycks & Mazeika, 2022) improve A(G)I capabilities progress relative to A(G)I safety/alignment progress. A counter-argument is that BNSL will also allow the A(G)I safety/alignment field to extrapolate the scaling behaviors of its methods for aligning A(G)I systems and as a result will also accelerate alignment/safety progress.…”
Section: Discussion
confidence: 99%
“…If AI systems are persuasive enough, for instance, if they were to be more persuasive than 99% of humans, their unregulated proliferation could lead to serious degradation in discourse between humans [28]. If systems are persuasive, this could lead to a degradation of truth or potentially reduce trust between humans and machines.…”
Section: AI-driven Persuasion Could Contribute To a Loss Of Human Con...
confidence: 99%
“…Talking points include problems such as: Weaponization, where governments are strongly incentivized to weaponize AI, which would significantly increase the risks of conflict; Enfeeblement, where important decisions may be handed off to AI, endangering humanity's capacity for self-governance (this scenario was depicted in the film WALL-E); Eroded epistemics, where nations, political parties, and many other actors are strongly incentivized to develop agents that spread propaganda, undermining our ability to seek truth; Proxy gaming, where AI may strongly shape human behavior in suboptimal ways, illustrated by addiction caused by social media recommendation algorithms; and Value lock-in, where advanced AI locks in the dominance of the nations or companies that develop it, curtailing capacity for social progress. We will discuss which practices should be implemented to ensure the development of beneficial AI.…”
Section: Part 3: The Difficulty Of Governing AI [25 Min]
confidence: 99%