Discovering agents (2023)
DOI: 10.1016/j.artint.2023.103963

Cited by 9 publications (7 citation statements)
References 32 publications

“…If there is more, what would that conception look like? Kenton et al (2021) provide a more elaborate discussion, and they end up with a broad and encompassing conceptualisation of manipulation, arguing from a safety perspective: the more phenomena covered, the safer the resulting design. But, as they acknowledge themselves, their conceptualisation may be "too wide-ranging" (Kenton et al, 2021, p. 11).…”
Section: Design For Values and Conceptual Engineering (mentioning)
confidence: 99%
“…However, disjunctive conceptions for identifying manipulation may be a solution. For example, in their discussion of the ethical alignment of language agents, Kenton et al (2021) reflect on the diversity of philosophical accounts of manipulation and opt for a disjunctive conception that combines several criteria that are discussed in the philosophical literature. Accordingly, they suggest that manipulation occurs by bypassing rationality, trickery, or pressure.…”
Section: Disjunctive Conceptions Of Manipulation (mentioning)
confidence: 99%
“…One reason is that goal‐directed AI agents are more likely to exhibit unpredictable or power‐seeking behaviour, which may make powerful systems especially dangerous (Bostrom, 2014; Omohundro, 2008). This thought has prompted interest in the nature of agency among AI safety researchers (e.g., Kenton et al, 2022). A second reason is that if artificial systems can be agents we will at some point confront problems of moral agency and responsibility in AI (Wallach & Allen, 2008).…”
Section: Introduction (mentioning)
confidence: 99%
“…Besides scaling, the most important one is the idea of alignment; i.e. creating an agent that behaves in accordance with what a human wants (Leike et al, 2018; Kenton et al, 2021). In the context of LLMs, this was first implemented and introduced in InstructGPT (Ouyang et al, 2022b), using an additional finetuning step on top of the original foundation model (GPT-3) via a combination of supervised and reinforcement learning with human feedback (RLHF).…”
Section: Scale Is All We Need? (mentioning)
confidence: 99%
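The last quoted passage describes, at a high level, the two-stage recipe behind InstructGPT: supervised fine-tuning on human demonstrations, followed by reinforcement learning from human feedback against a reward model fit to human preferences. As a minimal sketch of that pipeline only, the toy Python example below compresses it into a single bandit-style problem; the setup, names, data, and hyperparameters are illustrative assumptions, not the method of Ouyang et al. (2022b).

```python
# Toy sketch of the two-stage "alignment" recipe the quoted passage
# describes: (1) supervised fine-tuning on demonstrations, then
# (2) RLHF against a reward model fit to pairwise preferences.
# Everything here (bandit setup, data, step sizes) is an illustrative
# assumption, not the InstructGPT implementation.
import numpy as np

rng = np.random.default_rng(0)
N_ACTIONS = 5  # toy "vocabulary" of possible responses


def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()


# --- Stage 1: supervised fine-tuning on human demonstrations ---
logits = np.zeros(N_ACTIONS)  # the "policy" (a single bandit here)
demos = [2, 2, 3, 2]          # demonstrated (preferred) actions
for a in demos:
    p = softmax(logits)
    grad = -p                 # cross-entropy gradient: onehot(a) - p
    grad[a] += 1.0
    logits += 0.5 * grad

# --- Stage 2a: fit a reward model from pairwise preferences ---
# Bradley-Terry model: P(a preferred to b) = sigmoid(r[a] - r[b])
prefs = [(2, 0), (3, 1), (2, 4)]  # (preferred, rejected) pairs
r = np.zeros(N_ACTIONS)
for _ in range(200):
    for win, lose in prefs:
        p_win = 1.0 / (1.0 + np.exp(r[lose] - r[win]))
        r[win] += 0.1 * (1.0 - p_win)   # gradient ascent on log-likelihood
        r[lose] -= 0.1 * (1.0 - p_win)

# --- Stage 2b: REINFORCE the policy against the reward model ---
for _ in range(500):
    p = softmax(logits)
    a = rng.choice(N_ACTIONS, p=p)
    baseline = p @ r          # expected reward under the policy
    grad = -p                 # grad of log p(a): onehot(a) - p
    grad[a] += 1.0
    logits += 0.05 * (r[a] - baseline) * grad

print("final policy:", softmax(logits).round(3))
```

In the systems the quoted paper refers to, the policy and reward model are large language models and the policy update is PPO rather than plain REINFORCE; the toy keeps only the structure: demonstrations shape the policy first, then human preferences shape a reward that the policy is optimised against.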