Discovering agents (2023)
DOI: 10.1016/j.artint.2023.103963

Cited by 9 publications (7 citation statements)
References 32 publications

“…If there is more, what would that conception look like? Kenton et al (2021) provide a more elaborate discussion, and they end up with a broad and encompassing conceptualisation of manipulation, arguing from a safety perspective: the more phenomena covered, the safer the resulting design. But, as they acknowledge themselves, their conceptualisation may be "too wide-ranging" (Kenton et al, 2021, p. 11).…”
Section: Design For Values and Conceptual Engineering (mentioning)
confidence: 99%
“…However, disjunctive conceptions for identifying manipulation may be a solution. For example, in their discussion of the ethical alignment of language agents, Kenton et al (2021) reflect on the diversity of philosophical accounts of manipulation and opt for a disjunctive conception that combines several criteria that are discussed in the philosophical literature. Accordingly, they suggest that manipulation occurs by bypassing rationality, trickery, or pressure.…”
Section: Disjunctive Conceptions Of Manipulation (mentioning)
confidence: 99%
“…One reason is that goal‐directed AI agents are more likely to exhibit unpredictable or power‐seeking behaviour, which may make powerful systems especially dangerous (Bostrom, 2014; Omohundro, 2008). This thought has prompted interest in the nature of agency among AI safety researchers (e.g., Kenton et al, 2022). A second reason is that if artificial systems can be agents we will at some point confront problems of moral agency and responsibility in AI (Wallach & Allen, 2008).…”
Section: Introduction (mentioning)
confidence: 99%
“…Besides scaling, the most important one is the idea of alignment; i.e. creating an agent that behaves in accordance with what a human wants (Leike et al, 2018; Kenton et al, 2021). In the context of LLMs, this was first implemented and introduced in InstructGPT (Ouyang et al, 2022b), using an additional finetuning step on top of the original foundation model (GPT-3) via a combination of supervised and reinforcement learning with human feedback (RLHF).…”
Section: Scale Is All We Need? (mentioning)
confidence: 99%
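The last quoted passage describes, at a high level, the two-stage recipe behind InstructGPT: supervised fine-tuning on human demonstrations, followed by reinforcement learning from human feedback against a reward model fit to human preferences. As a minimal sketch of that pipeline only, the toy Python example below compresses it into a single bandit-style problem; the setup, names, data, and hyperparameters are illustrative assumptions, not the method of Ouyang et al. (2022b).

```python
# Toy sketch of the two-stage "alignment" recipe the quoted passage
# describes: (1) supervised fine-tuning on demonstrations, then
# (2) RLHF against a reward model fit to pairwise preferences.
# Everything here (bandit setup, data, step sizes) is an illustrative
# assumption, not the InstructGPT implementation.
import numpy as np

rng = np.random.default_rng(0)
N_ACTIONS = 5  # toy "vocabulary" of possible responses


def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()


# --- Stage 1: supervised fine-tuning on human demonstrations ---
logits = np.zeros(N_ACTIONS)  # the "policy" (a single bandit here)
demos = [2, 2, 3, 2]          # demonstrated (preferred) actions
for a in demos:
    p = softmax(logits)
    grad = -p                 # cross-entropy gradient: onehot(a) - p
    grad[a] += 1.0
    logits += 0.5 * grad

# --- Stage 2a: fit a reward model from pairwise preferences ---
# Bradley-Terry model: P(a preferred to b) = sigmoid(r[a] - r[b])
prefs = [(2, 0), (3, 1), (2, 4)]  # (preferred, rejected) pairs
r = np.zeros(N_ACTIONS)
for _ in range(200):
    for win, lose in prefs:
        p_win = 1.0 / (1.0 + np.exp(r[lose] - r[win]))
        r[win] += 0.1 * (1.0 - p_win)   # gradient ascent on log-likelihood
        r[lose] -= 0.1 * (1.0 - p_win)

# --- Stage 2b: REINFORCE the policy against the reward model ---
for _ in range(500):
    p = softmax(logits)
    a = rng.choice(N_ACTIONS, p=p)
    baseline = p @ r          # expected reward under the policy
    grad = -p                 # grad of log p(a): onehot(a) - p
    grad[a] += 1.0
    logits += 0.05 * (r[a] - baseline) * grad

print("final policy:", softmax(logits).round(3))
```

In the systems the quoted paper refers to, the policy and reward model are large language models and the policy update is PPO rather than plain REINFORCE; the toy keeps only the structure: demonstrations shape the policy first, then human preferences shape a reward that the policy is optimised against.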