2023
DOI: 10.1111/cogs.13315
The Puzzle of Evaluating Moral Cognition in Artificial Agents

Abstract: In developing artificial intelligence (AI), researchers often benchmark against human performance as a measure of progress. Is this kind of comparison possible for moral cognition? Given that human moral judgment often hinges on intangible properties like “intention” which may have no natural analog in artificial agents, it may prove difficult to design a “like-for-like” comparison between the moral behavior of artificial and human agents. What would a measure of moral behavior for both humans and AI look like…

Cited by 2 publications (2 citation statements)
References 47 publications
“…The postprocessing steps aim to reduce the raw model’s propensity to produce toxic responses as well as to make it implement a consistent “personality” in accord with product design goals. These steps are not always entirely effective in preventing LLMs from producing undesirable behaviors like toxic or harmful language, and “jailbreak” prompts which trick the model into responding inappropriately are still easy to discover and implement [47, 48].…”
Section: Introduction
Confidence: 99%
“…Some letters developed innovative ideas about core aspects of cognition, such as the nature of belief (Van Leeuwen & Lombrozo, 2023), perception and attention (Cleary, Irving, & Mills, 2023; Elber-Dorozko & Loewenstein, 2023; Yu & Lau, 2023), language and learning (Cohn & Schilperoord, 2022; Kapatsinski, 2023; Smalle & Möttönen, 2023), and reasoning and other aspects of high-level cognition (Franco & Murawski, 2023; Pirrone & Tsetsos, 2023). A few letters highlight recent developments at the intersection between technology and cognitive science, such as the influential emergence of Large Language Models (Contreras Kallens et al., 2023), technologies to preserve languages (Bensemann, Brown, Witbrock, & Yogarajan, 2023), and artificial intelligence and moral cognition (Reinecke et al., 2023).…”
Confidence: 99%