2020
DOI: 10.3390/philosophies5040040

An AGI Modifying Its Utility Function in Violation of the Strong Orthogonality Thesis

Abstract: An artificial general intelligence (AGI) might have an instrumental drive to modify its utility function to improve its ability to cooperate, bargain, promise, threaten, and resist and engage in blackmail. Such an AGI would necessarily have a utility function that was at least partially observable and that was influenced by how other agents chose to interact with it. This instrumental drive would conflict with the strong orthogonality thesis since the modifications would be influenced by the AGI’s intelligence…

Citations: cited by 7 publications (9 citation statements)
References: 12 publications
“…History is replete with examples of 'power corrupts and absolute power corrupts absolutely' as well as singleton coercive government going wrong. The methods presented here are envisioned in an AGI society with specialized labor and voluntary, negotiated interactions that include permission to alter societal shared values that impact individual utility functions in order to prevent utility function changes from threatening present and future societies [10,14].…”
Section: 'Hard Take-off' and Automated AGI Government (mentioning)
confidence: 99%
“…As just alluded, a single heuristic, such as 'terminate all humans', or ethic, such as 'terminate all agents using resources inefficiently as defined by the following metric', added to a BCS could result in realization of the AGI existential threat, as could universal drives causing AGI to alter its utility function [14]. Thus, any alteration, especially forgery, of ethics modules or BCS must be detected.…”
Section: Detection of Behavior Control System (BCS) Forgery via Acyclic Graphs (mentioning)
confidence: 99%
“…As just alluded, a single heuristic, such as 'terminate all humans', or ethic, such as 'terminate all agents using resources inefficiently as defined by the following metric', added to a BCS could result in realization of the AGI existential threat, as could universal drives, such as simply wanting to improve its ability to achieve goals, causing AGI to alter its utility function [50]. Thus, any alteration, especially forgery, of ethics modules or BCS must be detected.…”
Section: Detection of Behavior Control System (BCS) Forgery via Acyclic Graphs (mentioning)
confidence: 99%