Corrigibility

Nate Soares, Benja Fallenstein, Stuart Armstrong, Eliezer Yudkowsky

2015 · DBLP: conf/aaai/SoaresFAY15
AI and Ethics · Citations: 141

TLDR

The paper introduces the notion of corrigibility and analyzes utility functions that attempt to make an agent shut down safely when a shutdown button is pressed. The utility functions are meant to give the agent no incentive either to prevent the button from being pressed or to cause it to be pressed, and to ensure that the shutdown behavior propagates to any new subsystems the agent creates and persists when the agent self-modifies.