Corrigibility
Nate Soares, Benja Fallenstein, Stuart Armstrong, Eliezer Yudkowsky
2015 · DBLP: conf/aaai/SoaresFAY15
AI and Ethics · Citations: 141
TLDR
The paper introduces the notion of corrigibility and analyzes utility functions intended to make an agent shut down safely when a shutdown button is pressed. The functions are meant to give the agent no incentive either to prevent the button from being pressed or to cause it to be pressed, and to ensure that the shutdown behavior carries over to any new subsystems the agent creates and survives self-modification.
