AI alignment and human reward
According to a prominent approach to AI alignment, AI agents should be built to learn and promote human values. However, humans value things in several different ways: we have desires and preferences of various kinds, and if we engage in reinforcement learning, we also have reward functions. One res...
Main Author: | |
---|---|
Format: | Conference item |
Language: | English |
Published: | Association for Computing Machinery, 2021 |