AI alignment and human reward
According to a prominent approach to AI alignment, AI agents should be built to learn and promote human values. However, humans value things in several different ways: we have desires and preferences of various kinds, and if we engage in reinforcement learning, we also have reward functions. One res...
Main Author: | |
---|---|
Format: | Conference item |
Language: | English |
Published: | Association for Computing Machinery, 2021 |