AI alignment and human reward
According to a prominent approach to AI alignment, AI agents should be built to learn and promote human values. However, humans value things in several different ways: we have desires and preferences of various kinds, and if we engage in reinforcement learning, we also have reward functions. One res...
| Main Author | Butlin, P |
|---|---|
| Format | Conference item |
| Language | English |
| Published | Association for Computing Machinery, 2021 |
Similar Items
- AI assertion
  by: Butlin, P, et al.
  Published: (2024)
- Transparent Value Alignment: Foundations for Human-Centered Explainable AI in Alignment
  by: Sanneman, Lindsay
  Published: (2023)
- Building Blocks for Human-AI Alignment: Specify, Inspect, Model, and Revise
  by: Booth, Serena Lynn
  Published: (2024)
- AI Ethics and Value Alignment for Nonhuman Animals
  by: Soenke Ziesche
  Published: (2021-04-01)
- Analyzing the Alignment between AI Curriculum and AI Textbooks through Text Mining
  by: Hyeji Yang, et al.
  Published: (2023-09-01)