Measuring an artificial intelligence language model’s trust in humans using machine incentives
Will advanced artificial intelligence (AI) language models exhibit trust toward humans? Gauging an AI model’s trust in humans is challenging because—absent costs for dishonesty—models might respond falsely about trusting humans. Accordingly, we devise a method for incentivizing machine decisions wit...
Main Authors: | , |
---|---|
Format: | Article |
Language: | English |
Published: | IOP Publishing, 2024-01-01 |
Series: | Journal of Physics: Complexity |
Subjects: | |
Online Access: | https://doi.org/10.1088/2632-072X/ad1c69 |