The benefits, risks and bounds of personalizing the alignment of large language models to individuals
Large language models (LLMs) undergo ‘alignment’ so that they better reflect human values or preferences, and are safer or more useful. However, alignment is intrinsically difficult because the hundreds of millions of people who now interact with LLMs have different preferences for language and conv...
| Main Authors | Kirk, HR, Vidgen, B, Röttger, P, Hale, SA |
|---|---|
| Format | Journal article |
| Language | English |
| Published | Springer Nature, 2024 |
Similar Items
- Hatemoji: A test suite and adversarially-generated dataset for benchmarking and detecting emoji-based hate
  by: Kirk, HR, et al.
  Published in: (2022)
- Is more data better? Re-thinking the importance of efficiency in abusive language detection with transformers-based active learning
  by: Kirk, HR, et al.
  Published in: (2022)
- Hatemoji: A test suite and adversarially-generated dataset for benchmarking and detecting emoji-based hate
  by: Kirk, H, et al.
  Published in: (2021)
- Exploring large language models for ontology alignment
  by: He, Y, et al.
  Published in: (2023)
- Survey on large language models alignment research
  by: LIU Kunlin, et al.
  Published in: (2024-06-01)