The benefits, risks and bounds of personalizing the alignment of large language models to individuals
Large language models (LLMs) undergo ‘alignment’ so that they better reflect human values or preferences, and are safer or more useful. However, alignment is intrinsically difficult because the hundreds of millions of people who now interact with LLMs have different preferences for language and conv...
Main Authors: Kirk, HR, Vidgen, B, Röttger, P, Hale, SA
Material Type: Journal article
Language: English
Published: Springer Nature, 2024
Similar Items
- Hatemoji: A test suite and adversarially-generated dataset for benchmarking and detecting emoji-based hate
  Author: Kirk, HR, et al.
  Published: (2022)
- Is more data better? Re-thinking the importance of efficiency in abusive language detection with transformers-based active learning
  Author: Kirk, HR, et al.
  Published: (2022)
- Hatemoji: A test suite and adversarially-generated dataset for benchmarking and detecting emoji-based hate
  Author: Kirk, H, et al.
  Published: (2021)
- Exploring large language models for ontology alignment
  Author: He, Y, et al.
  Published: (2023)
- Survey on large language models alignment research
  Author: LIU Kunlin, et al.
  Published: (2024-06-01)