Model merging and safety alignment: one bad model spoils the bunch
Merging Large Language Models (LLMs) is a cost-effective technique for combining multiple expert LLMs into a single versatile model, retaining the expertise of the original ones. However, current approaches often overlook the importance of safety alignment during merging, leading to highly misaligne...
Main Authors: | , , , , , , |
---|---|
Format: | Conference item |
Language: | English |
Published: | Association for Computational Linguistics, 2024 |