Adversarial Prompt Transformation for Systematic Jailbreaks of LLMs

The rapid integration of Large Language Models (LLMs) like OpenAI’s GPT series into diverse sectors has significantly enhanced digital interactions but also introduced new security challenges, notably the risk of "jailbreaking," where inputs cause models to deviate from their operational guidelines.
Bibliographic Details
Main Author: Awoufack, Kevin E.
Other Authors: Kagal, Lalana
Format: Thesis
Published: Massachusetts Institute of Technology, 2024
Online Access: https://hdl.handle.net/1721.1/157167