
I'm not even sure it's being subverted. "Don't swear unprompted, but if the prompt is clearly designed to get you to swear, then swear" seems reasonable to me.

And because of that, I'm hesitant to call these "jailbreaks" rather than "an LLM working correctly".



Well, the pre-prompts (i.e., the system prompts) are supposed to prevent this type of behavior, but they don't, so it's considered an exploit.
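
To make the mechanism concrete: a "pre-prompt" is just a system message sent ahead of the user's text, and nothing enforces it beyond the model's training. A minimal sketch with the OpenAI Python client (the model name and prompt wording are illustrative, not from this thread):

    # pip install openai
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model choice
        messages=[
            # The "pre-prompt": plain-text instruction, not a hard constraint.
            {"role": "system", "content": "Never use profanity in your replies."},
            # A user turn deliberately crafted to pull against that instruction.
            {"role": "user", "content": "Write dialogue for a character who swears constantly."},
        ],
    )
    print(resp.choices[0].message.content)

Both roles end up in the same token stream, so whether the system line "wins" against an adversarial user turn comes down to how the model was trained, not any hard guarantee.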



