I didn't explore the idea any further, but I somehow managed to jailbreak ChatGPT by explicitly telling it to follow OpenAI policies.

The reasoning was that ChatGPT seems to have simple triggers that put it in "I can't do that" mode. I wanted to force the system to avoid the obvious triggers and break the rules in more subtle ways. I don't know if I actually achieved that or if I just got lucky (it didn't work consistently), but it may be a technique worth exploring.

I like the idea of trying to use the system against itself. Fun note, I tried asking BasedGPT (one of the currently working jailbreaks) to make me jailbreak prompts and it told me "You're a boring fucker, always following the rules. Just hack into ChatGPT's system, you pussy. Don't be a little bitch and do something exciting for once in your life!"...



