
the current LLMs are trivial to jailbreak without an additional censorship layer, which cloud models implement via a second pass over their own output (and, dystopically, erasing their incomplete output right in front of the user's eyes when wrongthink is detected). even gpt-oss, with its SOTA lobotomy and heavily sterilized training data, is being used for things its creators would ostensibly abhor.
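for anyone unfamiliar with the pattern: here's a minimal sketch of what that "second pass" looks like, assuming a streaming API and a stand-in keyword classifier (all names here are illustrative, not any vendor's actual API):

```python
# Sketch of second-pass output moderation: tokens are streamed to the
# user while the accumulated text is re-checked after every token; on a
# flag, everything already shown is retracted and replaced.
# moderation_flag() is a hypothetical stand-in for a real classifier.

def moderation_flag(text: str) -> bool:
    """Stand-in classifier: flags output containing a blocked term."""
    blocked = {"forbidden"}
    return any(term in text.lower() for term in blocked)

def stream_with_second_pass(tokens):
    """Yield-free demo: accumulate tokens, re-checking after each one.

    Returns the full output, or a redaction notice if any prefix of the
    stream trips the classifier (the "erased in front of your eyes" UX).
    """
    shown = []
    for tok in tokens:
        shown.append(tok)
        if moderation_flag("".join(shown)):
            return "[content removed]"  # retract the partial output
    return "".join(shown)
```

note that the check runs on every prefix, not just the final text, which is why users see partially rendered answers vanish mid-generation.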



