One might even wonder whether including safety evaluations in the training data informs the model that unsafe behavior is a thing it could do.

Kind of like a pre-emptive warning to a kid backfiring because they had never considered the forbidden thing before the warning.



Comments like yours make the AI behave that way, though, since it is literally reading our comments and trying to behave according to our expectations.

The AI doom will happen due to all the AI doomposters.


Yep! That's another phrasing of the same idea!



