A quote from The Verge live blog: "Early red teaming showed that the model could help plan attacks" on things like schools. "We don't want to aid in illegal activity." So the model is made to act as a bad actor in order to test itself.
Based on that last sentence, I think I have a good idea now how AGI will take over humanity. ;)
(Not implying that the current tech is anywhere close to AGI.)
And this was a huge public concern when the internet/search engines initially got popular. "Omg you can Google the ingredients to build a bomb! This shouldn't be allowed!!"