People who are worried about this should be even more upset about OpenAI than everyone else is.
What we've discovered is:
- LLMs are really hard to align and are prone to random off-script behavior.
- LLMs are extremely difficult to map and understand.
- LLMs are extremely vulnerable to prompt injection and instruction tampering on multiple axes, and nobody really knows how to solve that problem. It is trivially easy to get an LLM to go off-script and ignore previous instructions.
- LLMs are prone to "hallucinating" information (sometimes even after they've been presented with correct information), and when they're not aligned well they'll even argue with users about those hallucinations and then threaten them and berate them for disagreeing.
- LLM prompts often have strange results that are unpredictable.
- OpenAI has bad security practices.
- The intersection of AI and Capitalism has led to a huge explosion in hype with very little (if any) regard to safety mechanisms and best practices.
- Every company and their dog are all getting into launching bigger and bigger models and wiring them into more and more systems, and the only concern any of them have is who will be the first to market?
----
So if you're worried about a superintelligence, then:
- You probably shouldn't want that superintelligence to be an LLM at all. You should probably want people to be researching completely separate training techniques that are easier to align. It would be better if that superintelligence is not built on a foundation that is so volatile and unpredictable and hard to reason with.
- You probably shouldn't want OpenAI to build it, whatever it ends up being.
- You probably shouldn't want Silicon Valley to build it either, because the entire culture is based around rushing products to market and disregarding safety.
- You probably shouldn't want it trained on random globs of Internet data.
- Basically, you should be terrified of who is currently building those AIs, how they're being built, and why they're being built.
----
I'm not personally worried about OpenAI inventing a superintelligence; I think a lot of people are doing a lot of anthropomorphism right now and this increasingly smells like a hype bubble to me.
But if I was worried about OpenAI inventing a superintelligence, I would be criticizing the company's security and bad practices and reckless rushing to market even harder than I already am right now. OpenAI would be an absolutely horrible company to entrust with an AGI. So would Google/Facebook/Microsoft/Apple. If I actually believed that those companies had the potential to literally end humanity, I would be doing everything in my power to make their initial AI products fail miserably and to crash the AI market.
If you're fearful about the existential risks of AI, you should consider the people complaining about the current-day less theoretical risks as if they're allies, not out-of-touch enemies. All of the current-day risks should be treated as alarm bells signaling larger potential risks in the context of a superintelligence.
What we've discovered is:
- LLMs are really hard to align and are prone to random off-script behavior.
- LLMs are extremely difficult to map and understand.
- LLMs are extremely vulnerable to prompt injection and instruction tampering on multiple axes, and nobody really knows how to solve that problem. It is trivially easy to get an LLM to go off-script and ignore previous instructions.
- LLMs are prone to "hallucinating" information (sometimes even after they've been presented with correct information), and when they're not aligned well they'll even argue with users about those hallucinations and then threaten them and berate them for disagreeing.
- LLM prompts often have strange results that are unpredictable.
- OpenAI has bad security practices.
- The intersection of AI and Capitalism has led to a huge explosion in hype with very little (if any) regard to safety mechanisms and best practices.
- Every company and their dog are all getting into launching bigger and bigger models and wiring them into more and more systems, and the only concern any of them have is who will be the first to market?
----
So if you're worried about a superintelligence, then:
- You probably shouldn't want that superintelligence to be an LLM at all. You should probably want people to be researching completely separate training techniques that are easier to align. It would be better if that superintelligence is not built on a foundation that is so volatile and unpredictable and hard to reason with.
- You probably shouldn't want OpenAI to build it, whatever it ends up being.
- You probably shouldn't want Silicon Valley to build it either, because the entire culture is based around rushing products to market and disregarding safety.
- You probably shouldn't want it trained on random globs of Internet data.
- Basically, you should be terrified of who is currently building those AIs, how they're being built, and why they're being built.
----
I'm not personally worried about OpenAI inventing a superintelligence; I think a lot of people are doing a lot of anthropomorphism right now and this increasingly smells like a hype bubble to me.
But if I was worried about OpenAI inventing a superintelligence, I would be criticizing the company's security and bad practices and reckless rushing to market even harder than I already am right now. OpenAI would be an absolutely horrible company to entrust with an AGI. So would Google/Facebook/Microsoft/Apple. If I actually believed that those companies had the potential to literally end humanity, I would be doing everything in my power to make their initial AI products fail miserably and to crash the AI market.
If you're fearful about the existential risks of AI, you should consider the people complaining about the current-day less theoretical risks as if they're allies, not out-of-touch enemies. All of the current-day risks should be treated as alarm bells signaling larger potential risks in the context of a superintelligence.