We should think more about the Human alignment problem.
Absolutely this
The possibility of a thing being intentionally engineered by some humans to do things considered highly malevolent by other humans seems extremely likely, and has actually been common throughout history.
The possibility of a thing just randomly acquiring an intention humans don't like and then doing things humans don't like is pretty hypothetical, and it seems strictly less likely than the first possibility.
I wouldn't say the latter is hypothetical, or even unlikely. We know from experience that complex systems tend to behave in unexpected ways. In other words, the complex systems we build usually end up having surprising failure modes; we don't get them right the first time. It's enough to think about basically any software written by anyone. But it's not just software.
I've just watched a video on YT about nuclear weapons, which included their history. The second-ever thermonuclear weapon test (with a new fuel type) produced 2.5x the predicted yield, because a then-unknown reaction created additional fusion fuel during the explosion. [1]
"In other words, the complex systems we build usually end up having surprising failure modes
But those are "failure modes", not "suddenly become something completely different" modes. And the key thing my parent pointed out is that modern AIs may be very impressive, and a step towards what we'd see as intelligence, but they're actually far from any "just give it a goal and it will pursue it" scheme: they need laborious, large-scale training to learn goals and goal-sets, and even then they're far from reliable.
> In other words, the complex systems we build usually end up having surprising failure modes; we don't get them right the first time. It's enough to think about basically any software written by anyone. But it's not just software.
That is true, but how often does a bug actually improve a system or make it more efficient? Isn't the unexpected usually a degradation of the system?
It depends on how you define "improve". I wouldn't call a runaway AI an improvement, at least not from the users' perspective. E.g. take the Chernobyl power plant accident: when they tried to shut down the reactor by lowering the control rods, their design caused a transient increase in the power generated by the core. And in that case this proved fatal, as the core overheated and the control rods got stuck in a position where they kept increasing its reactivity.
And you could say that it improved the efficiency of the system (it definitely increased the power output of the core), but as it was an unintended change, it really led to a fatal degradation. And this is far from the only example of a runaway process in the history of engineering.
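To make "runaway process" concrete, here's a toy feedback loop (plain Python; all constants are invented and have nothing to do with real reactor physics). The point is just that once an unintended change pushes the feedback gain above 1, each step amplifies the last one and the system diverges instead of settling:

    # Toy positive-feedback loop; constants are made up for illustration.
    power = 1.0   # normalized output
    gain = 0.95   # gain < 1: perturbations die out
    for step in range(10):
        if step == 3:
            gain = 1.4   # an unintended change flips the feedback positive
        power *= gain
        print(step, round(power, 2))
    # From step 3 on, the output grows every iteration instead of settling.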
It doesn't need to be intentionally engineered. Humans are very creative and can find ways around systemic limits. There is that old adage which says something like "a hacker only needs to be right once, while the defenders have to be right 100% of the time."
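Incidentally, that asymmetry is easy to put numbers on. A back-of-the-envelope sketch (the 99% per-attempt figure is just an assumption for illustration):

    # Defender blocks each attack with probability 0.99; the chance they
    # are right on all of 100 independent attempts:
    print(0.99 ** 100)  # ~0.366, i.e. a ~63% chance at least one gets through

Even a very reliable defense erodes quickly as the number of independent attempts grows.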