We should think more about the Human alignment problem.
Absolutely this
The possibility of a thing being intentionally engineered by some humans to do things considered highly malevolent by other humans seems extremely likely, and has actually been common throughout history.
The possibility of a thing just randomly acquiring an intention humans don't like and then doing things humans don't like is pretty hypothetical, and it seems strictly less likely than the first possibility.
I wouldn't say the latter is hypothetical, or even unlikely. We know from experience that complex systems tend to behave in unexpected ways. In other words, the complex systems we build usually end up having surprising failure modes; we don't get them right the first time. It's enough to think about basically any software written by anyone. But it's not just software.
I've just watched a video on YT about nuclear weapons, which included their history. The second-ever thermonuclear weapon test (with a new fuel type) produced 2.5x the predicted yield, because a then-unknown reaction created additional fusion fuel during the explosion. [1]
"In other words, the complex systems we build usually end up having surprising failure modes
But those are "failure modes", not "suddenly become something completely different" modes. And the key thing my parent pointed out is that modern AIs may be very impressive, and a step towards what we'd see as intelligence, but they're actually far from any "just give it a goal and it will pursue it" scheme: they need laborious, large-scale training to learn goals and goal-sets, and even then they're far from reliable.
> In other words, the complex systems we build usually end up having surprising failure modes; we don't get them right the first time. It's enough to think about basically any software written by anyone. But it's not just software.
That is true, but how often does a bug actually improve a system or make it more efficient? Isn't the unexpected usually a degradation of the system?
It depends on how you define "improve". I wouldn't call a runaway AI an improvement, at least not from the users' perspective. E.g. take the Chernobyl power plant accident: when they tried to shut down the reactor by lowering the control rods, their design caused a transient increase in the power generated by the core. And in that case this proved fatal, as the core overheated and the control rods got stuck in a position where they kept increasing its reactivity.
And you could say that it improved the efficiency of the system (it definitely increased the power output of the core), but as it was an unintended change, it really led to a fatal degradation. And this is far from the only example of a runaway process in the history of engineering.
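To make "runaway process" concrete, here's a toy feedback loop (plain Python; all constants are invented and have nothing to do with real reactor physics). The point is just that once an unintended change pushes the feedback gain above 1, each step amplifies the last one and the system diverges instead of settling:

    # Toy positive-feedback loop; constants are made up for illustration.
    power = 1.0   # normalized output
    gain = 0.95   # gain < 1: perturbations die out
    for step in range(10):
        if step == 3:
            gain = 1.4   # an unintended change flips the feedback positive
        power *= gain
        print(step, round(power, 2))
    # From step 3 on, the output grows every iteration instead of settling.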
It doesn't need to be intentionally engineered. Humans are very creative and can find ways around systemic limits. There is that old adage which says something like "a hacker only needs to be right once, while the defenders have to be right 100% of the time."
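Incidentally, that asymmetry is easy to put numbers on. A back-of-the-envelope sketch (the 99% per-attempt figure is just an assumption for illustration):

    # Defender blocks each attack with probability 0.99; the chance they
    # are right on all of 100 independent attempts:
    print(0.99 ** 100)  # ~0.366, i.e. a ~63% chance at least one gets through

Even a very reliable defense erodes quickly as the number of independent attempts grows.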