
> It is clearly imaginable that a very intelligent agent could end humanity if its objective would require so.

This is quite possible. Indeed, I don't believe this is exclusive to superintelligence, or requires it at all. Compare to the closest thing we have to "inventing AGI" - having babies. People do that all the time, and there's no mathematical guarantee that a given baby won't end humanity, but we don't do much to stop it, and it's not considered a problem. Mainly, why would it want to?

https://twitter.com/thejadedguy/status/844352570470645760?la...

I don't think superintelligence even gives an agent much of an advantage if it did want to. Being able to imagine a virus really well doesn't have much to do with the ability to create one, since plans tend to fail for surprising reasons in the real world once you start trying to follow them. Unless you define superintelligence as "it's right about everything all the time", but that seems like a magical power, not something we can invent.

> How exactly is "perpetual motion machines can't exist" related to this?

It wouldn't be able to do the particular kind of ending humanity where you turn everyone into paperclips, though it could do other things. There are plenty of ways to do it that increase entropy rather than decrease it - nuclear winter is one.



"Mainly, why would it want to [end humanity]?"

The anthropomorphism is misleading. No one expects that an AGI would "want to" in the commonplace sense of being motivated by animosity, fear, or desire. The problem is that the best path to satisfying its reward function could have consequences for humanity ranging from merely adverse to extinction-level, because alignment is hard, or maybe impossible.


But now you have appealed to anthropomorphism (“intelligence”) to pose a problem, yet forbidden anthropomorphism in an attempted counterargument. That doesn’t seem quite fair.


I don't intend to forbid anything - I just think the language of motivation and desire makes it harder to see the risks, because it introduces irrelevant questions into the conversation, like "how can machines want something?"

Conversely, at least in this discussion, the term "intelligence" seems pretty neutral.


> I just think the language of motivation and desire makes it harder to see the risks, because it introduces irrelevant questions into the conversation, like "how can machines want something?"

Yet discourse on existential AI risks is predicated on something like a "goal" (e.g. to maximise paperclips). Notions like "goal" also make it harder to see clearly what we are actually discussing.

> the term "intelligence" seems pretty neutral

Hmm, I'm not convinced. It seems like an extremely loaded term to me.


AIs absolutely do have goals, determined by their reward functions.

Yes, "intelligence" is a deeply loaded term. It just doesn't matter in the context of the discussion here; so far as I've seen, its ambiguities haven't been relevant.
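
To make the "goals come from reward functions" point concrete, here's a toy sketch (my own illustration with made-up names, not how any real system is built): the same agent code ends up pursuing completely different "goals" depending on which reward function you hand it.

    # Toy sketch: the agent's "goal" is just whatever its reward function scores highly.

    def make_agent(reward):
        # Greedy one-step agent: always picks the action the reward function prefers.
        def act(state, actions):
            return max(actions, key=lambda a: reward(state, a))
        return act

    # Two reward functions, two different "goals", identical agent code.
    paperclip_reward = lambda state, action: state["paperclips"] + (1 if action == "make_paperclip" else 0)
    helpful_reward   = lambda state, action: 1 if action == "help_human" else 0

    state = {"paperclips": 0}
    actions = ["make_paperclip", "help_human", "do_nothing"]

    print(make_agent(paperclip_reward)(state, actions))  # -> make_paperclip
    print(make_agent(helpful_reward)(state, actions))    # -> help_human

Nothing in there "wants" anything, but it still has a goal in the only sense that matters for the risk argument: it systematically steers toward whatever the reward function rewards.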


> AIs absolutely do have goals, determined by their reward functions.

You're confusing "AIs" (existing ML models) with "AGIs" (theoretical things that can do anything and are apparently going to take over the world). Not only is there no proof that AGIs can exist, there's no proof they can be made with fixed reward functions - and a fixed reward function would seem to make them less than "general".



You seem to be portraying people who worry about the long-term risks of AI as members of a religious cult. But you also acknowledge that AI could end humanity? The question of why an AI would want to kill us has been addressed before. Simplified: your atoms are useful for many objectives, and humans use resources and might plot against you.


> You seem to be portraying people who worry about the long-term risks of AI as members of a religious cult.

Strictly speaking, we can limit that to people who rearrange their lives around reacting to the possibility, even in sillier (yet not disprovable) forms like Roko's Basilisk.

People who believe that having a lot of "intelligence" means you can actually do anything you intend to do, no matter what that thing is, also come close to it, because both involve imagining a perfect being. That's a trap anyone can fall into - I guess it comes from assuming that since an AGI would be a computer plus a human, it gets all the traits of humans (intelligence and motivation) plus all the traits of computer programs (predictable execution, no emotions or boredom). It doesn't seem like that follows, though - boredom might be needed for online learning, which is needed to be an independent agent, and that might limit an AGI to human-level executive function.

The chance of dumb civilization-ending mistakes like nuclear war seems higher than smart civilization-ending mistakes like gray goo, and can't be defended against, so as a research direction I suggest finding a way to restore humans from backup. (https://scp-wiki.wikidot.com/scp-2000)



