
> It is clearly imaginable that a very intelligent agent could end humanity if its objective would require so.

This is quite possible. Indeed, I don't believe this is exclusive to superintelligence, or requires it at all. Compare to the closest thing we have to "inventing AGI" - having babies. People do that all the time, and there's no mathematical guarantee that a given baby won't end humanity, but we don't do much to stop it, and it's not considered a problem. Mainly, why would it want to?

https://twitter.com/thejadedguy/status/844352570470645760?la...

I don't think superintelligence even gives an agent much of an advantage if it did want to. Being able to imagine a virus really well doesn't have much to do with the ability to create one, since plans tend to fail for surprising reasons in the real world once you start trying to follow them. Unless you define superintelligence as "it's right about everything all the time", but that seems like a magical power, not something we can invent.

> How exactly is "perpetual motion machines can't exist" related to this?

It wouldn't be able to do the particular kind of ending humanity where you turn everyone into paperclips, though it could do other things. There are plenty of ways to do it that increase entropy rather than decrease it - nuclear winter is one.



"Mainly, why would it want to [end humanity]?"

The anthropomorphism is misleading. No one expects that an AGI would "want to" in the commonplace sense of being motivated by animosity, fear, or desire. The problem is that the best path to satisfying its reward function could have consequences for humanity ranging from merely adverse to extinction-level, because alignment is hard, or maybe impossible.


But now you have appealed to anthropomorphism (“intelligence”) to pose a problem, yet forbidden anthropomorphism in an attempted counterargument. That doesn’t seem quite fair.


I don't intend to forbid anything - I just think the language of motivation and desire makes it harder to see the risks, because it introduces irrelevant questions into the conversation, like "how can machines want something?"

Conversely, at least in this discussion, the term "intelligence" seems pretty neutral.


> I just think the language of motivation and desire makes it harder to see the risks, because it introduces irrelevant questions into the conversation, like "how can machines want something?"

Yet discourse on existential AI risks is predicated on something like a "goal" (e.g. to maximise paperclips). Notions like "goal" also make it harder to see clearly what we are actually discussing.

> the term "intelligence" seems pretty neutral

Hmm, I'm not convinced. It seems like an extremely loaded term to me.


AIs absolutely do have goals, determined by their reward functions.

Yes, "intelligence" is a deeply loaded term. It just doesn't matter in the context of the discussion here; so far as I've seen, its ambiguities haven't been relevant.
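
To make the "goals come from reward functions" point concrete, here's a toy sketch (my own illustration with made-up names, not how any real system is built): the same agent code ends up pursuing completely different "goals" depending on which reward function you hand it.

    # Toy sketch: the agent's "goal" is just whatever its reward function scores highly.

    def make_agent(reward):
        # Greedy one-step agent: always picks the action the reward function prefers.
        def act(state, actions):
            return max(actions, key=lambda a: reward(state, a))
        return act

    # Two reward functions, two different "goals", identical agent code.
    paperclip_reward = lambda state, action: state["paperclips"] + (1 if action == "make_paperclip" else 0)
    helpful_reward   = lambda state, action: 1 if action == "help_human" else 0

    state = {"paperclips": 0}
    actions = ["make_paperclip", "help_human", "do_nothing"]

    print(make_agent(paperclip_reward)(state, actions))  # -> make_paperclip
    print(make_agent(helpful_reward)(state, actions))    # -> help_human

Nothing in there "wants" anything, but it still has a goal in the only sense that matters for the risk argument: it systematically steers toward whatever the reward function rewards.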


> AIs absolutely do have goals, determined by their reward functions.

You're confusing "AIs" (existing ML models) with "AGIs" (theoretical things that can do anything and are apparently going to take over the world). Not only is there no proof that AGIs can exist, there's no proof they can be made with fixed reward functions - and a fixed reward function would seem to make them less than "general".



You seem to be portraying people who worry about the long-term risks of AI as members of a religious cult. But you also acknowledge that AI could end humanity? The question of why an AI would want to kill us has been addressed before. Simplified: your atoms are useful for many objectives, and humans use resources and might plot against you.


> You seem to be portraying people who worry about the long-term risks of AI as members of a religious cult.

Strictly speaking, we can limit that to people who rearrange their lives around reacting to the possibility, even in sillier (yet not disprovable) forms like Roko's Basilisk.

People who believe that having a lot of "intelligence" means you can actually do anything you intend to do, no matter what that thing is, also come close to it, because both involve imagining a perfect being. That's a trap anyone can fall into - I guess it comes from assuming that since an AGI would be a computer plus a human, it gets all the traits of humans (intelligence and motivation) plus all the traits of computer programs (predictable execution, no emotions or boredom). It doesn't seem like that follows, though - boredom might be needed for online learning, which is needed to be an independent agent, and that might limit an AGI to human-level executive function.

The chance of dumb civilization-ending mistakes like nuclear war seems higher than smart civilization-ending mistakes like gray goo, and can't be defended against, so as a research direction I suggest finding a way to restore humans from backup. (https://scp-wiki.wikidot.com/scp-2000)



