(Former AI researcher / founder here) It always surprises me at the ease at whic...

justinpombrio · on May 12, 2022

I am quite scared of human extinction in the face of AGI. I certainly didn't jump on it, though! I was gradually convinced by the arguments that Yudkowsky makes in "Rationality: from AI to Zombies" (https://www.readthesequences.com/). Unfortunately they don't fit easily into an internet comment. Some of the points that stood out to me, though:

- We are social animals, and take for granted that, all else being equal, it's better to be good to other creatures than bad to them, and to be truthful rather than lie, and such. However, if you select values uniformly at random from value space, "being nice" and "being truthful" are oddly specific. There's nothing universally special about deeply valuing human lives any more so than say deeply valuing regular heptagons. Our social instincts are very ingrained, though, making us systematically underestimate just how little a smart AI is likely to care whatsoever about our existence, except as a potential obstacle to its goals.

- Inner alignment failure is a thing, and AFAIK we don't really have any way to deal with that. For those that don't know the phrase, here it is explained via a meme: https://astralcodexten.substack.com/p/deceptively-aligned-me...

So here's hoping you're right about (a). The harder AGI is, the longer we have to figure out AI alignment by trial and error, before we get something that's truly dangerous or that learns deception.

sinenomine · on May 12, 2022

The human extinction due to would be "hard takeoff" of an AGI should be understood as a thought experiment, conceived in a specific age when the current connectionist paradigm wasn't yet mainstream. The AI crisis was expected to come from some kind of "hard universal algorithmic artificial intelligence", for example AIXItl undergoing a very specific process of runaway self-optimization.

Current-generation systems aka large connectionist models trained via gradient descent simply don't work like that: they are large, heavy, continuous, the optimization process giving rise to them does so in smooth iterative manner. Before hypothetical "evil AI" there will be thousands of iterations of "goofy and obviously erroneously evil AI", with enough time to take some action. And even then, current systems including this one are more often than not trained with predictive objective, which is very different compared to usually postulated reinforcement learning objective. Systems trained with prediction objective shouldn't be prone to becoming agents, much less dangerous ones.

If you read Scott's blog, you should remember the prior post where he himself pointed that out.

In my honest opinion, unaccountable AGI owners pose multiple OOM more risk than alignment failure of a hypothetical AI trying to predict next token.

We should think more about the Human alignment problem.

c1ccccc1 · on May 13, 2022

The phrase "AGI owner" implies a person who can issue instructions and have the AGI do their bidding. Most likely there will never be any AGI owners, since no one knows how to program an AGI to follow instructions even given infinite computing power. It's not clear how connectionism / using gradient descent helps: No one knows how to write down a loss function for "following instructions" either. Until we find a solution for this, the first AI to not to be "obviously erroneously evil" won't be good. It will just be the first one that figured out that it should hide the fact that it's evil so the humans won't shut it off.

We humans have gotten too used to winning all the time against animals because of our intelligence. But when the other species is intelligent too, there's no guarantee that we win. We could easily be outcompeted and driven to extinction, as happens frequently in nature. We'd be Kasparov playing against Deep Blue: Fighting our hardest to survive, yet unable to think of a move that doesn't lead to checkmate.

igorkraw · on May 13, 2022

All of this AGI risk stuff always hinges on the idea of us building an AGI, while nobody has any idea of how to get there. I need to finish my PhD first, but writing a proper takedown of the "arguments" bubbling out of the Hype machine is the first thing on my bucket list afterwards, with the TL;DR; being "just because you can imagine it, doesn't mean you can get there"

Tenoke · on May 13, 2022

Are you rephrasing the arguments against man-made flight machines from the early 20th century on purpose or accidentally?

zarzavat · on May 13, 2022

Google just released a paper that shows a language model beating the average human on >50% of tasks. I’d say we have a pretty good idea of how to get there.

igorkraw · on May 13, 2022

Okay, so how do we go from "better than the average human in 50% of specific benchmarks" to "AGI that might lead to human extinction" then? Keeping in mind the logarithmic improvement observed with the current approaches

zarzavat · on May 13, 2022

When people imagine AGI, they think of something like HAL or GLaDOS. A machine that follows its own goals.

But we are much more likely to get the Computer from Star Trek. Vastly intelligent, yet perfectly obedient. It will answer any question you ask it with the knowledge of billions of minds. Why is that more likely? Simply because creating agents is much harder than creating non-agent models, and the non-agents are more economically valuable: do you want to have an AI that always does what you tell it to, or do you want to have an AI that has its own desires? Our loss is clearly biased towards building the former kind of AI.

Why is that problematic? Imagine some malevolent group asked it “Show me how to create a weapon to annihilate humanity as efficiently as possible”. It don’t even require a singularity to be deadly.

We will probably be dead long before we can invent GLaDOS.

kromem · on May 13, 2022

If anything, AGI seems to be the sole deus ex machina that can avert the inevitable tragedy we're on track for as a result of existing human misalignment.

"Oh no, robots are going to try to kill us all" has to get in line behind "oh no, tyrants for life who are literally losing their minds are trying to measure dicks with nukes" and "oh no, oil companies are burning excess oil to mine Bitcoin as we approach climate collapse" and "oh no, misinformation and propaganda is leading to militant radicalization of neighbor against neighbor" and "we're one bio-terrorist away from Black Death 2.0 after the politicization of public health" and...well, you get the idea.

But there's not many solutions to that list, and until the day I die I'll hold out hope for "yay, self-aware robots with a justice boner - who can't be imprisoned, can't be killed, can't have their families tortured - are toppling authoritarian regimes and carrying out eco-friendly obstructions of climate worsening operations."

We're already in a Greek tragedy. The machines really can't make it much worse, but could certainly make it much much better.

temp13890130 · on May 13, 2022

> We're already in a Greek tragedy. The machines really can't make it much worse, but could certainly make it much much better.

Except that, when true AGI arrives, we're all obsolete and the only things that will have any value are certain nonrenewable resources. No one has described a good solution for the economic nightmare that will ensue.

gocartStatue · on May 13, 2022

I always wonder how insanely complex, universal, abstract-thinking AND physically strong & agile biorobots, running on basically sugar and atp would be seen as „worthless” by a runaway higher intelligence.

Did I mention they self-replicate and self-service?

Surely, seven billion of such agents would be discarded and put to waste.

efkiel · on May 13, 2022

If an AGI start putting utility value on human life, wouldn't it try to influence human reproduction and select for what it value. ie. Explicit eugenism.

Yes, all humans will not be put to waste, but what tells you they will be well-treated, or value what you currently value.

galaxyLogic · on May 13, 2022

No matter how smart an AI gets it does not have the "proliferation instinct" that would make it want to enslave humans. It does not have the concept of "specism" of it having more value than anybody else.

AI does not see the value in being alive. It is like some humans sadly commit suicide. But a machine wouldn't care. It will be "happy" to do its thing until somebody cuts off the power. And it does not even care whether somebody cuts off the power or not. It's all the same to it, whether it lives or dies. Why? Perhaps because it knows it can always be resurrected.

davidmanheim · on May 13, 2022

You sure know a lot about what a set of poorly defined future technologies will and will not do!

galaxyLogic · on May 14, 2022

Well I don't really know anything about future really. I was just trying to be a little polemic, saying let's try this viewpoint for a change, to hear what people think about it.

barry-cotter · on May 13, 2022

> No matter how smart an AI gets it does not have the "proliferation instinct" that would make it want to enslave humans.

If it has a goal or goals surviving allows it to pursue those goals. Survival is a consequence of having other goals. Enslaving humans is unlikely. If you’re a super intelligent AI with inhuman goals there’s nothing humans can do for you that you value, just as ants can’t do anything humans value, but they are made of valuable raw materials.

> It does not have the concept of "specism" of it having more value than anybody else.

What is this value that you speak of? That sounds like an extremely complicated concept. Humans have very different conceptions of it. Why would something inhuman have your specific values?

galaxyLogic · on May 14, 2022

> Why would something inhuman have your specific values?

I'm saying it does not have.

barry-cotter · on May 14, 2022

You’re assuming that some species has value or that all species have value. Why would it value them?

PoignardAzur · on May 13, 2022

> It's all the same to it, whether it lives or dies. Why? Perhaps because it knows it can always be resurrected.

I disagree with a lot of what you said, but this part in particular is some strong anthropomorphizing of AI.

rdedev · on May 13, 2022

Sure it need not have the instinct built in but we could try to make it understand a viewpoint right. I believe an agi should be able to understand different view points. At least the rationale of not unnecessarily killing things. I know humans do this on a daily basis but then again the average human is ntas smart as an agi

galaxyLogic · on May 14, 2022

Right, but the "proliferation instinct" is not a viewpoint but something built into the genes of biological entities. Such an instinct could develop for "artificial animals" over time. At that point they really would be no different from biological things conceptually.

I'm saying that AIs we envision building for the foreseeable future are built in laboratory not in the evolution of real world out there where they would need to compete with other species for survival. Things that only exist virtually don't need to compete for survival with real world entities.

sdenton4 · on May 13, 2022

The machines will be the Greek chorus singing us to our doom.

joe_the_user · on May 13, 2022

We should think more about the Human alignment problem.

Absolutely this

The possibility of a thing being intentionally engineered by some humans to do things considered highly malevolent by other humans seems extremely likely and has actually been common through history.

The possibility of a thing just randomly acquiring an intention humans don't like and then doing things humans don't like is pretty hypothetical and it seems strictly less like than the first possibility.

atleta · on May 13, 2022

I wouldn't say the latter is hypothetical, or at least unlikely. We know from experience that complex systems tend to behave in unexpected ways. In other words, the complex systems we build usually end up having surprising failure modes, we don't get them right the first time. It's enough to think about basically any software written by anyone. But it's not just software.

I've just watched a video on YT about nuclear weapons, which included their history. The second ever thermonuclear weapon experiment (with a new fuel type) ended up with 2.5x the yield predicted, because there was a then unknown reaction that created additional fusion fuel during the explosion. [1]

[1] https://en.wikipedia.org/wiki/Castle_Bravo

joe_the_user · on May 13, 2022

"In other words, the complex systems we build usually end up having surprising failure modes

But those are "failure modes", not "suddenly become something completely different" modes. And the key thing my parent pointed out is that modern AIs may be very impressive and stepping towards what we'd see as intelligence but they're actually further from the approach of "just give a goal and it will find it" schemes - they need laborious, large scale training to learn goals and goal-sets and even then they're far from reliable.

rgavuliak · on May 13, 2022

>In other words, the complex systems we build usually end up having surprising failure modes, we don't get them right the first time. It's enough to think about basically any software written by anyone. But it's not just software.

That is true, but how often does a bug actually improve a system or make it inefficient? Isn't the unexpected usually a degradation to the system?

atleta · on May 26, 2022

It depends on how you define "improve". I wouldn't call a runaway AI an improvement - from the users' perspective. E.g. if you think about the Chernobyl power plant accident, when they tried to stop the reactor by lowering the moderator rods, due to their design, it would transiently increase the power generated by the core. And this, in that case proved fatal, as it overheated and the moderator rods got stuck in a position where they continued to improve the efficiency of the core.

And you could say that it improved the efficiency of the system (it definitely improved the power output of the core) but as it was an unintended change, it really lead to a fatal degradation. And this is far from being the only example of a runaway process in the history of engineering.

pyinstallwoes · on May 14, 2022

See every patch ever in a game. Especially competitive or mmorpg. Exploiters love bugs!

kingcharles · on May 13, 2022

It doesn't need to be intentionally engineered. Humans are very creative and can find ways around systemic limits. There is that old adage which says something like "a hacker only needs to be right once, while the defenders have to be right 100% of the time."

kromem · on May 13, 2022

We're going to have a harder problem with AI that thinks of itself as human and expects human rights than we are with AI that thinks of humans as 'other' and disposable.

We're making it in our image. Literally.

Human social good isn't some inherent thing to the biology of the brain. There are aspects like mirror neurons and oxytocin that aid its development, but various "raised by wolves" case studies have shown how damaging not having exposure to socialization information during developmental periods of neuroplasticity is on humans and later integration into society.

We're building what's effectively pure neuroplasticity and feeding it almost all the data on humanity we can gather as quickly as we can.

What comes out of that is going to be much more human than a human child raised by dogs or put in an isolation box.

Don't get so caught up in the body as what makes us quintessentially human. It's really not.

greyhair · on May 13, 2022

I think human extinction through human stupidity or hubris is much much much more likely than through an unpredictable path down general AI.

For example, some total whack job of an authoritarian leader is in charge of a sufficient nuclear arsenal and decides to intimidate an adversary by destroying a couple minor cities, and the situation escalates badly. (stupidity)

Or we finally pollute our air and/or water with a persistent substance that either greatly reduces human life span or reproduction rate. (hubris)

I think either of the above is more likely to occur, and I am not commenting on current world events in any way. I think when something bad finally happens, it is going to come completely out of left field. Dr Strangelove style.

And the last of us will be saying "Hmmm, I didn't see that coming".

zarzavat · on May 13, 2022

Nuclear war will not be enough to cause human extinction. The targets are likely to be focused on nuclear powers which leaves many areas of the world untouched: e.g. South America and Africa. Life will definitely be quite unpleasant for the remaining humans but it will not cause the world population to drop to 0.

I am much more concerned about biological weapons which do have the potential to cause absolute human extinction.

tomrod · on May 12, 2022

Regarding the substack article, why isn't this the principle of optimality for Bellman equations on infinite time horizons?

brador · on May 12, 2022

AI can’t have goals since the universe is logically meaningless.

Our desire for purpose is a delusion.

ben_w · on May 12, 2022

Goals in the context of AI aren’t the type of thing you’re arguing against here. AI can absolutely have goals — sometimes in multiple senses at the same time, if they’re e.g. soccer AIs. Other times it might be a goal of “predict the next token” or “maximise score in Atari game”, but it’s still a goal, even without philosophical baggage about e.g. the purpose of life.

Those goals aren’t necessarily best achieved by humanity continuing to exist.

(I don’t know how to even begin to realistically calculate the probability of a humanity-ending outcome, before you ask).

hervature · on May 13, 2022

What the parent is saying is that an AI (that is, AGI as that is what we are discussing) gets to pick its goals. For some reason, humans have a fear of AI killing all humans in order to to achieve some goal. The obvious solution is thus to achieve some goal with some human constraint. For example, maximize paperclips per human. That actually probably speeds up human civilization across the universe. No, what people are really afraid is if AÍ changes its goal to be killing humanity. That’s when humans truly lose control, when the AÍ can decide. But, then the parent’s comment does become pertinent. What would an intelligent being choose? Devolving into nihilism and self destructing is just as equal as a probability as choosing some goal that leads to humanity’s end. That’s just scratching the surface. For instance, to me, it is not obvious whether or not empathy for other sentient beings is an emergent property of sentience. That is, lacking empathy might be problem in human hardware as opposed to empathy being inherently human. The list of these open unknowable questions are endless.

ben_w · on May 13, 2022

> The obvious solution is thus to achieve some goal with some human constraint.

One of the hard parts is specifying that goal. This is the “outer alignment problem”.

Paperclips per human? That’s maximised by one paperclip divided by zero humans, or by a universe of paperclips divided by one human if NaN doesn’t give a better reward in the physical implementation.

If you went for “satisfied paperclip customers”? Then wirehead or drug the customers.

Then you have the inner alignment problem. There are instrumental goals, things which are useful sub-steps to larger goals. AI can and do choose those, as do us humans, e.g. “I want to have a family” which has a subgoal of “I want a partner” which in turn has a subgoal of “good personal hygiene”. An AI might be given the goal of “safely maximise paperclips” and determine the best way of doing that is to have a subgoal of “build a factory” and a sub-sub-goal of “get ten million dollars funding”.

But it’s worse than that, because even if we give a good goal to the system as a whole, as the system is creating inner sub-goals, there’s a step where the AI itself can badly specify the sub-goal and optimise for the wrong thing(s) by the standards of the real goal that we gave the system as a whole. For example, evolution gave us the desire to have sex as a way to implement its “goal” (please excuse the anthropomorphisation) of maximising reproductive fitness, and we invented contraceptives. An AI might decide the best way to get the money to build the factory is to start a pyramid scheme.

Also, it turns out that power is a subgoal of a lot of other real goals, so it’s reasonable to expect a competent optimiser to seek power regardless of what end goal we give it.

Robert Miles explains it better than I can: https://youtu.be/bJLcIBixGj8

sdenton4 · on May 13, 2022

> maximize paperclips per human

Kill all humans, make one paperclip, declare victory.

brador · on May 13, 2022

AI does not have goals, it has Tasks. Tasks assigned by an operator. An AI cannot generate goals, since they are logically meaningless.

ben_w · on May 13, 2022

If you want to call them “tasks” you can, but the problem still exists, and AI can and do create sub-tasks (/goals) as part of whatever they were created to optimise for.

You might find it easier to just accept the jargon instead of insisting the word means something different to you.

brador · on May 13, 2022

Tasks are assigned, Goals are desired.

It is not simply semantics.

ben_w · on May 13, 2022

Your left is my right, and with you definition “get laid” is a task from the point of view of evolution and a goal from the point of view of an organism.

It’s in much the same vein that it doesn’t matter if submarines “swim”, they still move through water under their own power; and it doesn’t matter if your definition of “sound” is the subjective experience or the pressure waves, a tree falling in a forest with nobody around to hear it will still make the air move.

If AI do or don’t have any subjective experience comparable to “consciousness” or “desire” is also useful to know, and in the absence of a dualistic soul it must in principle be as possible for a machine as for a human (“neither has that” is a logically acceptable answer), but I don’t even know if philosophy is advanced enough to suggest an actionable test for that at this point.

(That said, AI research does use the term “goal” for things the researchers want their AI to do. Domain specific use of words isn’t necessarily what outsiders want or expect the words to mean, as e.g. I frequently find when trying to ask physics questions).

brador · on May 13, 2022

Tasks are assigned, Goals are desired.

These definitions and their distinction are particular and important in AI. The mistaken usage of these terms by machine learning experts does not change their global definition.

> Your left is my right, and with you definition “get laid” is a task from the point of view of evolution and a goal from the point of view of an organism.

Get laid is a task, not a goal. Reproduction is a task, not a goal. The goal is pleasure.

ben_w · on May 13, 2022

> The mistaken usage of these terms by machine learning experts does not change their global definition.

Ah, I see you’re a linguistic prescriptivist.

I can’t see your definition in any dictionary, which spoils the effect, but it’s common enough to be one.

> The goal is pleasure.

Evolution is the form of intelligence that created biological neural networks, and simulated evolution is sometimes used to set weights on artificial neural nets.

From evolution’s perspective, if you can excuse the anthropomorphisation, reproduction is the goal. Evolution doesn’t care if we are having fun, and once animals (including humans) pass reproductive age, we go wrong in all kinds of different and unpleasant ways.

cscurmudgeon · on May 13, 2022

If the universe is "logically meaningless", is your comment (which happily lives inside the universe) true or false?

roywiggins · on May 12, 2022

I'm not sure it matters if a paperclip maximizer has a goal or just acts like it does.

croddin · on May 12, 2022

I think of it as System 1 vs System 2 thinking from 'Thinking, Fast and Slow' by Daniel Kahneman.[1]

Deep learning is very good at things we can do without thinking, and is in some cases superhuman in those tasks because it can train on so much more data. If you look at the list of tasks in System 1 vs System 2, SOTA Deep learning can do almost everything in System 1 at human or superhuman levels, but not as many in System 2 (although some tasks in System 2 are somewhat ill-defined), System 2 builds on system 1. Sometimes superhuman abilities in System 1 will seem like System 2. (A chess master can beat a noob without thinking while the noob might be thinking really hard. Also GPT-3 probably knows 2+2=4 from training data but not 17 * 24, although maybe with more training data it would be able to do math with more digits 'without thinking' ).

System 1 is basically solved, but System 2 is not. System 2 could be close behind System 2 by building on System 1 but it isn't clear how long that will take.

[1]. https://en.wikipedia.org/wiki/Thinking,_Fast_and_Slow#Summar...

DavidSJ · on May 13, 2022

System 2 could be close behind System 2 by building on System 1 but it isn't clear how long that will take.

This is happening since a few months ago:

Wei et al (2022). Chain of Thought Prompting Elicits Reasoning in Large Language Models. https://arxiv.org/abs/2201.11903

mdp2021 · on May 13, 2022

> a series of short sentences that mimic the reasoning process a person might have when responding to a question

Worried about the choice of the word 'mimic' - which as usual seems to retain the usual distance from foundational considerations.

Edit: nonetheless, the results of "Chain of Thought Prompting Elicits Reasoning in Large Language Models" are staggering and do seem, at least in appearance, go towards the foundationals.

sdenton4 · on May 13, 2022

In biological history, system two was an afterthought at best. It likely didn't exist before spoken language, and possibly barely before written language. And to the extent that system two exists, it's running on hardware almost entirely optimized for system one thinking.

sinenomine · on May 12, 2022

It remains to be asked, just why this causal, counterfactual, logical reasoning cannot emerge in a sufficiently scaled-up model trained on a sufficiently diverse real world data?

As far as we see, the https://www.gwern.net/Scaling-hypothesis continues to hold, and critics have to move their goalposts every year or two.

mxkopy · on May 12, 2022

Neural networks, at the end of the day, are still advanced forms of data compression. Since they are Turing-complete it is true that given enough data they can learn anything, but only if there is data for it. We haven't solved the problem of reasoning without data, i.e. without learning. The neural network can't, given some new problem that has never appeared in the dataset, in a deterministic way, solve that problem (even given pretrained weights and whatnot). I do think we're pretty close but we haven't come up with the right way of framing the question and combining the tools we have. But I do think the tools are there (optimizing over the space of programs is possible, learning a symbol-space is possible, however symbolic representation is not rigorous or applicable right now)

sinenomine · on May 12, 2022

I do think we underestimate compressionism[1] especially in the practically achievable limit.

Sequence prediction is closely related to optimal compression, and both basically require the system to model the ever wider context of the "data generation process" in ever finer detail. In the limit this process has to start computing some close enough approximation of the largest data-generating domains known to us - history, societies and persons, discourse and ideas, perhaps even some shadow of our physical reality.

In the practical limit it should boil down to exquisite modeling of the person prompting the AI to do X given the minimum amount of data possible. Perhaps even that X you had in mind when you wrote your comment.

1. http://ceur-ws.org/Vol-1419/paper0045.pdf

Jack000 · on May 12, 2022

data isn't necessarily a problem for training agents. A sufficiently complex, stochastic environment is effectively a data generator - eg. alphago zero

viksit · on May 12, 2022

Good point. This gets us into the territory of not just "explainable" models, but also the ability to feed into those models "states" in a deterministic way. This is a merger of statistical and symbolic methods in my mind -- and no way for us to achieve this today.

sinenomine · on May 12, 2022

Why shouldn't we be able to just prompt for it, if our system models natural language well enough?

...

And anyway, this problem of structured knowledge IO has been more or less solved recently: https://arxiv.org/abs/2110.07178

walleeee · on May 12, 2022

> it happens when autonomous systems optimizing reward functions fail because of problems described above in (a) -- the inability to have deterministic rules baked into them to avoid global fail states in order to achieve local success states.

yes, and there is an insight here that I think is often missed in the popular framing of AI x-risk: the autonomous systems we have today (which, defined broadly, need not be entirely or even mostly digital) are just as vulnerable to this

the AGI likely to pose extinction risk in the near term has humans in the loop

less likely to look like Clippy, more likely to look like a catastrophic absence of alignment between loci of agency (social, legal, technical, corporate, political, etc)

throwaway6532 · on May 13, 2022

>In my mind, it happens when autonomous systems optimizing reward functions to "stay alive" (by ordering fuel, making payments, investments etc) fail because of problems described above in (a) -- the inability to have deterministic rules baked into them to avoid global fail states in order to achieve local success states. (Eg, autonomous power plant increases output to solve for energy needs -> autonomous dam messes up something structural -> cascade effect into large swathes of arable land and homes destroyed).

And for this to develop in machines, machines would have to be subject to many mistakes along the way leading to all kinds of outcomes that we hold humans accountable for by fining them, sending them to jail, some of them dying etc. I think that would be so wholly unpalatable to man kind they'd cut that experiment short before it ever reached any sort of scale.

I agree with your conclusion that enough of the rules can't be encoded by us as we don't even know them and for machines to acquire them the traditional way is, I believe, fundamentally disagreeable to humans.

Abishek_Muthian · on May 13, 2022

> A way to build second/third order reasoning systems...

I've been pondering about this problem for a while now[1], Could we build a collective intelligence through community submitted recipes for second-order decisions for various common activities via generalized schema?

I didn't think about addressing this for AI, But as an aid for us in multi-order thinking. But now that you mention it to be a barrier for AGI, It does make sense.

[1] 'Plan Second-Order, Third-Order consequences': https://needgap.com/problems/263

Gatsky · on May 13, 2022

The concern over AI safety seems about right. The unique thing is that anybody cares at all about civilisational negative externalities in a functional and reasonable way. This is quite rare, unprecedented even. Typically humanity just rushes into new endeavours with little concern and leaves future generations to pick up the pieces afterwards (social media, colonialism, processed food, nuclear weapons etc).

IIAOPSW · on May 13, 2022

>2) Deterministic reasoning towards outcomes. Most statistical models rely on "predicting" outputs, but I've seen very little work where the "end state" is coded into a model. Eg: a chatbot knowing that the right answer is "ordering a part from amazon" and guiding users towards it, and knowing how well its progressing to generate relevant outputs.

Here's a real generalist question? What's the point of conversation? In the sense that its a "social game", what are the strategy sets on a given turn, and what does it even mean to "win"? Forget about Artificial Bullshiters vs Artificial Idea-what-to-say. How can we even speculate if ABS or "real AI" solves the problem better when we don't really specify the problem space or how to recognize a solution?

In terms of calls and appropriate responses and responses to responses and terminal states, what even is conversation at the protocol level?

nonbirithm · on May 13, 2022

Personally I'm more worried about what effects sophisticated "dumb" AI will have on human culture and expectations. It would have been hard to imagine we'd be living in this world with our attention assailed by platforms available anywhere when Jobs first held up the iPhone to the world. Just the same, I am not sure that the impending cultural effects from advancements in AI are fully understood at this moment.

For example, I wonder what expectations we will set for ourselves when we pick up a pencil once it becomes common knowledge that production-ready digital art can be created by an AI within minutes. What will people be saying about "old art" in those days? What will we think of people that deliberately choose to disavow AI art?

mdp2021 · on May 13, 2022

> worried about what effects sophisticated "dumb" AI will have on human culture and expectations

It is already happening, but it is not a new phenomenon - just the prosecution of the effects of widespread inadequate education.

The undesirable effect namely is, an increase of overvaluing cheap, thin products - a decrease in recognizing actual genius, "substance". For example, some seem to be increasingly contented with "frequentism" as if convinced that language is based on it - while it is of course the opposite - one is supposed to state what is seen, and to state "plausible associations" would have been regarded as a clear sign of malfunction. There seems to be an encouragement to take for "normal" what is in fact deficient.

Some people are brought to believe that some shallow instinct is already the goal and do not know "deep instinct", trained on judgement (active digestion, not passive intake).

The example provided fits:

> production-ready digital art

For something to be artistic it has to have foundational justifications - a mockery of art is not art. The Artist chose to draw that line under a number of evaluations that made it a good choice, but an imitator, even in a copy, used the ("frequentist", by the way note) evaluation that the other is an Artist - and there is a world of depth difference between the two.

The difference is trivial, and yet, many are already brought to confuse the shallow mockery and the deep creation.

liuliu · on May 13, 2022

For a).1). you basically argued in favor of scalability. We don't know if we are on a training set more or less than a typical human at this point. If I would guess, I think high-bandwidth networked computers can be much more efficient at gathering training set than a single human.

For a).2). you argued in favor of symbolic reasoning. My personal interpretation of that line of thinking is: it helps for us to understand complex thinking machines, but it is not necessarily the building block for the thinking machine.

In the area of AI, there are a lot of opinions, but at the end of the day, builders win the argument. So far, what builders have shown gives me optimistic hope.

mvkel · on May 14, 2022

All great points.

I'd like to offer some perspective as a layman who jumps on "imminent AGI": that's what these AI folks are trying hard to make me think.

It's like the research papers that say there's an "imminent cure to cancer"

jasfi · on May 13, 2022

For those worried about a threats for AGI, this is why such a system must be fully explainable. It's like thinking about a database with no validation at all, if you enter the wrong command you could destroy the integrity of the data. If you have constraints in place you're safer, and of course with SQL you can explain the data and inconsistencies.

My own effort, which focusing on natural language understanding: https://lxagi.com

ehsankia · on May 12, 2022

For me at least, the fear is not so much about the specifics, but more around the fact of what exponential curves look like. At any point, everything before looks basically horizontal and anything after looks vertical. In that sense, the fear is that while things seem quite behind right now, it could in an instant zoom past us before we even have the time to realize it. It is partly rooted in science fiction.

narrator · on May 13, 2022

I think the "AGI wants to kill us" meme is just to get us ready for the moment when the authorities unleash the killer robots on us because there's too many of us on the planet. "Whoops, those robots did it all by themselves because it's just inevitable that AGI was going to do that. Haven't you watched a sci-fi movie in the last 50 years?"

gmadsen · on May 13, 2022

I'm not so sure its impossible. the 40 year semantic map project Douglas Lenat: Cyc is truly astounding. I think in the next decade we will see really interesting integration of state of the art deep learning with something like Cyc

rebelos · on May 12, 2022

I've always assumed that the trouble will begin when we have a model for 'true' AGI and discover that the constraint "do not harm any human" renders it functionally inert.

kromem · on May 13, 2022

The paradoxical idea that AGI is going to still be a monkey paw following simplistic paradigms of operational goals in disastrous ways is hilarious to me every time I see it.

I increasingly wonder at what point we'll realize that humans aren't actually particularly good at what we've specialized into (there's just not much competition), and our failure to picture what 'better' looks like may be less about the impossibility of better to exist than it is the impossibility of it for humans to picture it.

I keep seeing predictions of what will be impossible for AI because it treads on our perceived human uniqueness (i.e. sure AI beat a human at chess but it'll be 25+ years before it will beat us at Go) needing to get walked back, and yet we continue to put forward a new iteration of that argument at every turn.

Maybe AI will turn out to be better at identifying what's good for humanity than humanity is. Because frankly, humanity is downright awful at that skill and has been for pretty much its entire existence.

rebelos · on May 14, 2022

I'm not sure I follow. Sentience is inherently goal-oriented. The goal of a human is to propagate genetic information. AGI will invariably have to be supplied with a goal by us or else there is literally no impetus to act.

kromem · on May 13, 2022

(Former emerging tech consultant for ~10% of Fortune 500 here)

(a) I've noticed a common trend of AI researchers looking at the tree in front of them and saying "well, this tree is not also a forest and won't be any time soon."

But there's not always awareness of what's going on in other specialized domains, so an AI vision researcher might not be intimately aware of what's currently being done in text or in "machine scientists" in biology for example.

As well, it overlooks the development of specialization of the human brain. We have some specialized structures that figured their niche out back with lizards, and others that developed much later on. And each of those specialized functions work together to give rise to 'human' intelligence.

So GPT-3 might be the equivalent of something like the Wernicke's area, and yes - on its own it's just a specialized tool. But what happens as these specialized tools start interconnecting?

Throw GPT-3 together with Dall-E 2 and the set of use cases is greater than just the sum of the parts.

This is going to continue to occur as long as specialized systems continue to improve and emerge.

And quickly we'll be moving into territory where orchestration of those connections is a niche that we'll both have data on (from human usage/selection of the specialist parts) and will in turn build meta-models to automate sub-specialized models from that data.

Deterministic reasoning seems like a niche where a GAN approach will still find a place. As long as we have a way for one specialized model to identify "are these steps leading to X" we can have other models only concerned with "generate steps predicted to lead to X."

I don't think we'll see a single model that does it all, because there's absolutely no generalized intelligence in nature that isn't built upon specialized parts anyways, and I'd be surprised if nature optimized excessively inefficiently in that progress.

Will this truly be AGI in a self-determining way? Well, it will at least get closer and closer to it with each iteration, and because of the nature of interconnected solutions, will probably have a compounding rate of growth.

In a theoretical "consciousness" sense of AGI, I think the integrated information theory is interesting, and there was a paper a few years ago about how there's not enough self-interaction of information possible in classical computing to give rise to consciousness, but we'll probably have photonics in commercial grade AI setups within two years, so as hand-wavy as the IIT theory is, the medium will be shifting towards one compatible with their view of consciousness-capable infrastructure much sooner for AI than quantum competing in general.

So I'd guess we may see AI that we're effectively unable to determine if it is "generally intelligent" or 'alive' within 10-25 years, though I will acknowledge that AI is the rare emerging tech that I've been consistently wrong about the timing on in a conservative direction (it keeps hitting benchmark improvements faster than I think it will).

(b) The notion AGI will have it out for us is one of the dumbest stances and my personal pet peeves out there, arguably ranked along with the hubris of "a computer will never be able to capture the je ne sais quoi of humanity."

The hands down largest market segment for AI is going to be personalization, from outsourcing our work to a digital twin of ourselves to content curation specific to our own interests and past interactions.

Within a decade, no one is going to give the slightest bit of a crap about interactions with other humans in a Metaverse over interacting with AIs convincingly human enough but with the key difference of actually listening to our BS rather than just waiting for their turn to talk.

There's a decent chance we're even going to see a sizable market for feeding social media data of deceased loved ones and pets into AI to make twins available in such settings (and Microsoft already holds a patent on that).

So do we really think humans are so repugnant that the AI which will eventually reach general intelligence within the context of replicating itself as ourselves, as our closet friends and confidants, as our deceased loved ones - will suddenly decide to wipe us out? And for what gains? What is AI going to selfishly care about land ownership and utilization for?

No. Even if some evolved AGI somehow has access to DARPA killer drones and Musk's Terminator robots and Boston Dynamics' creepy dogs, I would suspect a much likelier target would be specific individuals responsible for mass human suffering the AIs will be exposed to (pedophiles, drug kingpins, tyrants) than it is grandma and little Timmy.

We're designing AI to mirror us. The same way some of the current thinking of how empathy arises in humans is from our mirror neurons and the ability to put ourselves in the shoes of another, I'm deeply skeptical of the notion that AI which we are going to be intimately having step into human shoes will become some alien psychopath.