I think this is a classic case of us overestimating the immediate impact and und...

uxcolumbo · on April 7, 2023

I'm not in ML, so excuse this maybe naive question:

> get hundreds of LLMs supervising teams of millions of LLMs

What does this mean or what can you do with this setup… do you mean running LLMs in parallel?

jacquesm · on April 7, 2023

Yes, that's called 'ensembling'. There is a lot of work being done on this kind of solution. One way it could work is that you use multiple models that have been fine-tuned for various problems and then use the answer that returns the highest confidence.
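
A minimal sketch of that highest-confidence selection, for illustration only. The `Model` interface and the two stand-in models are hypothetical; real fine-tuned models would call an actual LLM, and getting a trustworthy confidence score out of one is its own problem.

```python
from typing import Callable, List, Tuple

# Hypothetical interface: a model takes a prompt and returns
# (answer, self-reported confidence in [0, 1]).
Model = Callable[[str], Tuple[str, float]]


def pick_most_confident(models: List[Model], prompt: str) -> Tuple[str, float]:
    """Query every model and keep the answer with the highest confidence."""
    if not models:
        raise ValueError("need at least one model")
    results = [model(prompt) for model in models]
    return max(results, key=lambda result: result[1])


# Stand-ins for models fine-tuned on different problems (assumptions).
def math_model(prompt: str) -> Tuple[str, float]:
    return ("4", 0.9) if "2 + 2" in prompt else ("not sure", 0.1)


def trivia_model(prompt: str) -> Tuple[str, float]:
    return ("Paris", 0.8) if "capital of France" in prompt else ("not sure", 0.2)


best = pick_most_confident([math_model, trivia_model], "What is 2 + 2?")
```

Here `best` comes from `math_model`, because it reports the higher confidence for this prompt.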

jazzyjackson · on April 7, 2023

You can also have adversarial generation, where models given different expertise and attitudes go back and forth criticizing each other's work.

namaria · on April 7, 2023

Sounds like the 'dead internet' is just around the corner!

sdenton4 · on April 7, 2023

Ask the LLM to perform a complex task by splitting it into subtasks to be performed by other LLM instances, then integrate the results...
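
Roughly, as a sketch: the `LLM` type, `run_with_subtasks`, and the three fake models below are all made up for illustration, and a real setup would replace them with actual model calls and much more careful prompts.

```python
from typing import Callable, List

# Hypothetical interface: prompt in, text out.
LLM = Callable[[str], str]


def run_with_subtasks(planner: LLM, worker: LLM, integrator: LLM, task: str) -> str:
    """Plan subtasks, run each on a worker instance, then merge the results."""
    plan = planner(f"Split this task into subtasks, one per line:\n{task}")
    subtasks: List[str] = [line.strip() for line in plan.splitlines() if line.strip()]
    partials = [worker(subtask) for subtask in subtasks]
    return integrator("\n".join(partials))


# Fake models so the sketch runs without any API (assumptions).
def fake_planner(prompt: str) -> str:
    return "outline\ndraft\nedit"


def fake_worker(prompt: str) -> str:
    return f"[{prompt}]"


def fake_integrator(prompt: str) -> str:
    return " + ".join(prompt.splitlines())


result = run_with_subtasks(fake_planner, fake_worker, fake_integrator, "write a report")
```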

shmoogy · on April 7, 2023

Is this something like what langchain is working towards?

arthurcolle · on April 7, 2023

AutoGPT, BabyAGI

SanderNL · on April 7, 2023

We call that “companies”. We just need to apply what we learned in business school to a different set of workers, slightly deficient workers.

Eisenstein · on April 7, 2023

> We just need to apply what we learned in business school

Please don't. You've already ruined enough industries. Let the MBAs do finance and Wall Street and leave them out of the chain of command in organizations that make things.

brookst · on April 7, 2023

Every time you go to the store and find that the store is still in business and there is food on the shelf, it is because someone went to business school and knows how to optimize demand estimation, pricing, and logistics.

Yes, some MBAs fuck things up. Just like some CS grads fuck things up. But advocating against the study of business is just as naive as advocating against the study of computer science just because there are some bad CS grads.

Eisenstein · on April 7, 2023

> Every time you go to the store and find that the store is still in business and there is food on the shelf, it is because someone went to business school

Are you contending that businesses were not successful before Wharton started pumping out MBAs?

> But advocating against the study of business is just as naive as advocating against the study of computer science

I didn't say 'don't study business', I said 'stick to finance'. MBAs tend to end up destroying innovation and productivity for short term growth and stats.

Jack Welch showed what a successfully motivated 'business oriented' leader can do to an innovative and productive legacy organization when given complete control over it. The MBAs happen to just do it on a smaller scale.

tharne · on April 7, 2023

> advocating against the study of business is just as naive as advocating against the study of computer science just because there are some bad CS grads.

Criticizing garbage MBA programs is not criticizing the study of business. Business schools don't study business. They're a place where people make a lot of money selling theories about business that are useless at best and, in many places, quite harmful. Learning about business by going to business school is like learning to kiss by reading books about kissing.

alexvoda · on April 7, 2023

That is a great analogy.

I would say that just as every person is unique so is every company unique. And just as there is plenty of pseudoscience plaguing psychology so are MBAs full of pseudoscience. Two fields that are far too obsessed with generalising their advice. Which is not to say that there aren't any useful ideas in these fields. But the vitriolic reaction above is warranted.

ResearchCode · on April 7, 2023

Stores existed before the MBA, but MBAs could be why food prices are up 30% since last year.

SanderNL · on April 7, 2023

Take shelter under my protective wings, O, sweet summer children.

mach1ne · on April 7, 2023

>Eventually, someone will figure out how to get hundreds of LLMs supervising teams of millions of LLMs to do some really wild stuff that is currently completely impossible.

This is an intuitive direction. In fact, it’s so intuitive that it’s a little bit odd that nobody seems to have made proper progress with LLM swarm computation.

jrumbut · on April 7, 2023

I've read about people doing it, I haven't read about people achieving anything particularly interesting with it.

It's early days. There will be a GPT 5 I'm sure, maybe that one will be better at teamwork.

Bjartr · on April 7, 2023

This sounds like that old economics joke that says it's impossible to find $20 on the ground, because if it had been there, someone would have already picked it up.

contemplatter · on April 7, 2023

In particular, it's odd that the greatest software developer in the world (ChatGPT) hasn't made progress with LLM swarm computation.

loandbehold · on April 7, 2023

How is "LLM swarm computation" different than a single bigger LLM?

SanderNL · on April 7, 2023

The same reason why you don't let Mr Musk do all the work. He can't.

One LLM is limited, one obvious limitation is its context window. Using a swarm of LLMs that each do a little task can alleviate that.

We do it too and it's called delegation.

Edit: BTW, "swarm" is meaningless with LLMs. It can be the same instance, but prompted differently each time.
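
A toy sketch of that delegation, assuming a hypothetical `LLM` callable and a word count as a crude stand-in for the context limit. It reuses the same instance for every chunk and for the final combine step, just prompted differently each time.

```python
from typing import Callable, List

# Hypothetical interface: prompt in, text out.
LLM = Callable[[str], str]


def chunk_words(text: str, max_words: int) -> List[str]:
    """Split text into pieces of at most max_words words."""
    if max_words <= 0:
        raise ValueError("max_words must be positive")
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def delegate(llm: LLM, text: str, max_words: int) -> str:
    """Summarize each chunk separately, then ask the same model to combine them."""
    partials = [llm(f"Summarize: {chunk}") for chunk in chunk_words(text, max_words)]
    return llm("Combine: " + " ".join(partials))


# Fake model (assumption): "summarizes" a chunk by keeping its first word.
def fake_llm(prompt: str) -> str:
    body = prompt.split(": ", 1)[1]
    return body.split()[0] if prompt.startswith("Summarize") else body


summary = delegate(fake_llm, "a b c d e f g", max_words=3)
```

No single call ever sees more than `max_words` words of the input, which is the point of delegating.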

ParetoOptimal · on April 7, 2023

> The same reason why you don't let Mr Musk do all the work. He can't.

Better to limit his incompetence to one position.

akiselev · on April 7, 2023

I beg to differ. Imagine him taking down Twitter, Facebook, Instagram, and all the others in one fell swoop!

int_19h · on April 7, 2023

Context window is a limitation, but have we actually hit the ceiling wrt scaling that? For GPT, you need O(N^2) VRAM to handle larger context sizes, but that is an "I need more hardware" problem ultimately; as I understand it, the reason they don't go higher is the economic viability of it, not because it couldn't be done in principle. And there are many interesting hardware developments in the pipeline now that the engineers know exactly what kind of compute they can narrowly optimize for.

So, perhaps, there aren't swarms yet just because there are easier ways to scale for now?
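
For a rough sense of the O(N^2) part, here is back-of-the-envelope arithmetic for a naive attention score matrix, which is N x N per head. The head count and 2-byte values are assumptions picked for illustration, not the figures of any particular GPT model, and optimized attention kernels can avoid materializing the full matrix.

```python
def attention_scores_bytes(n_tokens: int, n_heads: int = 32, bytes_per_value: int = 2) -> int:
    """Bytes for one layer's naive attention scores: n_heads matrices of n_tokens x n_tokens."""
    return n_heads * n_tokens * n_tokens * bytes_per_value


# Doubling the context length quadruples the memory for the score matrices.
for n in (2048, 4096, 8192):
    mib = attention_scores_bytes(n) / 2**20
    print(f"{n} tokens: {mib:.0f} MiB per layer")
```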

SanderNL · on April 8, 2023

I am sure the context window can go up, maybe into the MB range. But I still see delegation as a necessary part of the solution.

For the same reason one genius human does not suddenly need less support staff, they actually need more.

Edit: and why it isn’t here yet is because it’s new and hard.

staunton · on April 7, 2023

It's easy to distribute across many computers which communicate with high latency

alexvoda · on April 7, 2023

LLMs are already running distributed on swarms of computers. A swarm of swarms is just a bigger swarm.

So again, what is the actual difference you are imagining?

Or is it just that distributed X is fashionable?

pixl97 · on April 7, 2023

Rather large parts of your brain are more generalized, but in particular places we have more specialized areas. Looking at it, you would most likely consider it all the same brain, but from a systems-thinking view, each specialized area is a small separate brain with a slightly different task than the rest of the brain.

If 80% of the processors in a cluster are running 'general LLM' and 20% are running 'math LLM' are they the same cluster? Could you host the cluster in a different data center? What if you want to test different math LLM modules out with the general intelligence?

alexvoda · on April 7, 2023

I think I would consider them split when the different modules are interchangeable so there is de facto an interface.

In the case of the brain, while certain functional regions are highly specialized I would not consider them "a small separate brain". Functional regions are not sub-organs.

staunton · on April 7, 2023

Significantly higher latency than you have within a single datacenter. Think "my GPU working with your GPU".

alexvoda · on April 7, 2023

There are already LLMs hosted across the internet (Folding@Home style) instead of in a single data center.

Just because the swarm infrastructure hosting an LLM has higher latency across certain paths does not make it a swarm of LLMs.

staunton · on April 7, 2023

> There are already LLMs hosted across the internet (Folding@Home style)

Interesting, I haven't heard of that. Can you name examples?

alexvoda · on April 7, 2023

I read about Petals (1) some time ago here on HN. There are surely others too, but I don't remember the names.

1. https://github.com/bigscience-workshop/petals