I think this is a classic case of us overestimating the immediate impact and underestimating the long term impact.
Right now, they are definitely useful time savers, but they need a lot of handholding. Eventually, someone will figure out how to get hundreds of LLMs supervising teams of millions of LLMs to do some really wild stuff that is currently completely impossible.
You could spin up a giant staff the way we do servers now. There has to be a world changing application of that.
Yes, that's called 'ensembling'. There is a lot of work being done on this kind of solution. One way in which it could work is that you can use multiple models that have been fine tuned for various problems and then use the answer that returns the highest confidence.
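To make the "highest confidence wins" idea concrete, here is a minimal sketch. Everything in it is an assumption for illustration: the `Answer` type, the `Model` signature, and the two toy models are hypothetical, and it assumes each model reports a confidence score that is comparable across models, which real systems often don't give you for free.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class Answer:
    text: str
    confidence: float  # assumed to lie in [0, 1] and be comparable across models


# Hypothetical stand-in for a fine-tuned model: prompt in, Answer out.
Model = Callable[[str], Answer]


def pick_most_confident(models: Sequence[Model], prompt: str) -> Answer:
    """Ask every model and keep the answer with the highest reported confidence."""
    if not models:
        raise ValueError("need at least one model")
    answers = [model(prompt) for model in models]
    return max(answers, key=lambda a: a.confidence)


# Toy models with hard-coded outputs, purely for illustration.
def math_model(prompt: str) -> Answer:
    return Answer(text="4", confidence=0.9)


def general_model(prompt: str) -> Answer:
    return Answer(text="probably 4", confidence=0.6)


best = pick_most_confident([math_model, general_model], "What is 2 + 2?")
print(best.text)
```

The weak point is the confidence score itself: if the models are not calibrated against each other, the loudest model wins rather than the most accurate one.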
> We just need to apply what we learned in business school
Please don't. You've already ruined enough industries. Let the MBAs do finance and Wall Street and leave them out of the chain of command in organizations that make things.
Every time you go to the store and find that the store is still in business and there is food on the shelf, it is because someone went to business school and knows how to optimize demand estimation, pricing, and logistics.
Yes, some MBAs fuck things up. Just like some CS grads fuck things up. But advocating against the study of business is just as naive as advocating against the study of computer science just because there are some bad CS grads.
> Every time you go to the store and find that the store is still in business and there is food on the shelf, it is because someone went to business school
Are you contending that businesses were not successful before Wharton started pumping out MBAs?
> But advocating against the study of business is just as naive as advocating against the study of computer science
I didn't say 'don't study business', I said 'stick to finance'. MBAs tend to end up destroying innovation and productivity for short term growth and stats.
Jack Welch showed what a successfully motivated 'business oriented' leader can do to an innovative and productive legacy organization when given complete control over it. The MBAs happen to just do it on a smaller scale.
> advocating against the study of business is just as naive as advocating against the study of computer science just because there are some bad CS grads.
Criticizing garbage MBA programs is not criticizing the study of business. Business schools don't study business. They're a place where people make a lot of money selling theories about business that are useless at best and, in many places, quite harmful. Learning about business by going to business school is like learning to kiss by reading books about kissing.
I would say that just as every person is unique so is every company unique. And just as there is plenty of pseudoscience plaguing psychology so are MBAs full of pseudoscience. Two fields that are far too obsessed with generalising their advice. Which is not to say that there aren't any useful ideas in these fields. But the vitriolic reaction above is warranted.
> Eventually, someone will figure out how to get hundreds of LLMs supervising teams of millions of LLMs to do some really wild stuff that is currently completely impossible.
This is an intuitive direction. In fact, it’s so intuitive that it’s a little bit odd that nobody seems to have made proper progress with LLM swarm computation.
This sounds like that old economics joke that says it's impossible to find $20 on the ground, because if it had been there, someone would have already picked it up.
Context window is a limitation, but have we actually hit the ceiling wrt scaling that? For GPT-style models with standard self-attention, VRAM use grows as O(N^2) in the context length N, but that is ultimately an "I need more hardware" problem; as I understand it, the reason they don't go higher is economic viability, not that it couldn't be done in principle. And there are many interesting hardware developments in the pipeline now that the engineers know exactly what kind of compute they can narrowly optimize for.
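A back-of-the-envelope sketch of that O(N^2) growth, under loud assumptions: naive attention that materializes the full N x N score matrix for every head, 16-bit values, and a made-up head count of 32. None of these numbers describe any particular GPT model, and implementations that avoid storing the full matrix will look very different.

```python
def attention_scores_bytes(context_len: int, num_heads: int = 32,
                           bytes_per_value: int = 2) -> int:
    """Bytes needed to hold the N x N attention scores for one layer.

    Assumes the full score matrix is materialized for every head
    (num_heads and bytes_per_value are illustrative defaults).
    """
    return num_heads * context_len * context_len * bytes_per_value


for n in (2_048, 8_192, 32_768):
    gib = attention_scores_bytes(n) / 2**30
    print(f"{n:>6} tokens: {gib:6.2f} GiB per layer")
```

Under these assumptions, quadrupling the context multiplies the score-matrix memory by sixteen, which is why this looks like a cost problem rather than a hard wall.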
So, perhaps, there aren't swarms yet just because there are easier ways to scale for now?
Rather large parts of your brain are generalized, but in particular places we have more specialized areas. Looking at it from the outside, you would most likely consider it all the same brain, but from a systems-thinking view, each specialized area is a small separate brain with a slightly different task than the rest.
If 80% of the processors in a cluster are running 'general LLM' and 20% are running 'math LLM', are they the same cluster? Could you host the math part in a different data center? What if you want to test different math LLM modules out with the general intelligence?
I think I would consider them split when the different modules are interchangeable so there is de facto an interface.
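A rough sketch of what I mean by a de facto interface, with every name here (`MathModule`, `GeneralModel`, the `math:` prefix routing) invented purely for illustration:

```python
from typing import Protocol


class MathModule(Protocol):
    """The interface: anything with a matching solve() can be plugged in."""

    def solve(self, problem: str) -> str: ...


class MathModuleA:
    def solve(self, problem: str) -> str:
        return f"A:{problem}"


class MathModuleB:
    def solve(self, problem: str) -> str:
        return f"B:{problem}"


class GeneralModel:
    """Hypothetical general model that delegates math to a swappable module."""

    def __init__(self, math: MathModule) -> None:
        self.math = math

    def answer(self, prompt: str) -> str:
        # Toy routing rule: prompts starting with "math:" go to the math module.
        if prompt.startswith("math:"):
            return self.math.solve(prompt[len("math:"):].strip())
        return f"general:{prompt}"


# Swapping the module requires no change to GeneralModel.
print(GeneralModel(MathModuleA()).answer("math: 2+2"))
print(GeneralModel(MathModuleB()).answer("math: 2+2"))
```

Once you can swap A for B like this without touching the general model, I'd call them separate systems.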
In the case of the brain, while certain functional regions are highly specialized I would not consider them "a small separate brain". Functional regions are not sub-organs.