Code is also hard. You have to generate code that accounts for all possible exceptions or errors. If you want to automate a UI, for example, pushing a button can cause all sorts of feedback, errors, and consequences that need to be known to write the code.
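As a sketch of what "accounting for all possible feedback" means in practice, here is a minimal retry-and-verify wrapper; `click_fn` and `verify_fn` are hypothetical stand-ins for whatever UI framework is actually driving the button.

```python
import time

def click_with_retry(click_fn, verify_fn, retries=3, delay=0.1):
    """Attempt a UI action, check the UI reached the expected state,
    and retry when the action raises or the check fails."""
    last_err = None
    for _ in range(retries):
        try:
            click_fn()
            if verify_fn():           # did the UI end up where we expect?
                return True
            last_err = RuntimeError("verification failed")
        except Exception as e:        # timeouts, stale elements, surprise dialogs
            last_err = e
        time.sleep(delay)
    raise last_err
```

Generated code that omits this kind of scaffolding works in the demo and falls over on the first unexpected dialog.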
There's no Taiwanese silicon industrial complex, there's TSMC. The rest of Taiwanese fabs are irrelevant. Intel is the clear #3 (and looks likely-ish to overtake Samsung? We'll see).
glad I held out. The M4 is going to put downward pressure on prices across all previous generations.
edit: nvm, AMD is coming out with twice the performance of the M4 in two months or less. If the M2s become super cheap I will consider one, but the M4 came far too late. There are just way better alternatives now and very soon.
> AMD is coming out with twice the performance of M4 in two months or less
M4 Pro/Max/Ultra variants with double+ the performance from just scaling cores are probably also going to be announced at WWDC in a month, when they also announce their AI roadmap.
this is quite worrying for OpenAI, given the rate at which token prices have been plummeting thanks to Meta, and it's going to have to keep cutting its prices while capex remains flat. Whatever Sam says in interviews, just think the opposite and the whole picture comes together.
It's almost a mathematical certainty that people who invested in OpenAI will need to reincarnate in multiple universes to ever see that money again, but no bother; many are probably NVIDIA stockholders, which evens out the damage.
Well, Pareto is about optimisation, not either/or. I want a model that’s smart enough, while also being locally-executable.
I don’t know whether/when we’ll get there, and whether it will be improvements in models, or underlying model technology, or GPU/TPUs with larger memory at a consumer price point, or something else, that will deliver it.
That's just the middle of the Pareto frontier. Some people want one corner, others the other corner. It's like compression. You can have light compression and high speed, vice-versa, or a balance. You want balance. I want one of the corners.
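The compression analogy is easy to make concrete with zlib's level knob: one corner trades ratio for speed, the other the reverse. The data here is just repetitive filler for illustration.

```python
import zlib

data = b"the quick brown fox jumps over the lazy dog " * 1000

fast = zlib.compress(data, level=1)   # corner: prioritize speed
small = zlib.compress(data, level=9)  # corner: prioritize ratio

# Both corners are lossless; they only differ in where they sit
# on the speed/ratio frontier.
assert zlib.decompress(fast) == data
assert zlib.decompress(small) == data
assert len(small) <= len(fast) < len(data)
```

Everything between `level=1` and `level=9` is the middle of that frontier; which point is "best" depends entirely on the user, which is the poster's point about local versus hosted models.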
I agree with you somewhat. You are correct, unless they have a much better GPT model that they have not released for whatever reason. They are a year ahead of competitors, and GPT-4 is pretty old now. I find it hard to believe they don't have much more capable models now. We will see, though.
The polish of OpenAI's releases has been quite mature since GPT-4, or even 3.5.
They are no doubt sitting on ultra-polished stuff. Being the tip of the arrow, the cutting edge itself, might not be as efficient, but it sure shows you things you can't unsee.
When OpenAI can launch a video model a day after a competitor's announcement because it's ready to go, I get less and less skeptical every time they ship, because the quality of the first version isn't sliding backwards, even in new areas like video.
Maybe releasing it is strategic, or releasing it also requires supporting it infrastructure-wise, and then some. That might be a challenge.
My feeling is the next model, or an in-between release, may have massive efficiency and performance improvements without having to resort to brute-forcing it.
Meanwhile others who are following what OpenAI has done seem to be able to optimize it and make it more efficient whether it’s open source or otherwise.
Both are doing important work and I'm not sure I want to see it as a one winner take all game.
The way AI vendors respond suddenly to one another's launches feels like they are always ready to launch, continually adding functionality to something that could already ship.
It reminds me of when Microsoft spent a billion dollars advertising that Bing had a billion pages indexed. Google stayed quiet. Then, when Microsoft's money was spent, Google simply added a zero or two to its search page, back when it used to list how many pages it had indexed. They were just sitting on it, already done, and announced it when it was to their benefit.
Also, what will the effect of open models be on the LLM provider industry? What effect will Meta’s scorched earth policy of killing markets by releasing very good open models have?
I use LLMs constantly, but no longer in a commercial environment (I am retired except for writing books, performing personal research projects, and small consulting tasks). I now usually turn first to local models for most things: ellama+Emacs is good enough for me to mostly stop using GPT-4+Emacs or GitHub Copilot, and the latest open 7B, 8B, and 30B models running on my Mac seem sufficient for most of the NLP and data manipulation things I do.
However, it is also fantastic to have long context Gemini, OpenAI APIs, Claude, etc. available when needed or just to experiment with.
GPT-4 is not a single model. The GPT-4 that was initially released a year ago is way worse in benchmarks than the newest versions of it, and the original version has been beaten by quite a few other models by this point.
The newest version of GPT-4 is probably still overall the best model currently, but it is only a few months old, and the picture depends a lot on what benchmarks you are looking at.
E.g. for what we are doing at our company (document processing, etc.) Claude-3 Opus and Gemini-1.5 Pro are currently the better models. The newest GPT-4 even performed worse than a previous version.
So to me it definitely seems like the gap is getting smaller. Of course, OpenAI could come out with GPT-5 next week, and it could be vastly better than all other current models.
There's wide speculation that what will be branded as either GPT-4.5 or GPT-5 has finished pretraining now and is undergoing internal testing for a fairly near-term release.
My speculation is that internally they have much stronger models, like Q*, but they are unable to release them to the public even if they want to, for lack of compute, for safety, and for other reasons they probably see…
> My speculation is that internally they have much stronger models like Q*
People used to speculate the same about Google. Everyone hypes up their “secret, too powerful to release” models. Remember the dude who was convinced that there was a sentient AI in the machine? The light of actual public release tends to expose a lot of the hype.
That would be a reasonable assumption if OpenAI did not already have an established track record of repeatedly re-defining our fundamental expectations of what technology can do.
GPT-4 was already completed and secretly being tested on Bing users in India in mid-2022 (there were even Microsoft forum posts asking about the funny chatbot). Even after heavy quantization and the alignment tax GPT-4 is still the bar to beat. It's been two years and their funding has increased over 10x since then.
Short of a fundamental Hard Problem that they cannot overcome, their internal bleeding edge models can reasonably be assumed to possess significantly greater capabilities.
Honestly I'm pretty puzzled by this mystical fog that hangs over OpenAI's skunkworks projects - don't people leave for other jobs, go to conferences, etc.?
I'm surprised that nobody can tell what they in fact do or do not have.
With hundreds of billions on the line for the founders, and a whole lot of likely unvested stock options for the employees, it doesn't seem like anyone wants to open up about what's actually going on day to day.
I'm not saying Claude 3 and Gemini are better than GPT-4 in every aspect, but those two models can at least perform addition on arbitrarily long numbers, while GPT-4 struggles.
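For contrast, this is exactly the kind of task where ordinary code is trivially reliable: Python integers are arbitrary precision, so a deterministic checker for long-number addition (the sort of thing you would grade an LLM's answer against) is a few lines. The 201-digit length is an arbitrary choice for illustration.

```python
import random

# Deterministic reference for the task LLMs get graded on here:
# adding very long numbers. Python ints are exact at any length.
random.seed(0)
a = random.randrange(10**200, 10**201)   # a 201-digit number
b = random.randrange(10**200, 10**201)
total = a + b

assert total - b == a                    # exact arithmetic, no rounding
assert len(str(total)) in (201, 202)     # sum of two 201-digit numbers
```

That a classical one-liner is a perfect oracle is precisely why this benchmark is so unflattering for models that approximate digit-by-digit carrying.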
Everyone can say anything about open-source models, but they're comparing themselves to what OpenAI released a year ago. OpenAI hasn't shown all of its cards yet, and it already has a decent moat in place; some say they have no moat, but I disagree: they have one of the best moats possible, which is brand awareness.
Sora on its own could bring in billions in revenue; an open-source Sora will take at least another year, if not two, to come out, and then more time until it can run on commodity hardware. An open-source model that only runs on a dedicated H100 is actually less useful than a closed model behind an API call. Not to detract from open source: I think it's the way to go, but I'm just being pragmatic and realistic. There's a reason why MS Office is still the top productivity app in the world, even though dozens of open-source alternatives exist.
> they have one of the best moats possible which is brand awareness.
Do they though?
If you talk to "regular people", everybody knows ChatGPT, but nobody knows or cares about OpenAI. And most of them don't even really know that name. They call it ChatUuuuhm, ChatThingy, Chad Gippity, or similar.
I think they will just switch when something better comes along.
MS had yet to fully stabilize that lead a full decade after they had won the OS platform standard for IBM-compatible PCs. A platform-standard moat goes way, way beyond a brand advantage.
Azure, while significant, has no similar monopoly to support OpenAI. Do you really see a structural advantage for OpenAI beyond the Microsoft products integrating it?
a) A year after GPT-4 set the bar, it's still the best model, despite everyone else not having to do it first: just copy, and it's just software. And that's not for lack of trying by every other viable player on the planet, with unprecedented acceleration.
Imagine any other piece of software where the incumbent has a mere 2-3 year head start, during which they had to work out the entire product, and where everyone else, despite only having to copy and pressing the pedal through the floor, is still struggling to catch up.
b) The current models, including GPT-4, are still so bad. A few billions can be made just by continuing to play this game of improvements for a few years and getting better each year. I think people are wildly confused about how big this market is going to be when that happens. They are not squeezing hosting or compute; they are squeezing intelligence. Intelligence is the entire economy. The notion that there would ever not be room for multiple players here, whether through size, specialisation, or cost (as with all other intelligence), and that a few billion dollars are a big deal, is so strange to me.
c) The game will, at some point, be mostly about infra and optimization. People conclude that's a problem for the incumbents, when our entire industry is mostly about infra and optimization. AWS is infra and optimization. I think even the average HN tinkerer understands that therein lies a proposition that's not exactly equivalent to "just rent a few servers and do it yourself".
> A year after GPT-4 set the bar, it's still the best model
Debatable. Many people find Claude Opus superior, and I know I've found it consistently better for challenging coding questions. More importantly, the delta between GPT-4 and everything else is getting smaller and smaller. Llama 3 is basically interchangeable with GPT-4 for a huge number of tasks, despite its smaller size.
Many more do not, according to the LMSYS leaderboard.
> Llama 3 is basically interchangeable with GPT-4 for a huge number of tasks
Sure. I am sure the number approaches infinity, if you are willing to let the model inform the task. That's usually not what most people are looking for in a tool.
While they still call it GPT-4, the ones topping the rankings are newer iterations of it, despite retaining the same name. The latest one is from 2024-04-09. Sure, that one probably finished training a few months ago, but it is by no means a two-year head start.
Even more important, you know that GPT-4 will probably also not crack it, which is why the SOTA is not terribly interesting. The delta between GPT-4 and the competition has been closing, but why anyone would assume that this is a trend, and that it will continue from GPT-4.5 to the competition, or GPT-5 to the competition, instead of the other way around, is a mystery to me.
I am not saying it could not be true. But extrapolating from differences between current bad models to a future with better models is weird, especially when everyone seems to pretty much agree that scale is the difference between the two, and scale is hard and exclusive.
There’s a scatterplot that’s been circulating on Twitter. The trend lines show that since the time of GPT-2, open weights models have improved at a steeper rate than proprietary models, with the two on a path to intersect.
I would argue that's to be expected after the first generally accepted POC (GPT-3.5) was released, an entire industry was created with it, and other companies actually started copying and competing in a big way.
It seems a stretch to read this as a continuing trend, when (from what I gather everyone agrees on) the way to better models seems to be ever more efficient handling of ever larger amounts of money, compute and data, with no reasonable limits in sight on any of the three.
Scaling up LLMs is only going to go so far, and it will yield diminishing marginal returns on all of that money, compute, and data. It’s a regime of exponential increases in inputs for linear gains in the outputs - barring some technological breakthroughs which could come from anywhere, not just from OpenAI.
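A back-of-the-envelope way to see the "exponential inputs, linear outputs" claim: if loss follows a power law in compute, the shape reported in LLM scaling-law papers, then each 10x of compute buys only a fixed ratio and a shrinking absolute improvement. The constants below are invented purely for illustration.

```python
# Illustrative only: loss as a power law in compute, L(C) = a * C**-alpha.
# The constants are made up; treat this as a sketch of the shape, not data.
def loss(compute, a=10.0, alpha=0.05):
    return a * compute ** -alpha

l1 = loss(1e21)
l2 = loss(1e22)   # 10x more compute...
l3 = loss(1e23)   # ...and 10x more again

# Each 10x of compute buys the same multiplicative loss reduction,
# so the absolute improvement shrinks every time you pay the 10x.
assert abs(l2 / l1 - l3 / l2) < 1e-9
assert (l1 - l2) > (l2 - l3) > 0
```

Under this shape, ten-folding your inputs forever yields ever-smaller gains, unless a breakthrough changes the exponent, which, as the comment says, could come from anywhere.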
Nothing wrong with gambling (i.e. wagering real resources against an uncertain outcome), the entire financial market / all of capitalism is a chaotic orchestra of millions of gambles.
the statistics are not in its favour