This is a thoughtful article, but I very much disagree with the author's conclusion. (I'm biased though: I'm a co-creator of OpenHands, fka OpenDevin [1])
To be a bit hyperbolic, this is like saying all SaaS companies are just "compute wrappers", and are dead because AWS and GCP can see all their data and do all the same things.
I like to say LLMs are like engines, and we're tasked with building a car. So much goes into crafting a safe, comfortable, efficient end-user experience, and all that sits outside the core competence of companies that are great at training LLMs.
And there are 1000s of different personas, use cases, and workflows to optimize for. This is not a winner-take-all space.
Furthermore, the models themselves are commoditizing quickly. They can be easily swapped out for one another, so apps built on top of LLMs aren't ever beholden to a single model provider.
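The swap-out point above can be sketched in code. Many LLM providers expose a near-identical chat-completion request shape, so an app can keep provider details in a single config table and change models with a one-line edit. This is a minimal illustration, not anyone's actual product code; the endpoint URLs and model names below are hypothetical.

```python
# Sketch: treat the LLM as a swappable component behind one config table.
# Provider endpoints and model names here are made up for illustration.

PROVIDERS = {
    "provider_a": {"base_url": "https://api.provider-a.example/v1", "model": "model-a-large"},
    "provider_b": {"base_url": "https://api.provider-b.example/v1", "model": "model-b-pro"},
}

def build_chat_request(provider: str, prompt: str) -> dict:
    """Assemble a chat-completion request for whichever provider is configured."""
    cfg = PROVIDERS[provider]
    return {
        "url": cfg["base_url"] + "/chat/completions",
        "body": {
            "model": cfg["model"],
            "messages": [{"role": "user", "content": prompt}],
        },
    }

# Swapping models is just a different key; the application code is unchanged.
req = build_chat_request("provider_b", "Summarize this diff.")
print(req["body"]["model"])  # model-b-pro
```

Because the app-side interface stays constant, being "beholden" to one provider reduces to whichever key is in the config, which is the low switching cost the comment describes.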
I'm super excited to have an ecosystem with thousands of LLM-powered apps. We're already starting to see it materialize, and I'm psyched to be part of it.
Seeing the LLM as a motor was a legitimate view until recently. But what we're starting to see with actual agentification is models taking the driver's seat, making the calls about search, tool use, and APIs. Like DeepSearch, these models are likely to be gated, not even API-accessible. It will be even more striking once we move to industry-specific training; one of the best emerging examples is models for network engineering.
The key point of my post is the strategy model providers are going to apply in the next 1-2 years. Even the title comes from an OpenAI slide. Any wrapper will have to operate in this environment.
Surely the only way they'd not be API-accessible is if they contained some new, extremely-difficult-to-replicate innovation that prevents important capabilities from being commoditised.
What reason or evidence do you see that that is (or will be) the case rather than those features simply representing a temporary lead for some models, which others will all catch up to soon enough?
Yeah, this reminds me of the breathless predictions (and despair, in some corners) that flew around shortly after the initial ChatGPT launch. “Oh, they have a lead so vast, no one could ever catch up.” “[Insert X field] is dead.” Et cetera. I didn’t buy it then, and I’m not buying it now.
Of course OpenAI and Anthropic wish they could dominate the application layer. I predicted this two years ago: that model providers would see their technology commoditized, and would turn to using their customers’ data against them, locking them out with competing in-house products. But I don’t think they will succeed, for the reasons rbren mentioned previously. Good application development requires a lot of problem-specific knowledge and work, and is not automatable.
On the point of RL: I predict this will generate some more steam to keep the investment/hype machine cranking a little longer. But the vast majority of tasks are not verifiable. Most have soft or mixed success criteria, and RL will not overcome the fundamental limitations of GenAI.
I have seen this analogy before (hence the question); apologies if it's rude. By my understanding, while the tools are important, most of these apps hit escape velocity only once the underlying models became good enough. Cursor was doing decently well until Claude 3.5 Sonnet came along, and then it took off, as did Windsurf. Perplexity and Genspark became 10x more effective with o3-mini and DeepSeek R1. Plus, switching costs are so low that people move to whichever app has the most advanced model (and the UI is very similar across apps). Do you think there is space for apps that can keep improving without improvements to the underlying models?
Yeah, but the engine/car analogy breaks down when it turns out all of the automotive engineering and customer driving data is fed back to the engine maker, so they can decide at any point to make your car, or your car but better.
> To be a bit hyperbolic, this is like saying all SaaS companies are just "compute wrappers", and are dead because AWS and GCP can see all their data and do all the same things.
isn't "we don't train on your data" one of - if not the - the primary enterprise guarantee one pays for when rolling out LLMs for engineers? i don't see a cloud analogy for that
[1] https://github.com/All-Hands-AI/OpenHands