Hi, author here. An important background is the imminent rise of actual LLM agen...

evrydayhustling · 2025-03-18T14:35:05 1742308505

Nice and provocative read! Is it fair to restate the argument as follows?

- New tech (eg: RL, cheaper inference) are enabling agentic interactions that fulfill more of the application layer.

- Foundation model companies realize this and are adapting their business models by building complementary UX and witholding API access to integrated models.

- Application layer value props will be squeezed out, disappointing a big chunk of AI investors and complementary infrastructure providers

If so, any thoughts on the following?

- If agentic performance is enabled by models specialized through RL (e.g. Deep Research's o3+browsing), why won't we get open versions of these models that application providers can use?

- Incumbent application providers can put up barriers to agentic access of the data they control. How does their data incumbency and vertical specialization weigh against the relative value of agents built by model providers?

Dorialexander · 2025-03-18T15:01:09 1742310069

Hi. Yes this is wholly correct.

On the second points:

* Well I'm very much involved in making open more models, pretrained the first model on free and open data without copyrigh issues, released the first version fo GRPO that can run on Google Colab (based on Will Brown). Yet, even then I have to be realistic: open source RL has a data issue. We don't have the action sequence data nor the recipes (emulators) that could make it possible to replicate even on a very small scale what big labs are currently working on.

* Agreed on this and I'm seeing this dynamic already in a few areas. Now it's still going to be uphill as some of the data can be bought and advanced pipelines can shortcut some of the need for it, as models can be trained directly on simulated environments.

evrydayhustling · 2025-03-18T17:57:52 1742320672

Thanks for the reply - and for the open AI work!

> We don't have the action sequence data nor the recipes (emulators) that could make it possible to replicate even on a very small scale what big labs are currently working on.

Sounds like an interesting opportunity for application-layer incumbents that want to enable OSS model advancement...

ankit219 · 2025-03-18T15:42:03 1742312523

answering the first question if i understand it correctly.

The missing piece is data obviously. With search and code, it's easier to get the data so you get such specialized products. What is likely to happen is: 1/ Many large companies work with some early design partners to develop solutions. They have the data + subject matter expertise, and the design partners bring in the skill. This way we see a new wave of RL agent startups grow. My guess is that this engagement would look different compared to a typical saas engagement. Some companies might do it inhouse, some wont because maintaining such systems is a task. 2/ These companies open source part of their dataset which can be consumed by oss devs to create better agents. This is more common in tech where a path to monopoly is to commoditize the immediately previous layer. Might play out elsewhere too, though I do not have a high degree of confidence here.

mannymanman · 2025-03-18T16:51:58 1742316718

Why will application layer value props be squeezed out? And if so, where does value accrue going forward in an RL first world?

numlocked · 2025-03-18T14:58:18 1742309898

Is this the Will Brown talk you are referencing? https://www.youtube.com/watch?v=JIsgyk0Paic

haltingproblem · 2025-03-18T15:00:18 1742310018

Thanks for linking, yes that is the one he talks about on his blog also.

npodbielski · 2025-03-18T12:56:20 1742302580

Hi, interesting article.

Since I am not in the AI industry, I think I do not understand few things:

- what is RL? Research Language?

- does it mean that in essence AI companies will switch to writing enterprise software using LLMs integrated with enterprise tools?

[EDIT] Seems like you can even ask a question on HN because 'how dare you not know something?' and gonna be downvoted.

Dorialexander · 2025-03-18T13:00:59 1742302859

Hi. So quickly:

* RL is Reinforcement Learning. Already used for a while as part of RLHF but now we have started to find a very nice combo of reasoning+RL on verifiable tasks. Core idea is that models are not just good a predicting the next token but the next right answer.

* I think anything infra with already some ML bundled is especially up for grabs but this will have a more transformative impact than your usual SaaS. Network engineering is a good example: highly formalized but also highly complex. RL models could increasingly nail that.

dcow · 2025-03-18T13:43:27 1742305407

Respectfully, when you’re responding to someone who doesn't know what RL is, and you say “it’s this—already used in [another even lesser known acronym that includes the original]…” it doesn’t really help asker (like if you know what RLHF is then you know what RL is). I’ll admit I knew what RL was already but I don’t know what RLHF is and the comment just confuses me.

What is RLHF?

diggan · 2025-03-18T13:48:38 1742305718

Am I the only one who uses a search engine while reading comment threads about industries/technologies I am not familiar with? This whole conversation is like two searches away from explaining everything (or a two minute conversation with an LLM I suppose)

open_ · 2025-03-18T13:56:46 1742306206

That makes for poor communication by increasing the friction to read someone's thoughts.

As an author, you should care about reducing friction and decreasing the cost to the reader.

duggan · 2025-03-18T14:18:50 1742307530

Some onus is on the reader to educate themselves, particular on Hacker News.

mindcrime · 2025-03-18T14:06:37 1742306797

Am I the only one who uses a search engine while reading comment threads about industries/technologies I am not familiar with?

No. And yet... it's considered a Good Practice to expand acronyms on first use, and generally do things to reduce the friction for your audience to understand what you're writing.

diggan · 2025-03-18T14:09:40 1742306980

> and generally do things to reduce the friction for your audience to understand what you're writing

Sure, if you're writing a blogpost titled "Architecture for Chefs" then yes, write with that audience in mind.

But we're a mix-match of folks here, from all different walks of life. Requiring that everyone should expand all acronyms others possibly might not understand, would just be a waste of time.

If I see two cooks discussing knives with terms I don't understand, is it really their responsibility to make sure I understand it, although I'm just a passive observer, and I posses the skill to look up things myself?

ziddoap · 2025-03-18T17:17:49 1742318269

>But we're a mix-match of folks here, from all different walks of life. Requiring that everyone should expand all acronyms others possibly might not understand, would just be a waste of time.

Exactly!

Why would I waste 5 seconds of my own time, when I could waste 5 seconds of a dozen to hundreds of people's time?

My time is much better spent in meta-discussions, informing people that writing out a word one single time instead of typing up the acronym is too much.

dcow · 2025-03-18T14:08:57 1742306937

Yes, I searched RLHF and figured it out. But this was an especially “good” example of poor communication. I assume the author isn’t being deliberately obtuse and appreciates the feedback.

furyofantares · 2025-03-18T14:20:23 1742307623

This sounds impossible but I would guess RLHF is actually a better known acronym than RL. It became fairly popularly known among tech folks with no AI experience when ChatGPT came out.

npodbielski · 2025-03-18T13:38:59 1742305139

Thanks. And what about some more user focused tasks? I.e. I have small but fairly profitable company that writes specialized software for accountants. Usually it is pretty complex, tax law tends to be changed very often, there are myriads of rules, exemptions etc. Could this be solved with ML? How long till we get there it at all? How costly this would be? Disclaimer: I do not write such software. This is just an example.