
Hi. So quickly:

* RL is Reinforcement Learning. Already used for a while as part of RLHF, but now we have started to find a very nice combo of reasoning+RL on verifiable tasks. Core idea is that models are not just good at predicting the next token but at predicting the next right answer.

* I think anything infra that already has some ML bundled is especially up for grabs, but this will have a more transformative impact than your usual SaaS. Network engineering is a good example: highly formalized but also highly complex. RL models could increasingly nail that.




Respectfully, when you’re responding to someone who doesn’t know what RL is, and you say “it’s this: already used in [another even lesser known acronym that includes the original]…” it doesn’t really help the asker (if you know what RLHF is, then you already know what RL is). I’ll admit I knew what RL was already, but I don’t know what RLHF is, and the comment just confuses me.

What is RLHF?


Am I the only one who uses a search engine while reading comment threads about industries/technologies I am not familiar with? This whole conversation is like two searches away from explaining everything (or a two minute conversation with an LLM I suppose)


That makes for poor communication by increasing the friction to read someone's thoughts.

As an author, you should care about reducing friction and decreasing the cost to the reader.


Some onus is on the reader to educate themselves, particularly on Hacker News.


> Am I the only one who uses a search engine while reading comment threads about industries/technologies I am not familiar with?

No. And yet... it's considered a Good Practice to expand acronyms on first use, and generally do things to reduce the friction for your audience to understand what you're writing.


> and generally do things to reduce the friction for your audience to understand what you're writing

Sure, if you're writing a blogpost titled "Architecture for Chefs" then yes, write with that audience in mind.

But we're a mix-match of folks here, from all different walks of life. Requiring that everyone should expand all acronyms others possibly might not understand, would just be a waste of time.

If I see two cooks discussing knives with terms I don't understand, is it really their responsibility to make sure I understand it, although I'm just a passive observer and I possess the skill to look things up myself?


>But we're a mix-match of folks here, from all different walks of life. Requiring that everyone should expand all acronyms others possibly might not understand, would just be a waste of time.

Exactly!

Why would I waste 5 seconds of my own time, when I could waste 5 seconds of a dozen to hundreds of people's time?

My time is much better spent in meta-discussions, informing people that writing out a word one single time instead of typing up the acronym is too much.


Yes, I searched RLHF and figured it out. But this was an especially “good” example of poor communication. I assume the author isn’t being deliberately obtuse and appreciates the feedback.


This sounds impossible but I would guess RLHF is actually a better known acronym than RL. It became fairly popularly known among tech folks with no AI experience when ChatGPT came out.


Thanks. And what about some more user-focused tasks? E.g. I have a small but fairly profitable company that writes specialized software for accountants. Usually it is pretty complex: tax law tends to change very often, and there are myriads of rules, exemptions, etc. Could this be solved with ML? How long till we get there, if at all? How costly would this be? Disclaimer: I do not write such software. This is just an example.



