
Apologies for the second reply, but it also occurs to me that reinforcement learning is the new battleground. Look at the changes between o1, o3, and GPT-5 Thinking, or between Sonnet 3.7, Sonnet 4, and Sonnet 4.5, and so forth.

I expect models will get larger again once everyone is doing their inference on B200s, but the RL training budget is where the insatiable appetite sits right now.


