
I'm not making a claim about how advanced R1 is; in fact, it doesn't outperform O1 by much. What really surprised me is that it was built with pure RL and no SFT, which I had thought was impossible. It makes me wonder whether human thought is synthesizable.

I hope this is just a misunderstanding stemming from my incomplete knowledge.


