I'm not claiming R1 is especially advanced; in fact, it doesn't outperform O1 by much. What really surprised me is that it was built with pure RL and no SFT, which I had thought was impossible. It makes me wonder whether human thought can be synthesized.
I hope this is just a misunderstanding stemming from my incomplete knowledge.