It’s definitely not all hype; it really is a breakthrough for open-source reasoning models. I don’t mean to diminish their contribution, especially since being able to read the reasoning output is a very interesting new modality (for lack of a better word) for me as a developer.
It’s just not as impressive as people make it out to be. It might be better than o1 on Python or JavaScript that’s all over the training data, but o1 is overwhelmingly better at anything outside the happy path.
I mean, couldn't that be because they're just overwhelmed by users at the moment?
> And the output is very bad - it mashes together the header and cpp file
That sounds much worse, and not like something caused by being hugged to death, though.
Aider recently placed DeepSeek at the top of their benchmark[1], though, so I'm inclined to believe it isn't all hype.
[1] https://aider.chat/docs/llms/deepseek.html