This is an incredibly valuable analysis.

In simple terms, applying relatively simple RL to a variety of tasks is what gives models emergent properties, as DeepSeek managed to do with multi-step reasoning.

The reasoning models and the DeepSearch models are essentially of the same class, just applied to different types of tasks.

The underlying assumption, then, is that these "specialized" models are the next step in the industry, as the general models will (maybe) get outperformed.