Hacker News
entropicdrifter on Jan 30, 2025 | on: An analysis of DeepSeek's R1-Zero and R1
So you mean something like, "what if the baseline, off-the-cuff response for the next-gen models was tuned based on the results of the reasoning model excluding the reasoning itself?"
spyckie2 on Jan 30, 2025
Exactly, although it may still need the reasoning later to form the proper foundational logic in the weights.