Does o1 need some special mechanism to generate lengthy chains of thought, or does it just produce them naturally after being trained to do so?

If it's just training, I imagine o1 clones could initially be simple fine-tunes of Llama models.
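
A minimal sketch of what that fine-tune might look like, assuming a recent version of Hugging Face's trl library; the dataset file, model id, and output directory are hypothetical placeholders:

    # Sketch: supervised fine-tuning a Llama base model on CoT traces.
    # Dataset path, model id, and output dir are placeholders.
    from datasets import load_dataset
    from trl import SFTConfig, SFTTrainer

    # Assumes each JSONL record has a "text" field holding
    # prompt + chain of thought + final answer, concatenated.
    dataset = load_dataset("json", data_files="cot_traces.jsonl", split="train")

    trainer = SFTTrainer(
        model="meta-llama/Llama-3.1-8B",  # any base model you can access
        train_dataset=dataset,
        args=SFTConfig(output_dir="llama-cot-sft"),
    )
    trainer.train()

The trainer call is the easy part; sourcing good CoT traces is the bottleneck the reply below points at.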

You'd need an extremely large amount of good CoT training data. And there's probably some magic beyond that: we know LLMs aren't capable of genuine self-reflection, and none of the other models are any good at iterating toward a better answer.

Example prompt for that: "give me three sentences that end in 'is'."
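
For concreteness, here's a sketch of the generate-check-retry loop that test implies; generate is a stand-in for whatever model API you'd call, and its canned return value just keeps the example self-contained:

    # Sketch of the generate-check-retry loop "iterating to a better
    # answer" implies. `generate` is a placeholder for a real model call.
    def generate(prompt: str) -> str:
        # Placeholder: swap in your actual LLM API call here.
        return "Paris is. The answer is. Here it is."

    def all_end_in_is(text: str) -> bool:
        sentences = [s.strip() for s in text.split(".") if s.strip()]
        return len(sentences) == 3 and all(
            s.split()[-1].lower() == "is" for s in sentences
        )

    prompt = "Give me three sentences that end in 'is'."
    for attempt in range(5):
        answer = generate(prompt)
        if all_end_in_is(answer):
            break
        # Feed the failure back and let the model try to self-correct;
        # the claim above is that most current models can't make use of this.
        prompt = (
            "Your previous answer was:\n" + answer + "\n"
            "Not every sentence ended in 'is'. Try again: give me three "
            "sentences that each end in the word 'is'."
        )

The mechanical check is trivial; the interesting question is whether the model can actually use the feedback on the retry.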
