> I wouldn't call o1 a "system". It's a model, but unlike previous models, it's trained to generate a very long chain of thought before returning a final answer.
That answer seems to conflict with "in the future we'd like to give users more control over the thinking time".
I've gotten mini to think harder by asking it to, but that didn't produce a better answer. Though now I've run out of usage limits for both of them, so I can't try any more…
Not in a way that's effectively used: in practice, all of the papers using CoT compare against a weak baseline, and the benefits level off extremely quickly.
Nobody except recent DeepMind research has shown test-time scaling like o1's.