
That matches what we have found closely. <thinking> models do a lot better, but with a huge speed drop. For now we have chosen accuracy over speed, but the slowdown is roughly 3-4x, so we may move to an architecture where we 'think' only sporadically.
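A minimal sketch of what "thinking only sporadically" could look like: route only hard-looking prompts through the slower thinking pass and keep everything else on the fast path. The heuristic, the model names, and call_model here are all hypothetical placeholders, not any particular vendor's API.

    def call_model(model: str, prompt: str) -> str:
        """Stand-in for whatever inference client is actually in use."""
        raise NotImplementedError

    def needs_thinking(prompt: str) -> bool:
        # Crude heuristic: only pay the ~3-4x latency cost for prompts
        # that look like multi-step reasoning tasks.
        hard_markers = ("prove", "step by step", "debug", "compare", "why")
        lowered = prompt.lower()
        return len(prompt) > 500 or any(m in lowered for m in hard_markers)

    def answer(prompt: str) -> str:
        if needs_thinking(prompt):
            # Slow path: thinking model, higher accuracy, higher latency.
            return call_model("thinking-model", prompt)
        # Fast path: standard model for routine queries.
        return call_model("fast-model", prompt)

In practice the router itself could be a cheap classifier rather than keyword matching, but the shape of the tradeoff is the same.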

Everything happening in the LLM space is so close to how humans think naturally.
