
They're currently on the previous generation for Opus (Opus 3). It's somewhat forgetful and has a worse accuracy curve, so it can handle fewer instructions than Sonnet 3.5. That said, I feel they may have cheated a bit with Sonnet 3.5 by adding a hidden temperature multiplier set to < 1, which made the model punch above its weight in accuracy, improved the lost-in-the-middle issue, and made instruction adherence much better, but also made generation variety and multi-turn repetition much worse. (Or maybe I'm entirely wrong about the cause.)


Wow, this is the first time I've heard of such a method. Is there anywhere I can read up on how the temperature multiplier works and what its implications/effects are? Is it just changing the temperature based on how many tokens have already been processed (i.e., the temperature is variable over the course of a completion spanning many tokens)?


No, just a fixed multiplier (say, 0.5) that makes you use only half of the range. As I said, I'm just speculating, but Sonnet 3.5's temperature definitely feels like it doesn't affect much. The model is overfit, and that could also be the cause.
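
To make the speculation concrete: a minimal sketch of what a hidden temperature multiplier would look like at sampling time, assuming standard temperature-scaled softmax sampling. The names (TEMP_MULTIPLIER, sample_token) are hypothetical and purely illustrative; this is not Anthropic's actual implementation.

    import numpy as np

    TEMP_MULTIPLIER = 0.5  # assumed hidden scaling factor (< 1)

    def sample_token(logits: np.ndarray, user_temperature: float = 1.0) -> int:
        """Sample a token id from raw logits, silently scaling the requested temperature."""
        # The user asks for 1.0, but the sampler effectively sees 0.5.
        effective_temp = max(user_temperature * TEMP_MULTIPLIER, 1e-5)
        probs = np.exp(logits / effective_temp)
        probs /= probs.sum()
        return int(np.random.choice(len(logits), p=probs))

    # The same logits become noticeably peakier under the hidden multiplier,
    # so the user-visible temperature knob has less effect than expected.
    logits = np.array([2.0, 1.0, 0.5, 0.1])
    print(sample_token(logits, user_temperature=1.0))

The effect would match the symptoms described above: sharper, more deterministic outputs (better apparent accuracy and adherence) at the cost of generation variety, and a temperature setting that feels like it barely does anything.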




