
"Good observations regarding the benchmark vs. vibes in general"

Most "vibes" people are missing that it only has 5B active parameters.

They read 120B and expect way more performance than a 24B-parameter model, even though empirically a 120B model with 5B active parameters is expected to perform right around there.
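
For reference, one common rule of thumb puts a mixture-of-experts model's dense-equivalent size at roughly the geometric mean of its total and active parameter counts. That this is the estimate meant here is an assumption, since the comment doesn't name one, but it does land at about 24B. A quick sketch of the arithmetic (the helper name is made up for illustration):

```python
import math

def dense_equivalent_params(total_b: float, active_b: float) -> float:
    """Rough dense-equivalent size in billions of parameters.

    Assumption: uses the geometric-mean heuristic sqrt(total * active),
    a rule of thumb and not an exact law.
    """
    return math.sqrt(total_b * active_b)

# 120B total parameters, 5B active per token
estimate = dense_equivalent_params(120, 5)
print(f"{estimate:.1f}B")  # prints "24.5B"
```

It's only a heuristic, so treat the number as a ballpark for what to expect, not a prediction of any specific benchmark score.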


