There's some speculation that training can usefully run over much longer horizons, as explained in this video: https://www.youtube.com/watch?v=Nvb_4Jj5kBo

The term for it is "grokking", amusingly. There's some indication that we are actually undertraining by 10x.




I've seen improvement numbers for up to 12x the training cost, but past that the returns diminish so much that there's not really a point. Even at 12x, we probably still won't get AGI.



