
It’s mostly Moore’s law and scaling. Language model progress will be the next Moore’s law until diminishing returns set in.


A recent paper shows that empowering a language model to retrieve additional information from a text corpus (or the internet) lets it match the performance of a model with 25 times as many parameters [1]. So you only need a small model, because it can consult the text for trivia instead of memorising it.

That's 25x in one go. Maybe we get the chance to run GPT-4 ourselves without needing 20 GPU cards and a $1M computer.

[1] https://deepmind.com/research/publications/2021/improving-la...
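For intuition, here's a minimal sketch of the retrieve-then-read idea. It is not what the paper's RETRO model actually does (RETRO feeds retrieved chunks into the transformer via cross-attention); this just finds similar passages and prepends them to the prompt. The corpus, query, and helper names are made up for illustration:

    # Minimal retrieve-then-read sketch: look up similar passages
    # and prepend them, so a small model can answer from the text
    # instead of memorising facts in its weights.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_similarity

    corpus = [
        "The Eiffel Tower was completed in 1889 for the World's Fair.",
        "Mount Everest is 8,849 metres tall as of the 2020 survey.",
        "Python was first released by Guido van Rossum in 1991.",
    ]

    def retrieve(query: str, k: int = 1) -> list[str]:
        """Return the k corpus passages most similar to the query
        (TF-IDF cosine similarity; a real system would use dense
        embeddings and approximate nearest-neighbour search)."""
        vec = TfidfVectorizer().fit(corpus + [query])
        doc_vecs = vec.transform(corpus)
        query_vec = vec.transform([query])
        scores = cosine_similarity(query_vec, doc_vecs)[0]
        top = scores.argsort()[::-1][:k]
        return [corpus[i] for i in top]

    query = "How tall is Mount Everest?"
    context = "\n".join(retrieve(query))
    # The retrieved passage becomes part of the prompt, so the
    # language model only has to read, not recall.
    prompt = f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
    print(prompt)

The point of the design is the split: the corpus holds the facts, and the model only needs enough capacity to read and compose an answer, which is where the parameter savings come from.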




