
In theory it should make training a lot easier too, particularly on CPUs. But I think you'll still need reasonably expensive compute to get a model anywhere close to the current big models, and you really can't ignore data. Data quality and quantity are both huge ingredients in model quality, at least as big as architecture. It's still non-trivial to get a large, good-quality dataset, and that's certainly out of reach for lone hackers and most small companies.

