
I don't trust this. The article cites Semafor (https://www.semafor.com/article/03/24/2023/the-secret-histor...), but Semafor states the 1T parameter count without citing any source.


Yeah, seems spotty, especially considering the recent "Chinchilla scaling" laws suggesting training-set size is generally the current bottleneck, the mileage LLaMA/Alpaca get out of 7B/13B parameters, the huge inference cost of a 1T model, etc.
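
For scale, a rough back-of-envelope sketch (the ~20 training tokens per parameter ratio is from the Chinchilla paper, Hoffmann et al. 2022, and ~2*N FLOPs per generated token is the standard dense-transformer approximation; the function names here are just mine):

    # Back-of-envelope: compute-optimal training tokens and
    # per-token inference cost for a dense transformer.

    def chinchilla_optimal_tokens(params, tokens_per_param=20):
        # Chinchilla suggests ~20 training tokens per parameter
        # for compute-optimal training.
        return params * tokens_per_param

    def inference_flops_per_token(params):
        # A dense model's forward pass costs roughly 2 * N FLOPs
        # per generated token.
        return 2 * params

    for n in (7e9, 13e9, 1e12):  # 7B, 13B, and the claimed 1T
        print(f"{n/1e9:,.0f}B params: "
              f"~{chinchilla_optimal_tokens(n)/1e12:.2f}T training tokens, "
              f"~{inference_flops_per_token(n)/1e12:.2f} TFLOPs per generated token")

By that rule of thumb a dense 1T model would want ~20T training tokens and burn ~2 TFLOPs per generated token, which is exactly why the claim looks off.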


Yeah, I'm highly suspicious too. Even the arXiv paper from the Microsoft researchers doesn't give specifics about the number of parameters in GPT-4.


Sam Altman said it had 1T parameters on the Lex Fridman podcast.



