
I got the 1T GPT-4 number from here - this is the video that goes with the Microsoft "Sparks of AGI" paper, by a Microsoft researcher who had early access to GPT-4 as part of their relationship with OpenAI.

https://www.youtube.com/watch?v=qbIk7-JPB2c



Bubeck has clarified that the "1 trillion" number he was throwing around was a purely hypothetical, metaphorical figure; it was in no way, shape, or form implying that GPT-4 has 1 trillion parameters [0].

[0] https://twitter.com/SebastienBubeck/status/16441515797238251...


OK - thanks!

So we're back to guessing ...

A couple of years ago Altman claimed that GPT-4 wouldn't be much bigger than GPT-3, although it would use a lot more compute.

https://news.knowledia.com/US/en/articles/sam-altman-q-and-a...

OTOH, given the massive performance gains from scaling GPT-2 to GPT-3, it's hard to imagine them not wanting to increase the parameter count by at least a factor of 2, even if they were expecting most of the performance gain to come from elsewhere (context size, number of training tokens, data quality).

So in the 0.5-1T range, perhaps?
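
For a rough sense of the magnitudes being tossed around, here's a back-of-the-envelope sketch. Only GPT-3's published 175B parameter count is a real figure; the scale-up factors are hypothetical, picked just to bracket the 0.5-1T guess:

    # Back-of-the-envelope arithmetic for the size guesses in this thread.
    # GPT-3's 175B parameter count is the only published figure here; the
    # scale-up factors are purely hypothetical.
    GPT3_PARAMS = 175e9  # GPT-3 (published)

    for factor in (2, 3, 5.7):  # hypothetical scale-ups
        params = GPT3_PARAMS * factor
        print(f"{factor}x GPT-3 -> {params / 1e12:.2f}T parameters")

A 2x scale-up lands at 0.35T, and it would take roughly 5.7x to reach the 1T figure, which is roughly the band these guesses fall in.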


FWIW, Stephen Gou, Manager of ML at Cohere, is currently doing a Reddit AMA, and is also guessing at 1T params for GPT-4.

https://www.reddit.com/r/IAmA/comments/12rvede/im_stephen_go...



