
I don't trust this. The article cites Semafor (https://www.semafor.com/article/03/24/2023/the-secret-histor...), but Semafor states the 1T parameter count without citing any source.


Yeah, seems spotty, especially considering the recent "Chinchilla scaling" laws suggesting training-set size is generally the current bottleneck, the mileage LLaMA/Alpaca get out of 7B/13B parameters, the huge inference cost of a 1T model, etc.
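
For scale, a rough back-of-envelope sketch (the ~20 training tokens per parameter ratio is from the Chinchilla paper, Hoffmann et al. 2022, and ~2*N FLOPs per generated token is the standard dense-transformer approximation; the function names here are just mine):

    # Back-of-envelope: compute-optimal training tokens and
    # per-token inference cost for a dense transformer.

    def chinchilla_optimal_tokens(params, tokens_per_param=20):
        # Chinchilla suggests ~20 training tokens per parameter
        # for compute-optimal training.
        return params * tokens_per_param

    def inference_flops_per_token(params):
        # A dense model's forward pass costs roughly 2 * N FLOPs
        # per generated token.
        return 2 * params

    for n in (7e9, 13e9, 1e12):  # 7B, 13B, and the claimed 1T
        print(f"{n/1e9:,.0f}B params: "
              f"~{chinchilla_optimal_tokens(n)/1e12:.2f}T training tokens, "
              f"~{inference_flops_per_token(n)/1e12:.2f} TFLOPs per generated token")

By that rule of thumb a dense 1T model would want ~20T training tokens and burn ~2 TFLOPs per generated token, which is exactly why the claim looks off.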


Yeah, I'm highly suspicious too. Even the arXiv paper from the Microsoft researchers doesn't give specifics about the number of parameters in GPT-4.


Sam Altman said it had 1T parameters on the Lex Fridman podcast.



