
It's astounding that Mixtral Instruct ties with 3.5-turbo while being ~10x smaller.



Let's see... the linked arXiv article has been withdrawn by the author with the following comment:

> Contains inappropriately sourced conjecture of OpenAI's ChatGPT parameter count from this http URL, a citation which was omitted. The authors do not have direct knowledge or verification of this information, and relied solely on this article, which may lead to public confusion

The URL in question: https://www.forbes.com/sites/forbestechcouncil/2023/02/17/is...

This article was written by Aleks Farseev, the CEO of SoMonitor.ai, who makes the claim with no source or explanation:

> ChatGPT is not just smaller (20 billion vs. 175 billion parameters) and therefore faster than GPT-3


Hmm, right, the ~300B figure may have been for the non-turbo 3.5.


Are you sure it’s 10x smaller? I’d be surprised if OpenAI hasn’t been massively distilling their models.
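For what it's worth, a rough back-of-the-envelope on the "~10x smaller" claim, sketched below. Mixtral 8x7B's sizes (~46.7B total parameters, ~12.9B active per token) are published by Mistral; the GPT-3.5-turbo count is exactly the number this thread disputes, so both candidate values are treated as assumptions rather than facts.

    # Back-of-the-envelope parameter ratios. Mixtral 8x7B's sizes are published;
    # the GPT-3.5-turbo count is the disputed number, so both candidate values
    # below are assumptions, not established facts.
    mixtral_total = 46.7e9    # Mixtral 8x7B, total parameters
    mixtral_active = 12.9e9   # parameters active per token (2 of 8 experts)

    gpt35_candidates = {
        "20B (withdrawn paper / Forbes claim)": 20e9,
        "175B (original GPT-3 davinci)": 175e9,
    }

    for label, params in gpt35_candidates.items():
        print(f"vs {label}:")
        print(f"  / Mixtral total (46.7B):  {params / mixtral_total:.1f}x")
        print(f"  / Mixtral active (12.9B): {params / mixtral_active:.1f}x")

Neither assumption lands cleanly on 10x, which is arguably the point: without a credible parameter count for 3.5-turbo, the multiplier is guesswork.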



