
It's astounding that Mixtral Instruct ties with 3.5-turbo while being ~10x smaller.



Let's see... the linked arXiv article has been withdrawn by the author with the following comment:

> Contains inappropriately sourced conjecture of OpenAI's ChatGPT parameter count from this http URL, a citation which was omitted. The authors do not have direct knowledge or verification of this information, and relied solely on this article, which may lead to public confusion

The URL in question: https://www.forbes.com/sites/forbestechcouncil/2023/02/17/is...

This article was written by Aleks Farseev, the CEO of SoMonitor.ai, who makes the claim with no source or explanation:

> ChatGPT is not just smaller (20 billion vs. 175 billion parameters) and therefore faster than GPT-3


Hmm, right, the ~300B figure may have been for the non-turbo 3.5.


Are you sure it’s 10x smaller? I’d be surprised if OpenAI hasn’t been massively distilling their models.
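For what it's worth, a rough back-of-the-envelope on the "~10x smaller" claim, sketched below. Mixtral 8x7B's sizes (~46.7B total parameters, ~12.9B active per token) are published by Mistral; the GPT-3.5-turbo count is exactly the number this thread disputes, so both candidate values are treated as assumptions rather than facts.

    # Back-of-the-envelope parameter ratios. Mixtral 8x7B's sizes are published;
    # the GPT-3.5-turbo count is the disputed number, so both candidate values
    # below are assumptions, not established facts.
    mixtral_total = 46.7e9    # Mixtral 8x7B, total parameters
    mixtral_active = 12.9e9   # parameters active per token (2 of 8 experts)

    gpt35_candidates = {
        "20B (withdrawn paper / Forbes claim)": 20e9,
        "175B (original GPT-3 davinci)": 175e9,
    }

    for label, params in gpt35_candidates.items():
        print(f"vs {label}:")
        print(f"  / Mixtral total (46.7B):  {params / mixtral_total:.1f}x")
        print(f"  / Mixtral active (12.9B): {params / mixtral_active:.1f}x")

Neither assumption lands cleanly on 10x, which is arguably the point: without a credible parameter count for 3.5-turbo, the multiplier is guesswork.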



