

rpdillon 3 months ago | on: GPT-OSS 120B Runs at 3000 tokens/sec on Cerebras

There are degrees of acceleration. My understanding, limited as it is, is that Groq and Cerebras use highly optimized acceleration to achieve token generation rates far beyond those of a regular GPU, and that this leads to lower costs per token. Is this incorrect?

aurareturn 3 months ago

Yes, on Groq they're called ASICs. Cerebras, by contrast, has more general cores that can do more complex things. Inference is mostly limited by memory bandwidth, though.
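To put rough numbers on the bandwidth point, here is a back-of-the-envelope sketch in Python. It assumes single-stream decoding where every generated token requires reading all active weights from memory once, and it ignores KV-cache traffic, batching, and compute time. The function name and all figures are hypothetical round numbers chosen for illustration, not measurements of Groq, Cerebras, or any GPU.

```python
def max_tokens_per_sec(bandwidth_bytes_per_s: float,
                       active_params: float,
                       bytes_per_param: float) -> float:
    """Upper bound on decode speed when reading weights is the bottleneck.

    Assumes each generated token reads every active weight exactly once.
    """
    if bandwidth_bytes_per_s <= 0 or active_params <= 0 or bytes_per_param <= 0:
        raise ValueError("all inputs must be positive")
    bytes_per_token = active_params * bytes_per_param
    return bandwidth_bytes_per_s / bytes_per_token


# Hypothetical inputs: 5e9 active parameters stored at 1 byte each.
slow = max_tokens_per_sec(1e12, 5e9, 1.0)  # memory delivering 1 TB/s
fast = max_tokens_per_sec(1e13, 5e9, 1.0)  # memory delivering 10 TB/s
print(slow, fast)
```

Under these assumptions the ceiling scales linearly with bandwidth, so ten times the bandwidth allows ten times the tokens per second regardless of how general the cores are.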
