Hacker News
sergiotapia
74 days ago
on:
Deepseek R1-0528
The only reason they are fast is that the models they host are severely quantized, or so I've heard.
jacob019
74 days ago
Huh. I heard a podcast with the founder talking about their custom hardware, but quantization would explain it.
christianqchung
74 days ago
Quantization alone does not explain it. It's mostly custom hardware[0].
[0] https://groq.com/the-groq-lpu-explained/
zargon
74 days ago
|
prev
|
next
[–]
Why repeat this nonsense when it’s so trivial to just check? The reason Groq is fast is that they employ absolutely ludicrous amounts of SRAM (which is 10 times faster than the fastest VRAM).
behnamoh
74 days ago
|
prev
[–]
They responded to my tweet last year and said they didn't quantize the models.
boroboro4
74 days ago
|
parent
[–]
It's very hard to find right now, but I'm sure they said they don't quantize the KV cache, while their weights are in fp8.
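For anyone who wants to see how the two explanations in this thread interact, here is a rough back-of-envelope sketch. It assumes single-stream decoding is bound by memory bandwidth, meaning every weight is read once per generated token, and it ignores KV cache traffic, batching, and compute limits. The parameter count and the baseline bandwidth are made-up placeholders, not measurements of any real system, and the 16-bit baseline is my assumption. The 10x factor is just the figure quoted upthread.

```python
# Back-of-envelope sketch, not a benchmark: if decoding is memory-bandwidth
# bound, each generated token requires streaming all weights once, so
# tokens/s is capped at bandwidth / total_weight_bytes.

BYTES_PER_WEIGHT = {"fp16": 2, "fp8": 1}  # storage size per weight


def weight_bytes(num_params: float, dtype: str) -> float:
    """Total bytes needed to hold the weights in the given format."""
    return num_params * BYTES_PER_WEIGHT[dtype]


def max_tokens_per_second(num_params: float, dtype: str,
                          bandwidth_bytes_per_s: float) -> float:
    """Upper bound on single-stream decode speed under the assumption above."""
    return bandwidth_bytes_per_s / weight_bytes(num_params, dtype)


# Hypothetical placeholders chosen only for illustration.
NUM_PARAMS = 70e9               # assumed dense model size
BASELINE_BW = 1e12              # assumed baseline memory bandwidth, bytes/s
FAST_BW = 10 * BASELINE_BW      # the "10 times faster" claim from the thread

for dtype in ("fp16", "fp8"):
    for label, bw in (("baseline", BASELINE_BW), ("10x memory", FAST_BW)):
        rate = max_tokens_per_second(NUM_PARAMS, dtype, bw)
        print(f"{dtype:>4} weights, {label:>10}: at most {rate:.1f} tokens/s")
```

Under these assumptions, going from 16-bit to fp8 weights doubles the ceiling at best, while a 10x bandwidth advantage raises it tenfold, which is consistent with the point that quantization alone does not explain the speed.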