Hacker News | nostrowski's comments

Two things I'm curious to know:

1. How many tokens can 'traditional' models (e.g. Mistral's 8x7B) fit on a single 80GB GPU?
2. How does quantization affect the single transformer layer in the stack? What are the performance/accuracy trade-offs that happen when so little of the stack depends on this bottleneck?


Mixtral 8x7b runs well (i.e., produces the correct output faster than I can read it) on a modern AMD or Intel laptop without any use of a GPU - provided that you have enough RAM and CPU cores. 32 GB of RAM and 16 hyperthreads are enough with 4-bit quantization if you don't ask too much in terms of context.

P.S. Dell Inspiron 7415 upgraded to 64 GB of RAM here.
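
For anyone who wants to sanity-check the RAM claim, here is a rough back-of-the-envelope sketch. It assumes Mixtral 8x7B has about 46.7 billion total parameters (the commonly cited figure), and the 4 GiB allowance for context, runtime buffers, and the OS is a made-up placeholder, not a measurement. The function names are hypothetical too. Real 4-bit formats often store a bit more than 4 bits per weight, so treat the result as a lower bound.

```python
# Rough estimate of how much memory the model weights need at a given precision.
# All constants are assumptions for illustration, not measurements.

MIXTRAL_8X7B_PARAMS = 46.7e9  # assumed total parameter count (approximate)


def weights_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the weights in GiB at the given bits per weight."""
    total_bytes = n_params * bits_per_weight / 8
    return total_bytes / 2**30


def fits_in_ram(n_params: float, bits_per_weight: float,
                ram_gib: float, overhead_gib: float = 4.0) -> bool:
    """True if weights plus an assumed fixed overhead fit in ram_gib.

    overhead_gib is a hypothetical allowance for context (KV cache),
    runtime buffers, and the operating system.
    """
    return weights_gib(n_params, bits_per_weight) + overhead_gib <= ram_gib


if __name__ == "__main__":
    for bits in (16, 8, 4):
        size = weights_gib(MIXTRAL_8X7B_PARAMS, bits)
        ok = fits_in_ram(MIXTRAL_8X7B_PARAMS, bits, ram_gib=32)
        print(f"{bits:>2}-bit: {size:6.1f} GiB of weights, fits in 32 GiB: {ok}")
```

Under these assumptions the 4-bit weights come to a bit under 22 GiB, which is why 32 GB works as long as the context stays small, while the 16-bit weights would not fit even in 64 GB.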


Unfortunate, but impressed by how the Mintlify team is handling it.


Two weeks to notify customers with many first finding out on social media is impressive?


They are claiming that they resolved the vulnerability that caused the token leak, but they don't say what it was. That doesn't exactly seem transparent to me, or like handling it well.

I was contracting for them last year and tried, among other things, to build an actual engineering culture that prevents and fixes the issues that accumulate into catastrophic incidents like this.

They generally prefer to "ship fast".

I informed them very thoroughly again on January 13th (3+ months after they terminated me for "cultural differences"), because I was worried about this exact nightmare scenario happening very soon.

The reason for this was that they open-sourced a package that lets an attacker easily practice and test locally in about a minute.

MDX is easily exposed to cross-site scripting (XSS). I assume this is the "fixed vulnerability" they are talking about, just to be transparent.


Even something as simple as bolding the message about customer repositories being accessed is nice. They're not trying to bury the lede.


This will be in a future history book under a chapter titled "the beginning of the end"

