Hacker News new | past | comments | ask | show | jobs | submit login

Great, thanks. Economics on IT h/w this side of the pond are often extra-complicated. And as a casual watcher of the space it feels like a lot of discussion and focus has turned towards, the past few months, optimising performance. So I'm happy to wait and see a bit longer.

From TFA I'd gone to look up GPTQ and AWQ, and inevitably found a reddit post [0] from a few weeks ago asking if both were now obsoleted by ELX2. (sigh - too much, too quickly) Sounds like vLLM doesn't support that yet anyway. The tuning it seems to offer is probably offset by the convenience of using TheBloke's ready-rolled GGUF's.

[0] https://www.reddit.com/r/LocalLLaMA/comments/18q5zjt/are_gpt...




Consider applying for YC's Fall 2025 batch! Applications are open till Aug 4

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: