I should have qualified the meaning of “works perfectly” :) No 70b for me, but I am able to experiment with many quantized models (and I am using a Llama successfully, latency isn’t terrible)


