
> Using `--flash-attn --cache-type-k q8_0 --cache-type-v q8_0`

I think you meant `--cache-type-v q4_0`.

I would also like an explanation of what this patch does differently from the standard command-line arguments.


