Hacker News

Kinda bummed. I get why he used Ollama, but I feel like using llama.cpp directly would give better and more consistent results.
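For reference, llama.cpp ships its own benchmark tool, `llama-bench`, which skips Ollama's serving layer entirely. A minimal sketch of building and running it (the model path is a placeholder, and the exact build output location may differ by version):

```shell
# Clone and build llama.cpp (CMake is the current build system)
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
cmake -B build && cmake --build build --config Release

# Run the bundled benchmark directly against a GGUF model.
# -p: prompt tokens, -n: tokens to generate, -r: repetitions to average over
./build/bin/llama-bench -m /path/to/model.gguf -p 512 -n 128 -r 5
```

Because `llama-bench` averages over repetitions and reports prompt-processing and generation throughput separately, it tends to give more repeatable numbers than timing requests through a server.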


I heard that ik_llama.cpp performs better for CPU use: https://github.com/ikawrakow/ik_llama.cpp/


As the article describes, most of this was done with llama.cpp, not Ollama.


Ah, good catch. I didn't notice that if you scroll lower, he has the llama.cpp results. The ollama-benchmark repo name is a misnomer.


I'm slowly migrating all my testing to https://github.com/geerlingguy/beowulf-ai-cluster



