
> Ollama is definitely the easiest way to run LLMs locally

Nitro has outstripped them: a 3 MB executable with an OpenAI-compatible HTTP server and persistent model loading
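
For anyone unfamiliar, "OpenAI-compatible" means existing client code can just be pointed at the local server. A minimal sketch in Python; the base URL, port, and model name here are assumptions (check the server's docs for the actual defaults), only the /v1/chat/completions path and response shape follow the standard OpenAI API:

    import requests

    # Assumed local endpoint; adjust host/port to whatever the server actually binds to.
    BASE_URL = "http://localhost:3928/v1"

    resp = requests.post(
        f"{BASE_URL}/chat/completions",
        json={
            "model": "llama-2-7b-chat",  # placeholder model name
            "messages": [{"role": "user", "content": "Say hello in one sentence."}],
            "stream": False,
        },
        timeout=60,
    )
    resp.raise_for_status()
    # Standard OpenAI response shape: choices[0].message.content
    print(resp.json()["choices"][0]["message"]["content"])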



Persistent model loading will be possible with https://github.com/ollama/ollama/pull/2146 – sorry it isn't there yet! More to come on file size and API improvements.
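
To illustrate what that could look like from the client side once the linked PR lands, here is a hedged sketch assuming a keep_alive-style field on Ollama's generate endpoint (the parameter name, accepted values, and default port are assumptions taken from the PR discussion, not a final API):

    import requests

    # Hypothetical sketch: a keep_alive field telling the server how long to keep
    # the model resident in memory after the request finishes (per the linked PR).
    resp = requests.post(
        "http://localhost:11434/api/generate",  # assumed default Ollama port
        json={
            "model": "llama2",
            "prompt": "Why is the sky blue?",
            "stream": False,
            "keep_alive": -1,  # assumed: a negative value keeps the model loaded indefinitely
        },
        timeout=120,
    )
    resp.raise_for_status()
    print(resp.json()["response"])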


I just wanted to say thank you for being communicative and approachable and nice.


Who cares about executable size when the models are measured in gigabytes lol. I would prefer a Go/Node/Python/etc server for an HTTP service, even at 10x the size, over some guy's bespoke C++ any day of the week. Also, measuring the size of an executable after zipping is a nonsense benchmark in and of itself.


Not "some guy". I agree on the zip point, but disagree entirely with the tone of the comment (what exactly separates Ollama from those same hyperbolic descriptions?)



