
Neat. It would be nice to have an option to use an API endpoint without downloading an additional local model. I already have several models downloaded via ollama and would prefer to use those rather than have the default model take up additional space.
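
For context, ollama already exposes an OpenAI-compatible endpoint on localhost:11434, so if the app accepted a configurable base URL it could reuse models you have already pulled. A minimal sketch with the openai Python client; the model name is hypothetical and stands in for whatever you have locally:

    # Sketch, assuming the app let you point an OpenAI-style client at a
    # custom base URL. ollama serves an OpenAI-compatible API on port 11434.
    from openai import OpenAI

    client = OpenAI(
        base_url="http://localhost:11434/v1",  # ollama's OpenAI-compatible endpoint
        api_key="ollama",  # the client requires a key; ollama ignores its value
    )

    resp = client.chat.completions.create(
        model="llama3.2",  # hypothetical: any model already pulled via `ollama pull`
        messages=[{"role": "user", "content": "Hello"}],
    )
    print(resp.choices[0].message.content)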



From the README:

Optionally, offload generation to speed it up while extending the battery life of your MacBook.

A screenshot shows an example, mentioning OpenAI and gpt-4o.
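
Presumably the offload path is the same chat-completions call pointed at OpenAI's hosted API; a hedged sketch using the gpt-4o model named in the screenshot (the app's actual wiring isn't shown in the README):

    # Sketch of the remote half: the default base URL targets OpenAI,
    # and the client reads OPENAI_API_KEY from the environment.
    from openai import OpenAI

    client = OpenAI()  # defaults to https://api.openai.com/v1
    resp = client.chat.completions.create(
        model="gpt-4o",  # the model named in the README screenshot
        messages=[{"role": "user", "content": "Hello"}],
    )
    print(resp.choices[0].message.content)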


But it still forces you to download a local model before you can use that feature.



