Llamafile is great, but it solves a slightly different problem (and solves it very well): how do I easily download and run a single model without having any infrastructure in place first?
Ollama solves the problem of how do I run many models without having to stand up separate infrastructure for each one.
I can already use multiple backends by writing different code. The value-add LangChain would need to prove is whether I can get better results using their abstractions compared to doing it manually. Every time I’ve looked at how LangChain’s prompts are constructed, they ran way against LLM vendor guidance, so I have doubts.
There’s also the downside of not being able to easily tweak prompts based on experiments (crucial!).
Not to mention the library doesn’t actually live up to this use case: you immediately (IME) run into “you actually can’t use a _Chain with provider _ if you want to use their _ API”, so I ultimately did have to care about what’s supposed to be abstracted over.
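For context on what “doing it manually” can look like: both Ollama and llamafile expose OpenAI-compatible HTTP endpoints, so switching backends is often little more than a base_url change. A rough sketch, assuming the documented default local ports; the model names are placeholders for whatever you actually have loaded.

```python
from openai import OpenAI

# Assumptions: Ollama's OpenAI-compatible endpoint at localhost:11434/v1,
# llamafile's built-in server at localhost:8080/v1; model names are placeholders.
BACKENDS = {
    "openai":    {"base_url": None, "api_key": None, "model": "gpt-3.5-turbo"},
    "ollama":    {"base_url": "http://localhost:11434/v1", "api_key": "unused", "model": "llama3"},
    "llamafile": {"base_url": "http://localhost:8080/v1", "api_key": "unused", "model": "LLaMA_CPP"},
}

def ask(backend: str, prompt: str) -> str:
    cfg = BACKENDS[backend]
    # api_key=None falls back to the OPENAI_API_KEY env var for the hosted case
    client = OpenAI(base_url=cfg["base_url"], api_key=cfg["api_key"])
    resp = client.chat.completions.create(
        model=cfg["model"],
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

if __name__ == "__main__":
    print(ask("ollama", "One sentence: what problem does Ollama solve?"))
```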
They’ve been growth hacking pretty much the whole time, optimizing for virality, e.g. integrating with every AI thing under the sun so they could publish an SEO-friendly “use GPT-3 with someVecDb and LangChain” page for every permutation you can think of. Easy for them to write, since LangChain’s abstractions are mostly unnecessary wrappers. They’ve also run meetups since very early on.

The design also seems to make LangChain hard to remove: you’re no longer doing functional composition like you would in normal Python, you’re combining Chains. You can’t insert your own log statements between their calls, so you have to onboard to LangSmith for observability (their SaaS play). Now they have a DSL with their own binary operators :[
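To make the functional-composition point concrete, here is roughly what the “normal Python” version of a small pipeline looks like. All the function names are hypothetical and call_model is stubbed so the example runs on its own; the point is that every step is a plain function, so you can drop a log line, a cache, or a breakpoint between any two steps without onboarding to a hosted tracing product.

```python
import json
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("pipeline")

def build_prompt(question: str) -> str:
    return f'Answer in JSON with keys "answer" and "confidence".\nQ: {question}'

def call_model(prompt: str) -> str:
    # stub; swap in a real provider call (OpenAI, Ollama, llamafile, ...)
    return '{"answer": "42", "confidence": 0.9}'

def parse_json(raw: str) -> dict:
    return json.loads(raw)

def answer(question: str) -> dict:
    prompt = build_prompt(question)
    log.info("prompt: %s", prompt)        # observability is just a log call
    raw = call_model(prompt)
    log.info("raw completion: %s", raw)   # no framework callback system needed
    return parse_json(raw)

print(answer("What is the meaning of life?"))
```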
OpenAI Cookbook! Instructor is a decent library that can help with the annoying parts without abstracting away the whole API call - see its docs for RAG examples.
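A minimal sketch of what that looks like (not an official example): you still write the normal chat.completions call yourself, Instructor just validates the response against a Pydantic model. Assumes a recent Instructor version with instructor.from_openai (older versions used instructor.patch); the model name and the tiny inline context are placeholders.

```python
import instructor
from openai import OpenAI
from pydantic import BaseModel

class Answer(BaseModel):
    answer: str
    quotes: list[str]  # supporting quotes pulled from the provided context

# wraps the client, but the API call below is still the familiar one
client = instructor.from_openai(OpenAI())

context = "Llamafile bundles a model and a server into a single executable."
question = "What does llamafile bundle?"

result = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder; any chat model Instructor supports
    response_model=Answer,
    messages=[
        {"role": "system", "content": "Answer using only the provided context."},
        {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
    ],
)
print(result.answer, result.quotes)
```

Because the prompt is just the messages list you wrote, tweaking it based on experiments stays trivial.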