Llamafile is great, but it solves a slightly different problem (and solves it very well): how do I easily download and run a single model without having any infrastructure in place first?
Ollama solves the problem of how do I run many models without having to stand up separate infrastructure for each one.
I can already use multiple backends by writing different code. The value-add LangChain would need to prove is whether I can get better results using their abstractions compared to doing it manually. Every time I’ve looked at how LangChain’s prompts are constructed, they ran way against LLM vendor guidance, so I have doubts.
There’s also the downside of not being able to easily tweak prompts based on experiments (crucial!).
Not to mention the library doesn’t actually live up to this use case: you immediately (IME) run into “you actually can’t use a _Chain with provider _ if you want to use their _ API”, so I ultimately did have to care about what’s supposed to be abstracted over.
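For context on what “doing it manually” can look like: both Ollama and llamafile expose OpenAI-compatible HTTP endpoints, so switching backends is often little more than a base_url change. A rough sketch, assuming the documented default local ports; the model names are placeholders for whatever you actually have loaded.

```python
from openai import OpenAI

# Assumptions: Ollama's OpenAI-compatible endpoint at localhost:11434/v1,
# llamafile's built-in server at localhost:8080/v1; model names are placeholders.
BACKENDS = {
    "openai":    {"base_url": None, "api_key": None, "model": "gpt-3.5-turbo"},
    "ollama":    {"base_url": "http://localhost:11434/v1", "api_key": "unused", "model": "llama3"},
    "llamafile": {"base_url": "http://localhost:8080/v1", "api_key": "unused", "model": "LLaMA_CPP"},
}

def ask(backend: str, prompt: str) -> str:
    cfg = BACKENDS[backend]
    # api_key=None falls back to the OPENAI_API_KEY env var for the hosted case
    client = OpenAI(base_url=cfg["base_url"], api_key=cfg["api_key"])
    resp = client.chat.completions.create(
        model=cfg["model"],
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

if __name__ == "__main__":
    print(ask("ollama", "One sentence: what problem does Ollama solve?"))
```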
They’ve been growth hacking pretty much the whole time, optimizing for virality, e.g. integrating with every AI thing under the sun so they could publish an SEO-friendly “use GPT-3 with someVecDb and LangChain” page for every permutation you can think of. Easy for them to write, since LangChain’s abstractions are mostly unnecessary wrappers. They’ve also run meetups since very early on.

The design also seems to make LangChain hard to remove: you’re no longer doing functional composition like you would in normal Python, you’re combining Chains. You can’t insert your own log statements between their calls, so you have to onboard to LangSmith for observability (their SaaS play). Now they have a DSL with their own binary operators :[
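To make the functional-composition point concrete, here is roughly what the “normal Python” version of a small pipeline looks like. All the function names are hypothetical and call_model is stubbed so the example runs on its own; the point is that every step is a plain function, so you can drop a log line, a cache, or a breakpoint between any two steps without onboarding to a hosted tracing product.

```python
import json
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("pipeline")

def build_prompt(question: str) -> str:
    return f'Answer in JSON with keys "answer" and "confidence".\nQ: {question}'

def call_model(prompt: str) -> str:
    # stub; swap in a real provider call (OpenAI, Ollama, llamafile, ...)
    return '{"answer": "42", "confidence": 0.9}'

def parse_json(raw: str) -> dict:
    return json.loads(raw)

def answer(question: str) -> dict:
    prompt = build_prompt(question)
    log.info("prompt: %s", prompt)        # observability is just a log call
    raw = call_model(prompt)
    log.info("raw completion: %s", raw)   # no framework callback system needed
    return parse_json(raw)

print(answer("What is the meaning of life?"))
```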
OpenAI Cookbook! Instructor is a decent library that can help with the annoying parts without abstracting away the whole API call - see its docs for RAG examples.
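A minimal sketch of what that looks like (not an official example): you still write the normal chat.completions call yourself, Instructor just validates the response against a Pydantic model. Assumes a recent Instructor version with instructor.from_openai (older versions used instructor.patch); the model name and the tiny inline context are placeholders.

```python
import instructor
from openai import OpenAI
from pydantic import BaseModel

class Answer(BaseModel):
    answer: str
    quotes: list[str]  # supporting quotes pulled from the provided context

# wraps the client, but the API call below is still the familiar one
client = instructor.from_openai(OpenAI())

context = "Llamafile bundles a model and a server into a single executable."
question = "What does llamafile bundle?"

result = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder; any chat model Instructor supports
    response_model=Answer,
    messages=[
        {"role": "system", "content": "Answer using only the provided context."},
        {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
    ],
)
print(result.answer, result.quotes)
```

Because the prompt is just the messages list you wrote, tweaking it based on experiments stays trivial.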