LangChain: The Missing Manual (pinecone.io)
164 points by gk1 on May 19, 2023 | hide | past | favorite | 43 comments



LangChain has moved fast and made a decent first pass at a solution to the problem of LLM orchestration. But I'm skeptical that the first solution will be the best solution, and we should keep an open mind to other approaches.

Personally, I like the more declarative approach that Microsoft is taking with guidance [0]. The two projects are not substitutable at the moment, and might even complement each other, but I'm weary of building a new ecosystem on a possibly overly-complicated first pass solution to the orchestration problem.

[0] https://github.com/microsoft/guidance


I resonate with the assessment. I like guidance too so far, but the dev community is far behind langchain's. One big problem I have with langchain is that there are too many wrappers, and it's often very confusing what little additional functionality each wrapper actually adds.


Yeah, there are so many platforms out there. It is incredibly tough to figure out who will be heading in the right direction.

- https://shreyar.github.io/guardrails/
- https://github.com/NVIDIA/NeMo-Guardrails
- https://www.askmarvin.ai/


s/weary/wary


Damn, I'm weary of making that mistake so often. Thanks :)


I personally prefer MS Guidance. Langchain is just super bloated and complicated, to the point that I found it easier to just write the code myself. Plus, I don't think Langchain is a good long-term strategy given their involvement with VCs.


Same. I feel like once you strip away the abstractions that you probably aren't going to use long-term, you're left with something that competes with Python f-strings. And yep, I've tried it at multiple points in its evolution and never found it useful even in my toy apps. It was just a time sink trying to figure out why their own abstractions didn't work with each other. For weeks after they released Chat API integration, I scoured the web for working examples of using an agent with gpt-3.5-turbo. Closest I got was Microsoft's VisualChatGPT, until I found out it used langchain+text-davinci, despite its name :(

Langchain really seems to be entirely hype, to me. Like, nothing in production I've heard about actually uses it AFAIK. Not AutoGPT, not BabyAGI, nothing at any of the big companies, etc. But it's available in 2 languages and has integrations with everything under the sun, making it easy to adopt! Despite this lack of production usage with positive anecdotes, you're still hearing about this library a lot! Definitely doing the VC playbook.

EDIT: Go to their discord and read the #ask-kapa-langchain channel. This is a retrieval augmented Q&A bot, powered by langchain, which (every time I've checked) has helped ~nobody. I'm really not trying to cherry-pick - this is something that should be rock solid if this software stack is useful at all.


Example: Say you want to use Llama models in langchain. They have TWO CONFLICTING documentation pages, only one of which works:

1. https://python.langchain.com/en/latest/modules/models/llms/i...

2. https://python.langchain.com/en/latest/reference/modules/llm... → Turns out, LlamaCppEmbeddings is from langchain.embeddings, not langchain.llms!

I just gave up using langchain and one of the main reasons was its terrible docs.


Didn’t they just get funded too? Sounds like more of a side project than a serious venture.


That intro to Langchain is absolutely terrible. Like it was generated by the worst LLM they could find and pasted in:

> The first high-performance and open-source LLM called BLOOM was released. OpenAI released their next-generation text embedding model and the next generation of “GPT-3.5” models.

Just random sentences strung together delivering no overall message. Yes we know BLOOM and GPT exist, what is your point?

> LangChain appeared around the same time. Its creator, Harrison Chase, made the first commit in late October 2022. Leaving a short couple of months of development before getting caught in the LLM wave.

That's nice that the text model that wrote this knows the creator and first commit but ugh -- just say "Langchain was published in October 2022" instead of all that garbage.

Also, "Leaving a short couple of months of development before getting caught in the LLM wave." doesn't even form a complete sentence.

I'm already hating the future of blog posts and articles where we have to mentally filter out all the LLM-generated garbage around any real information.


I caught myself the other day throwing my feed of articles into an LLM to give me summaries and what it thinks are interesting points / facts. I'm not sure how to feel about this.


Why wouldn't you use a language model to summarize? It is one of the most useful things a statistical language model is capable of.

Though it might be good to tune it to help it identify the parts you find interesting. Especially if you try to identify salient details.


is the manual itself good? guess we will have to go through it.


Given the comments in the previous submission about LangChain (https://news.ycombinator.com/item?id=35820931), I am working on a much simpler/faster/cheaper alternative that doesn't require delving deep into arcane documentation and guides. LangChain's complexity and inflexibility aren't a good thing for anything beyond quick demos.

Relatedly, I am also working on a tutorial on how to do vector similarity search without having to pay for a vector store, because there are confusingly few blog articles highlighting the space between "embeddings as text in a CSV" and "full-on vector store management", which is annoying for people who want to do personal projects.
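For small personal projects, that middle ground can be as simple as an in-memory search. A minimal sketch in pure Python, assuming the embeddings have already been computed by whatever model you use:

```python
import math

def cosine_similarity(a, b):
    # Dot product divided by the product of the vector magnitudes.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec, corpus, k=2):
    # corpus: list of (doc_id, embedding) pairs held in memory --
    # no vector store needed for a few thousand documents.
    scored = [(doc_id, cosine_similarity(query_vec, vec))
              for doc_id, vec in corpus]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]
```

Brute force like this is perfectly fine until your corpus is large enough that a tuned index (FAISS, hnswlib, etc.) actually pays for itself.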


Looking forward to seeing that. I haven't yet managed to get my head around LangChain, excited to see what your simpler alternative looks like.


Always a pleasure to read you two guys' comments on LLM posts!


it’s not intentional, I promise


I used langchain to make a pretty basic LLM augmented with a (free) local vector database: https://github.com/mkwatson/chat_any_site


That's awesome. Looking forward to trying it.


I would like to see the performance improvement of using a vector DB vs. something like a tuned FAISS index. Could poke a hole in a budding trend.


A bit odd that Pinecone is publishing this…

I can understand it if they identified that building with LangChain means more Pinecone usage, and that a barrier to building with LangChain is its documentation and ease of getting started. But if the now well-funded project isn't producing this itself (and in my own experience the TypeScript library at least doesn't feel like it's hitting the nail on the head; I ended up reading source code), then I think that's a sign we're still searching for the best way to build complex things here.


The fact that pinecone published this is proof of how many AI tooling products see Langchain as key to their distribution.

Personally, I find langchain unnecessary if you already know which tools you are going to use.

It remains to be seen how important portability of workloads and discoverability of tooling will be in this space. My experience with cloud computing has taught me that portability of workloads is overrated, but I'm not sure that lesson will translate well to AI models.


This is basic content marketing strategy: Pinecone can be used with LangChain, and a decent chunk of people searching for tutorials about LangChain will indirectly learn about Pinecone as a result of this article.


Twitter retweets and SEO boost. Pinecone has an extensive content strategy.


LangChain is undeniably the best option for building demos.


Vector stores and calculations of vector similarity are an adjacent complement to the ReAct workflow, not replacing it or being specific to LangChain.


We’re building “Langchain for Ruby” under the current working name of “Langchain.rb”: https://github.com/andreibondarev/langchainrb

People who have contributed to the project thus far each have at least a decade of experience programming in Ruby. We're trying our best to build an abstraction layer on top of all of the common emerging AI/ML techniques, tools, and providers. We're also focused on building an excellent developer experience that Ruby developers love and have come to expect.

Unlike the Python project, we'd like to avoid the deeply nested class structures that, as has been pointed out here countless times, make the code incredibly difficult to track and extend.

We’ve been pondering over the “what does Rails for Machine Learning look like?” question, and we’re taking a stab at answering this question.

We’re hyper-focused on the open source community and the developer community at large. All feedback/ideas/contributions/criticism are welcome and encouraged!


I don't get why Langchain is so popular. It's classic inner-platform effect: "a system so customizable as to become a replica, and often a poor replica, of the software development platform they are using."

Once we get better at getting structured output from LLMs, we can use the standard control flow mechanisms built into our programming languages.
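For example (a minimal sketch; the JSON string here is a hypothetical stand-in for what a real model would return when prompted to reply in JSON):

```python
import json

# Hypothetical raw completion from an LLM asked to answer in JSON.
raw = '{"intent": "search", "query": "weather in Paris"}'

parsed = json.loads(raw)

# Plain if/elif replaces a framework's router/agent abstraction.
if parsed["intent"] == "search":
    action = f"run_search({parsed['query']!r})"
elif parsed["intent"] == "calculate":
    action = f"run_calculator({parsed['query']!r})"
else:
    action = "fallback()"
```

Once the output is structured, dispatching on it is just ordinary programming.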


Much of the popularity of langchain, AutoGPT, etc. is just due to hype and appeals to no-coders. Anyone with some knowledge of programming and LLMs sees the facade and avoids these things.


Thank you for posting this. I've started out some MVPs using vector DBs (testing pinecone and supabase with pgvector) and finding a lot of things that are not obvious to me.


For anyone thinking about applications of langchain and pinecone but who are looking for something more turn-key check out https://jiggy.ai

The core is actually open source as well, allowing you to take your data back out via sqlite and hnswlib (https://github.com/jiggy-ai/hnsqlite)


Hey I’m actually interested. Could you clarify what we mean by more “turnkey”?


In this case we mean you can get some of the benefits of langchain and pinecone, such as semantic search and retrieval-augmented GPT, without needing to deal with vectors, chunking, and LLM tooling at such a low level. You can upload docs and then begin chatting against them immediately. JiggyBase is just a higher-level abstraction on top of the same type of components, which may be useful in cases where you don't need full control over the vector embeddings and just want to interact with your data.


Do we need a LangChain alternative? Nowadays I evaluate an open source project via its documentation.

My expectation for good documentation is not so much about "Reference". It's more about accessible content to get started, then a path to advanced things.

Documentation is not the same as just a reference.

A bad documentation feels like: You're not welcome to contribute (yet).


A lot of people are dissing LangChain in this thread for various reasons. I am primarily interested in building tools that couple LLMs with other things like web browsing and using Wolfram or Zapier APIs. Is there a LangChain alternative for that?


We allow users to build LLM chains at trypromptly.com as apps. Once an app is built, it can be integrated into other applications as iframe embeds or can be called via our APIs so custom frontends can be built.

We also have Zapier integration (https://zapier.com/apps/promptly/integrations) so apps on Promptly can be invoked from zaps


We have just added support for ElevenLabs. https://twitter.com/ajhai/status/1659642782607372288 is a quick demo of the platform if interested.


ChatGPT plugins will be usable through the OpenAI API, and both of those services are available as ChatGPT plugins.


Mark Watson's guide to LangChain is free to read online: https://leanpub.com/langchain/read


Has langchain had positive ROI for anyone beyond the initial prototype of their app? My experience has me skeptical - I end up feeling like I'm painted into a corner and need to start over. Maybe if I just used PromptTemplate and LLMChain, but at that point I can just use function composition and formatted strings.

Like I'd be blown away if someone had a production app where they were able to swap LLM providers (and nothing else) due to langchain. And if that expectation is too high then why not just code against the openai API?
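To illustrate what I mean, here's a sketch with a stubbed-out completion call standing in for a real API client (the stub just echoes its prompt so the example is self-contained):

```python
# Stand-in for a real completion call (e.g. the openai client);
# returns a canned string here so the sketch runs on its own.
def call_llm(prompt: str) -> str:
    return f"[completion for: {prompt}]"

def summarize(text: str) -> str:
    # An f-string does the job of a PromptTemplate...
    return call_llm(f"Summarize the following text:\n\n{text}")

def translate(text: str, language: str = "French") -> str:
    return call_llm(f"Translate into {language}:\n\n{text}")

# ...and plain function composition does the job of an LLMChain.
result = translate(summarize("LangChain has moved fast..."))
```

Swapping providers then means changing one function body, not rewiring a chain.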


I'd argue that building your app based on a quickly evolving library is not a wise idea, esp. if the library is not well documented.


Outside a prototype what’s the benefit? The steps I envision are: 1) turn prompt to embeddings 2) return examples that match from vector db 3) load as examples 4) prompt LLM with examples. Am I missing something? Why on earth would you want to import multiple dependencies for this?
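Those four steps fit in a few plain function calls. A toy sketch, with hypothetical stand-ins for the embedding model and vector store (real code would call an embedding API and an index instead):

```python
def embed(text):
    # 1) Turn the prompt into an embedding (toy stand-in: length-based).
    return [float(len(text)), 1.0]

def nearest_examples(vector, k=2):
    # 2) Return the closest examples from an in-memory "vector db".
    store = {
        "Q: 2+2? A: 4": [8.0, 1.0],
        "Q: capital of France? A: Paris": [30.0, 1.0],
    }
    return sorted(store, key=lambda doc: abs(store[doc][0] - vector[0]))[:k]

def build_prompt(question, examples):
    # 3) Load the matches as few-shot examples.
    shots = "\n".join(examples)
    return f"{shots}\nQ: {question} A:"

# 4) The finished prompt would then go to the LLM.
question = "3+3?"
prompt = build_prompt(question, nearest_examples(embed(question)))
```

No framework dependency required; each step is a function you can test and swap out on its own.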


+1 on this experience. Langchain is great at wrapping simpler tasks, but once you start to decouple components you start to run into issues.



