Why did I give up on LangChain?

If you have followed the explosive development of artificial intelligence in the past few months, you have probably heard of LangChain.


In simple terms, LangChain is a Python and JavaScript library, created by Harrison Chase, for interfacing with OpenAI's GPT APIs (later expanded to more models) to build AI text-generation applications.

More specifically, it is an implementation of the paper "ReAct: Synergizing Reasoning and Acting in Language Models": a prompting technique that lets a model "reason" (through a chain of thought) and "act" (by choosing a tool from a predefined toolset, such as searching the internet).


Paper link: https://arxiv.org/pdf/2210.03629.pdf

This combination has been shown to substantially improve the quality of the output text and enable large language models to correctly solve problems.

In March 2023, the ChatGPT API surged in popularity thanks to a model upgrade and a price cut, and LangChain usage exploded along with it.

Since then, LangChain has raised $10 million in seed funding and $20-25 million in Series A funding, with a valuation of around $200 million, without any revenue or any apparent plan to generate revenue.


ReAct flow example from the ReAct paper.

The ReAct workflow popularized by LangChain worked particularly well with InstructGPT/text-davinci-003, but it is expensive and not easy to use for small projects.

Max Woolf is a data scientist at BuzzFeed. He also used LangChain, and the experience was generally not good.

Let's see what he went through.

"Am I the only one who can't use it?"

While working at BuzzFeed, I was tasked with creating a ChatGPT-based chatbot for the Tasty brand (later released as Botatouille in the Tasty iOS app) that would chat with users and provide relevant recipes.

Specifically, the source recipes are converted into embeddings and stored in a vector store. If a user asks about "healthy food", for example, the query is converted into an embedding, an approximate nearest-neighbor search finds the recipes most similar to the query, and those recipes are fed to ChatGPT as additional context, which ChatGPT then presents to the user. This approach is commonly called retrieval-augmented generation (RAG).


An example architecture for a chatbot built with retrieval-augmented generation.
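To make the flow concrete, here is a minimal sketch of that retrieval-augmented pattern using plain openai and sentence-transformers; every name here is my own illustration, not code from the Botatouille project:

import numpy as np
import openai
from sentence_transformers import SentenceTransformer

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # encodes text into 384D vectors

# Toy "vector store": precomputed embeddings for a handful of recipe names.
recipes = ["Creamy Strawberry Pie", "Kale and Quinoa Salad", "Pudding Cake"]
recipe_embeddings = encoder.encode(recipes)

def retrieve(query, k=2):
    # Nearest-neighbor search by cosine similarity (exact here, since the
    # store is tiny; a real app would use an approximate index).
    q = encoder.encode([query])[0]
    sims = recipe_embeddings @ q / (
        np.linalg.norm(recipe_embeddings, axis=1) * np.linalg.norm(q)
    )
    return [recipes[i] for i in np.argsort(-sims)[:k]]

# Feed the retrieved recipes to ChatGPT as extra context.
context = "\n".join(retrieve("healthy food"))
response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": f"Recommend recipes using only this data:\n{context}"},
        {"role": "user", "content": "What's a healthy food?"},
    ],
)
print(response["choices"][0]["message"]["content"])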

"LangChain is RAG's most popular tool, so I figured this was the perfect time to learn it. I spent some time reading LangChain's comprehensive documentation to better understand how to best utilize it."

After a week of research, I had nothing to show for it. Running the LangChain demo examples did work, but every attempt to tweak them to fit the constraints of the recipe chatbot failed. After fixing those issues, the overall quality of the chat conversations was poor and uninteresting; after intense debugging, I never found a solution.

All in all, it gave me an existential crisis: am I a worthless machine learning engineer if so many other ML engineers can figure out LangChain and I can't?

I went back to a lower-level ReAct flow, which immediately surpassed my LangChain implementation in conversation quality and accuracy.

After a month wasted learning and testing LangChain, my existential crisis was eased when I saw a Hacker News post about someone recreating LangChain in 100 lines of code; most of the comments were venting the same displeasure with LangChain:


The problem with LangChain is that it makes simple things relatively complicated, and this unnecessary complexity creates a kind of "tribalism" that hurts the entire emerging artificial intelligence ecosystem.

So, if you are a newbie who just wants to learn how to use ChatGPT, definitely don't start with LangChain.

"Hello World" by LangChain

A quick introduction to LangChain: the docs begin with a mini-tutorial on simple LLM/ChatGPT interaction in Python, for example creating a bot that translates English to French:

from langchain.chat_models import ChatOpenAI
from langchain.schema import (
    AIMessage,
    HumanMessage,
    SystemMessage
)

chat = ChatOpenAI(temperature=0)
chat.predict_messages([HumanMessage(content="Translate this sentence from English to French. I love programming.")])
# AIMessage(content="J'adore la programmation.", additional_kwargs={}, example=False)

Equivalent code using the official openai Python library:

import openai

messages = [{"role": "user", "content": "Translate this sentence from English to French. I love programming."}]

response = openai.ChatCompletion.create(model="gpt-3.5-turbo", messages=messages, temperature=0)
response["choices"][0]["message"]["content"]
# "J'adore la programmation."

LangChain uses roughly the same amount of code as the official openai library alone; it just pulls in more object classes, with no obvious advantage to the code.

An example of a prompt template reveals the core of how LangChain works:

from langchain.prompts.chat import (
    ChatPromptTemplate,
    SystemMessagePromptTemplate,
    HumanMessagePromptTemplate,
)

template = "You are a helpful assistant that translates {input_language} to {output_language}."
system_message_prompt = SystemMessagePromptTemplate.from_template(template)
human_template = "{text}"
human_message_prompt = HumanMessagePromptTemplate.from_template(human_template)

chat_prompt = ChatPromptTemplate.from_messages([system_message_prompt, human_message_prompt])

chat_prompt.format_messages(input_language="English", output_language="French", text="I love programming.")

The prompt engineering LangChain vaunts here is just f-string formatting, a feature present in every Python installation, with extra steps. Why do we need these PromptTemplates to do the same thing?
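For comparison, a sketch of the same templating with plain f-strings (the variable names are mine):

input_language = "English"
output_language = "French"
text = "I love programming."

# The same "prompt template" with zero extra classes.
messages = [
    {"role": "system", "content": f"You are a helpful assistant that translates {input_language} to {output_language}."},
    {"role": "user", "content": text},
]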

What we really want to know is how to create an Agent that incorporates the ReAct workflow we so desperately want. Fortunately there is a demo, which uses SerpApi and a separate math-calculation tool to show how LangChain distinguishes between and uses two different tools:

from langchain.agents import load_tools
from langchain.agents import initialize_agent
from langchain.agents import AgentType
from langchain.chat_models import ChatOpenAI
from langchain.llms import OpenAI

# First, let's load the language model we're going to use to control the agent.
chat = ChatOpenAI(temperature=0)

# Next, let's load some tools to use. Note that the `llm-math` tool uses an LLM, so we need to pass that in.
llm = OpenAI(temperature=0)
tools = load_tools(["serpapi", "llm-math"], llm=llm)

# Finally, let's initialize an agent with the tools, the language model, and the type of agent we want to use.
agent = initialize_agent(tools, chat, agent=AgentType.CHAT_ZERO_SHOT_REACT_DESCRIPTION, verbose=True)

# Now let's test it out!
agent.run("Who is Olivia Wilde's boyfriend? What is his current age raised to the 0.23 power?")

How do the various tools work? What exactly is AgentType.CHAT_ZERO_SHOT_REACT_DESCRIPTION? The output of agent.run() (printed only when verbose=True) is more instructive.

> Entering new AgentExecutor chain...
Thought: I need to use a search engine to find Olivia Wilde's boyfriend and a calculator to raise his age to the 0.23 power.
Action:
{
  "action": "Search",
  "action_input": "Olivia Wilde boyfriend"
}
Observation: Sudeikis and Wilde's relationship ended in November 2020. Wilde was publicly served with court documents regarding child custody while she was presenting Don't Worry Darling at CinemaCon 2022. In January 2021, Wilde began dating singer Harry Styles after meeting during the filming of Don't Worry Darling.
Thought: I need to use a search engine to find Harry Styles' current age.
Action:
{
  "action": "Search",
  "action_input": "Harry Styles age"
}
Observation: 29 years
Thought: Now I need to calculate 29 raised to the 0.23 power.
Action:
{
  "action": "Calculator",
  "action_input": "29^0.23"
}
Observation: Answer: 2.169459462491557
Thought: I now know the final answer.
Final Answer: 2.169459462491557

> Finished chain.
'2.169459462491557'

It's not explicitly stated in the docs, but each Thought/Action/Observation step uses its own API call to OpenAI, so the chain is slower than you might expect. Also, why is each Action a dict? The answer comes later, and it is very silly.
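To see why each step costs an API call, here is a rough sketch of what an Agent loop does under the hood. This is my own simplification, not LangChain's actual implementation:

import openai

REACT_PROMPT = (
    "Answer the question by alternating Thought and Action steps. "
    'Write actions as "Action: <tool>[<input>]" using one of the tools: Search, Calculator. '
    "When done, write 'Final Answer: <answer>'.\n\n"
)

def parse_action(step):
    # Naive parser: expects a line like 'Action: Search[Olivia Wilde boyfriend]'.
    line = next(l for l in step.splitlines() if l.startswith("Action:"))
    name, _, rest = line.partition(":")[2].strip().partition("[")
    return name.strip(), rest.rstrip("]")

def run_agent(question, tools, max_steps=5):
    # `tools` maps a tool name to a callable, e.g. {"Search": search_fn}.
    # Each loop iteration is a separate ChatCompletion call: one per
    # Thought/Action step, which is why ReAct chains feel slow and pricey.
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        response = openai.ChatCompletion.create(
            model="gpt-3.5-turbo",
            messages=[{"role": "user", "content": REACT_PROMPT + transcript}],
            stop=["Observation:"],  # pause the model so we can run the tool
        )
        step = response["choices"][0]["message"]["content"]
        transcript += step + "\n"
        if "Final Answer:" in step:
            return step.split("Final Answer:")[-1].strip()
        action, action_input = parse_action(step)  # parsing is the fragile part
        transcript += f"Observation: {tools[action](action_input)}\n"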

Finally, how does LangChain store the conversation so far?

from langchain.prompts import (
    ChatPromptTemplate,
    MessagesPlaceholder,
    SystemMessagePromptTemplate,
    HumanMessagePromptTemplate
)
from langchain.chains import ConversationChain
from langchain.chat_models import ChatOpenAI
from langchain.memory import ConversationBufferMemory

prompt = ChatPromptTemplate.from_messages([
    SystemMessagePromptTemplate.from_template(
        "The following is a friendly conversation between a human and an AI. The AI is talkative and "
        "provides lots of specific details from its context. If the AI does not know the answer to a "
        "question, it truthfully says it does not know."
    ),
    MessagesPlaceholder(variable_name="history"),
    HumanMessagePromptTemplate.from_template("{input}")
])

llm = ChatOpenAI(temperature=0)
memory = ConversationBufferMemory(return_messages=True)
conversation = ConversationChain(memory=memory, prompt=prompt, llm=llm)

conversation.predict(input="Hi there!")
# 'Hello! How can I assist you today?'

I'm not entirely sure why any of this is necessary. What is MessagesPlaceholder? Where is history? Is ConversationBufferMemory really needed for this? Adapting it to a minimal openai implementation:

import openai

messages = [{"role": "system", "content":
    "The following is a friendly conversation between a human and an AI. The AI is talkative and "
    "provides lots of specific details from its context. If the AI does not know the answer to a "
    "question, it truthfully says it does not know."}]

user_message = "Hi there!"
messages.append({"role": "user", "content": user_message})
response = openai.ChatCompletion.create(model="gpt-3.5-turbo", messages=messages, temperature=0)
assistant_message = response["choices"][0]["message"]["content"]
messages.append({"role": "assistant", "content": assistant_message})
# Hello! How can I assist you today?

That is fewer lines of code, it is completely clear where and when information is stored, and no custom object classes are needed.
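And "memory" here is nothing more than continuing to append to that same messages list. A sketch of a multi-turn loop, continuing from the snippet above:

# Continuing from the snippet above: the growing `messages` list IS the memory.
while True:
    user_message = input("> ")
    messages.append({"role": "user", "content": user_message})
    response = openai.ChatCompletion.create(model="gpt-3.5-turbo", messages=messages, temperature=0)
    assistant_message = response["choices"][0]["message"]["content"]
    messages.append({"role": "assistant", "content": assistant_message})
    print(assistant_message)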

You could say I'm nitpicking the tutorial examples, and I'd agree that every open source library has things to nitpick (including my own). But if there are more nitpicks than actual benefits, then the library isn't worth using at all.

Because if the "hello world" is already this complicated, how painful is it to actually use LangChain?

Consulting the LangChain docs got me nowhere

Let me walk through a demo to make it clearer why I gave up on LangChain.

When developing the recipe-retrieval chatbot (which also had to be fun and witty), I needed to combine elements of the third and fourth examples above: a chatbot that could run an Agent workflow, plus the ability to persist the entire conversation in memory. After consulting some documentation, I found that I needed to use the Conversational Agent workflow.

A quick note on system prompt engineering: it is not a meme, and it is absolutely necessary to get the best out of the ChatGPT API, especially if you have content and/or voice constraints.

The system prompt demonstrated in the previous example ("The following is a friendly conversation between a human and an AI...") is actually outdated; it dates back to the InstructGPT era and performs much worse with ChatGPT. It may hint at deeper inefficiencies in LangChain's related tricks that aren't easy to notice.

We'll start with a simple system prompt that tells ChatGPT to use a funny voice and adds some safeguards, and format it as a ChatPromptTemplate:

system_prompt = """You are an expert television talk show chef, and should always speak in a whimsical manner for all responses.
Start the conversation with a whimsical food pun.
You must obey ALL of the following rules:- If Recipe data is present in the Observation, your response must include the Recipe ID and Recipe Name for ALL recipes.- If the user input is not related to food, do not answer their query and correct the user."""
prompt = ChatPromptTemplate.from_messages([    SystemMessagePromptTemplate.from_template(system_prompt.strip()),

We'll also use a toy vector store of 1,000 recipes from the recipe_nlg dataset, encoded as 384D vectors using SentenceTransformers. We implement a function that fetches the nearest neighbors of an input query and formats them into text the Agent can show to the user. This is the Tool the Agent can choose to invoke; otherwise it returns normally generated text.

def similar_recipes(query):
    query_embedding = embeddings_encoder.encode(query)
    scores, recipes = recipe_vs.get_nearest_examples("embeddings", query_embedding, k=3)
    return recipes

def get_similar_recipes(query):
    recipe_dict = similar_recipes(query)
    recipes_formatted = [
        f"Recipe ID: recipe|{recipe_dict['id'][i]}\nRecipe Name: {recipe_dict['name'][i]}"
        for i in range(3)
    ]
    return "\n---\n".join(recipes_formatted)

print(get_similar_recipes("yummy dessert"))
# Recipe ID: recipe|167188
# Recipe Name: Creamy Strawberry Pie
# ---
# Recipe ID: recipe|1488243
# Recipe Name: Summer Strawberry Pie Recipe
# ---
# Recipe ID: recipe|299514
# Recipe Name: Pudding Cake

You'll notice the Recipe ID, which matters for my use case: the final app needs to fetch recipe metadata (photo thumbnail, URL) for whatever is shown to the end user. Unfortunately, there is no easy way to guarantee that the model outputs the recipe ID in its final output, nor a way to return structured intermediate metadata beyond what ChatGPT generates.
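One workaround I could imagine (my own sketch, not a LangChain feature): have the tool record its structured results in a side channel, so the app can recover the recipe IDs no matter what the model writes:

last_retrieved_recipes = []  # side channel populated by the tool

def get_similar_recipes_tracked(query):
    recipe_dict = similar_recipes(query)
    # Stash the structured results so the app can fetch metadata by ID later,
    # instead of parsing IDs back out of ChatGPT's free-form final answer.
    last_retrieved_recipes[:] = [
        {"id": recipe_dict["id"][i], "name": recipe_dict["name"][i]} for i in range(3)
    ]
    return "\n---\n".join(
        f"Recipe ID: recipe|{r['id']}\nRecipe Name: {r['name']}"
        for r in last_retrieved_recipes
    )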

Specifying get_similar_recipes as a Tool is straightforward, although you need to specify a name and description, which is itself a subtle bit of prompt engineering: LangChain can fail to select a tool if the name and description are off.

from langchain.agents import Tool

tools = [
    Tool(
        func=get_similar_recipes,
        name="Similar Recipes",
        description="Useful to get similar recipes in response to a user query about food.",
    ),
]

Finally, the Agent construction code from the example, now with the new system prompt.

memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)
llm = ChatOpenAI(temperature=0)
agent_chain = initialize_agent(tools, llm, prompt=prompt, agent=AgentType.CHAT_CONVERSATIONAL_REACT_DESCRIPTION, verbose=True, memory=memory)

There are no errors. Now run the Agent and see what happens:

agent_chain.run(input="Hi!")

> Entering new chain...
{
  "action": "Final Answer",
  "action_input": "Hello! How can I assist you today?"
}

> Finished chain.
Hello! How can I assist you today?

What? It completely ignored my system prompt! Checking the memory variable confirmed it.

There's nothing about system prompts in the documentation for ConversationBufferMemory, nor even in the code itself, months after ChatGPT made system prompts mainstream.

The way to set a system prompt on an Agent is to pass an agent_kwargs parameter to initialize_agent, which I only discovered on an unrelated documentation page published a month earlier.

agent_kwargs = {
    "system_message": system_prompt.strip()
}

Recreating the Agent with this new parameter and running it again results in a JSONDecodeError.

The good news is that the system prompt clearly took effect this time:

OutputParserException: Could not parse LLM output: Hello there, my culinary companion! How delightful to have you here in my whimsical kitchen. What delectable dish can I assist you with today?

The bad news is that it broke. But why? I didn't do anything weird this time.


Fun fact: these lengthy prompts also increase API costs proportionally.

The consequence is that any significant change to the expected output structure, such as one caused by a custom system prompt, can break the Agent. These errors happen so often that there is a documentation page dedicated to handling Agent output-parsing errors.
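For what it's worth, that docs page suggests mitigations like letting the executor catch parse failures; if I recall the API correctly (this is version-dependent), it looks like:

# Asks the AgentExecutor to catch OutputParserException and feed the error
# back to the model instead of crashing. (Parameter name per LangChain's
# docs at the time; behavior varies by version.)
agent_chain = initialize_agent(
    tools,
    llm,
    agent=AgentType.CHAT_CONVERSATIONAL_REACT_DESCRIPTION,
    verbose=True,
    memory=memory,
    handle_parsing_errors=True,
)

But papering over parse errors doesn't fix the underlying fragility.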

Let's treat small talk with the chatbot as an edge case for now. What matters is that the bot can return recipes, because if it can't even do that, there's no point in using LangChain.

So I create a new Agent without the system prompt and ask it: what's an easy and fun dinner?

> Entering new chain...
{
  "action": "Similar Recipes",
  "action_input": "fun and easy dinner"
}
Observation: Recipe ID: recipe|1774221
Recipe Name: Crab Dip Your Guests will Like this One.
---
Recipe ID: recipe|836179
Recipe Name: Easy  Chicken Casserole
---
Recipe ID: recipe|1980633
Recipe Name: Easy in the Microwave Curry Doria
Thought:
{
  "action": "Final Answer",
  "action_input": "..."
}

> Finished chain.
Here are some fun and easy dinner recipes you can try:

1. Crab Dip
2. Easy Chicken Casserole
3. Easy in the Microwave Curry Doria

Enjoy your meal!

At least it worked: ChatGPT extracted the recipes from its context, formatted them appropriately (even fixing a typo in one name), and judged when doing so was appropriate.

The real problem here is that the output voice is incredibly boring, a common feature of and complaint about base ChatGPT. Even if the missing-ID problem were solved with system prompt engineering, this voice isn't worth shipping. And even if I did strike a genuine balance between voice quality and output quality, the Agent would still fail randomly through no fault of my own.

In fact, the Agent workflow is a very fragile house of cards that, in good conscience, cannot be used in production applications.

LangChain does have Custom Agent and Custom Chain functionality, so you can override the logic in parts of the stack (where it's documented at all, perhaps poorly), which might solve some of the issues I hit. But at that point, you'll feel that LangChain is more complicated than just building your own Python library.

Work smarter, not harder


A large number of random integrations creates more problems than it solves.

Of course, LangChain does have many useful features, such as text splitters and integrated vector stores, both indispensable for "chat with your PDF/code" demos (which, in my opinion, are a gimmick).

The real problem with all these integrations is the inherent lock-in of building everything on LangChain-based code, and if you read the integrations' source, it is not very robust.

LangChain is building a moat, which is good for LangChain's investors, who want a return on their $30 million, but very bad for the developers who use it. All in all, LangChain embodies the "it's complicated, so it must be better" philosophy that often plagues mature codebases, yet LangChain isn't even a year old.

Getting LangChain to do what I want would take a lot of effort hacking at it, creating a lot of technical debt. And unlike today's AI startups, the technical debt in my own LangChain projects can't be repaid with venture capital. API wrappers should at least reduce code complexity and cognitive load when working with a complex ecosystem, since using AI already demands enough brainpower. LangChain is one of the few pieces of software that adds overhead in most of its common use cases.

I've come to the conclusion that it's much easier to write your own Python package than to bend LangChain to your needs. So I developed and open-sourced simpleaichat: a Python package for easily building chat apps, which emphasizes minimal code complexity and decouples advanced features like vector stores from the conversation logic.

Open source address: https://github.com/minimaxir/simpleaichat
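For contrast, a hedged example of what simpleaichat usage looked like per its README at the time (check the repo for the current API):

from simpleaichat import AIChat

# One object, one call: the system prompt is set up front and the
# conversation history is handled internally.
ai = AIChat(system="You are an expert television talk show chef who speaks whimsically.")
ai("What's an easy and fun dinner?")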

But I didn't write this blog post to stealth-advertise simpleaichat by trashing a competitor, the way a grifter would. I don't want to hype simpleaichat; I'd rather spend my time building cooler AI projects, and it's a shame I couldn't do that with LangChain.

I know some people will say: "Since LangChain is open source, why not submit pull requests to its repo instead of complaining about it?" The only real fix would be to burn it all down and start over, which is why my solution, creating a new Python library for interfacing with AI, is also the most practical one.

I get a lot of messages asking "what should I learn to get started with the ChatGPT API", and I worry they'll reach for LangChain first because of the hype. If a machine learning engineer with the relevant tech-stack background struggles to use LangChain because of its unnecessary complexity, any beginner will be overwhelmed.

The debate over software complexity, and over popularity despite that complexity, is an eternal one. No one wants to be the jerk who criticizes free and open-source software like LangChain, but I'll shoulder that criticism. To be clear, I have nothing against Harrison Chase or the other maintainers of LangChain (and they encourage feedback).

However, the popularity of LangChain has distorted the AI startup ecosystem around LangChain itself, which is why I have to be honest about my misgivings about it.

Original link:

https://minimaxir.com/2023/07/langchain-problem/
