LangChain and Large Language Models (LLMs) Application Basic Tutorial: Memory Components

 

If you haven't read my previous posts, please read them first; they will help you understand this article:

LangChain and Large Language Model (LLMs) Application Basic Tutorial: Prompt Template

LangChain and Large Language Models (LLMs) Application Basic Tutorial: Information Extraction

LangChain and Large Language Models (LLMs) Application Basic Tutorial: Role Definition 

Both Chains and LLMs are stateless by default, which means they process each incoming prompt independently (as do the underlying LLM and chat models), so they have no ability to remember context. In many application scenarios, however, we need the LLM to remember the context of the conversation, which makes our bot appear "smarter" and gives users a better experience. Today we will introduce several memory components commonly used in LangChain.

First we need to install the following python packages:

pip -q install openai langchain huggingface_hub transformers

1. ConversationBufferMemory

This is the simplest memory component, and its function is to directly record the chat content between the user and the robot in memory. Let's look at an example:

from langchain.chains.conversation.memory import ConversationBufferMemory
from langchain import OpenAI
from langchain.chains import ConversationChain
import os

# Your OpenAI API key
os.environ['OPENAI_API_KEY'] = 'xxxxxx'

# Define the LLM
llm = OpenAI(model_name='text-davinci-003', 
             temperature=0, 
             max_tokens=256)

# Define the memory component
memory = ConversationBufferMemory()

# Define the chain
conversation = ConversationChain(
    llm=llm, 
    verbose=True, 
    memory=memory
)

Here we first define an OpenAI language model llm, a memory component ConversationBufferMemory, and a chain. The Chain is the core component in LangChain; an LLM must be combined with a Chain to work properly. Next, let's start chatting with the OpenAI language model:
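Before chatting, it can be useful to peek at the default prompt that ConversationChain wraps around our input (a small inspection snippet, assuming the conversation object defined above):

# Print the chain's built-in prompt template
print(conversation.prompt.template)

This prints the built-in prefix followed by the {history} and {input} placeholders that the memory component fills in on each call.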

print(conversation.predict(input='Hello, I am Wang Laoliu'))

Here we see that two parts of information are stored in memory: first a prefix (the instructions beginning "The following is a friendly conversation..."), and then the chat content between the bot and the user. The prefix helps the bot clearly define its own role, preventing it from answering the user's questions however it pleases.

print(conversation.predict(input='What is your name?'))

print(conversation.predict(input="那就叫你大聪明吧,怎么样?"))

 

print(conversation.predict(input="你还记得我叫什么名字吗?"))

 

Here we notice that the content of our multiple rounds of dialogue with the bot is automatically saved in memory. In this way the bot has a memory: it can remember what I said before.
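You can also inspect what the memory holds at any time (a quick check, assuming the memory object defined earlier; both calls below are part of the standard memory interface):

# The raw transcript kept by ConversationBufferMemory
print(memory.buffer)
# The dict of variables the memory injects into the prompt, e.g. {'history': '...'}
print(memory.load_memory_variables({}))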

2. ConversationBufferWindowMemory

The difference between ConversationBufferWindowMemory and the previous ConversationBufferMemory is that it adds a window parameter k, which specifies how many recent rounds of conversation to keep:

from langchain.chains.conversation.memory import ConversationBufferWindowMemory

# Define the OpenAI language model
llm = OpenAI(model_name='text-davinci-003', 
             temperature=0, 
             max_tokens=256)

# Define the memory component; k=2 means only the last two rounds of conversation are kept
window_memory = ConversationBufferWindowMemory(k=2)

# Define the chain
conversation = ConversationChain(
    llm=llm, 
    verbose=True, 
    memory=window_memory
)

Here, the parameter k=2 of ConversationBufferWindowMemory means that only the latest 2 rounds of conversation are stored in memory; earlier conversations are discarded.
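To see the window in action without calling the model at all, we can feed a memory instance a few rounds by hand (a standalone sketch using the standard save_context/load_memory_variables interface):

demo_memory = ConversationBufferWindowMemory(k=2)
demo_memory.save_context({"input": "round 1"}, {"output": "reply 1"})
demo_memory.save_context({"input": "round 2"}, {"output": "reply 2"})
demo_memory.save_context({"input": "round 3"}, {"output": "reply 3"})
# Only the most recent k=2 rounds survive; "round 1" has been dropped
print(demo_memory.load_memory_variables({}))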

print(conversation.predict(input='Hello, I am Wang Laoliu'))

print(conversation.predict(input='What is your name?'))

 

print(conversation.predict(input="Then I'll call you Da Congming, how about that?"))

 

print(conversation.predict(input='Do you still remember my name?'))

 

Here we observe that only two rounds of Human-AI dialogue plus the user's current question are kept in memory. The earlier Human-AI dialogue has been discarded, so in the end the bot does not remember my name.

3. ConversationSummaryMemory

Unlike ConversationBufferMemory, ConversationSummaryMemory does not store all of the previous conversation between the user and the bot in memory; it only stores a summary of the chat content. The purpose is to reduce memory overhead and the number of tokens: language models like OpenAI's charge by token count, so this can also save money.

from langchain.chains.conversation.memory import ConversationSummaryMemory

# Define the OpenAI language model
llm = OpenAI(model_name='text-davinci-003', 
             temperature=0, 
             max_tokens=256)

# Define the memory component
summary_memory = ConversationSummaryMemory(llm=OpenAI())

# Define the chain
conversation = ConversationChain(
    llm=llm, 
    verbose=True, 
    memory=summary_memory
)

Here we replace the memory component with ConversationSummaryMemory, which generates a summary of the chat content in memory.
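After a few turns you can print the running summary directly (assuming the summary_memory object above; ConversationSummaryMemory keeps the current summary in its buffer attribute):

# The model-generated summary of the conversation so far
print(summary_memory.buffer)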

print(conversation.predict(input='Hello, I am Wang Laoliu'))

 

print(conversation.predict(input='What is your name?'))

Here we see that the prefix information, the summary of the previous rounds of conversation, and the user's current question are kept in memory.

print(conversation.predict(input="Then I'll call you Da Congming, how about that?"))

 

print(conversation.predict(input='Do you still remember my name?'))

4. ConversationSummaryBufferMemory

ConversationSummaryBufferMemory combines the working methods of ConversationBufferWindowMemory and ConversationSummaryMemory: it keeps a summary of part of the chat history in memory alongside a buffer of recent rounds. How much raw chat content stays in the buffer depends on the parameter max_token_limit. When max_token_limit is small, most of the earlier chat content is converted into a summary and only a little raw chat content is kept; when it is large, only a little content is summarized and most of the raw chat content is preserved.

from langchain.chains.conversation.memory import ConversationSummaryBufferMemory
from langchain import OpenAI
from langchain.chains import ConversationChain
import os

os.environ['OPENAI_API_KEY'] = 'xxxxx'


llm = OpenAI(model_name='text-davinci-003', 
             temperature=0, 
             max_tokens = 256)

memory = ConversationSummaryBufferMemory(llm=OpenAI(), max_token_limit=256) 

conversation = ConversationChain(
    llm=llm, 
    memory=memory, 
    verbose=True
)

Here we set max_token_limit to 256, which means that once the buffered chat content exceeds 256 tokens, the earlier chat content is converted into a summary, and only the most recent chat content within the 256-token budget is kept verbatim.
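At any point you can check exactly what this memory will inject into the prompt (a quick check via the standard memory interface; the returned history contains the summary so far, if any, followed by the retained recent turns):

print(memory.load_memory_variables({}))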

print(conversation.predict(input='Hello, I am Wang Laoliu'))

print(conversation.predict(input='What is your name?'))

 

print(conversation.predict(input="Let me give you a name: Da Congming. How about that?"))

print(conversation.predict(input='Do you still remember my name?'))

print(conversation.predict(input='Do you still remember the name I gave you?'))

print(conversation.predict(input="I'm sick: my stomach hurts, I'm vomiting, and I have a fever. Should I just take some medicine to get by, or is it better to go to the hospital?"))

From the above multi-round chat we can see that as the number of rounds increases, the amount of summary information in memory also grows, while the raw multi-round chat content in memory is limited to 256 tokens. When new chat content enters the memory, the earlier chat content is converted into a summary and removed from the multi-round chat history in memory.

5. ConversationKGMemory

We know that the biggest problem with LLMs like ChatGPT is that they "hallucinate": when the LLM does not know the correct answer, it often improvises and gives a completely incorrect answer. To avoid hallucinations, LangChain provides the ConversationKGMemory component, the "Conversation Knowledge Graph Memory". It extracts knowledge-graph information, i.e., the core key facts, from the conversation with the user, which is even more concise than the information saved by the earlier ConversationSummaryMemory component. The LLM then answers the user's questions strictly based on the relevant content of the knowledge graph, avoiding hallucinations.

from langchain import OpenAI
from langchain.prompts.prompt import PromptTemplate
from langchain.chains import ConversationChain
from langchain.chains.conversation.memory import ConversationKGMemory

llm = OpenAI(temperature=0)


template = """下面是一段人与AI的友好对话。 人工智能很健谈,并根据其上下文提供了许多具体细节。
如果 AI 不知道问题的答案,它会如实说它不知道。 AI 仅使用“相关信息”部分中包含的信息,不会产生幻觉。

相关信息:

{history}

对话内容:
Human: {input}
AI:"""
prompt = PromptTemplate(
    input_variables=["history", "input"], template=template
)

conversation_with_kg = ConversationChain(
    llm=llm, 
    verbose=True, 
    prompt=prompt,
    memory=ConversationKGMemory(llm=llm)
)

Here we have created a prompt template with three parts: prefix information, relevant information, and conversation content. The prefix has been introduced before and will not be explained again here. The relevant information is the knowledge-graph information extracted from the previous rounds of dialogue, i.e., the distilled core key facts, and the conversation content contains only the user's current question.

conversation_with_kg.predict(input="你好")

conversation_with_kg.predict(input="我叫大神,我有一位朋友叫小明,他是一位宠物医院的医生。")

 

conversation_with_kg.predict(input="小明是做什么的?")

 

conversation_with_kg.predict(input="小明的爸爸也是一位宠物医生")

conversation_with_kg.predict(input="不过你小明的妈妈是位老师")

conversation_with_kg.predict(input="小明的哥哥是个老板")

conversation_with_kg.predict(input="小明的妈妈是做什么的?")

6. Entity Memory

A basic capability in natural language processing (NLP) is Named Entity Recognition (NER), also known as proper-name recognition: identifying entities with specific meaning in text, mainly names of people, places, and institutions, proper nouns, and expressions such as times, quantities, currencies, and percentages. The Entity Memory component provided by LangChain automatically extracts entity information during the interaction between the AI and the human and stores it in memory as a dictionary. Correctly identifying entities helps the AI answer entity-related questions more accurately. Let's look at an example:

from langchain import OpenAI, ConversationChain
from langchain.chains.conversation.memory import ConversationEntityMemory
from langchain.chains.conversation.prompt import ENTITY_MEMORY_CONVERSATION_TEMPLATE
from pydantic import BaseModel
from typing import List, Dict, Any

Here we will use the ENTITY_MEMORY_CONVERSATION_TEMPLATE, a ready-made prompt template for entity memory. Let's take a look at its main content:

## The prompt
print(ENTITY_MEMORY_CONVERSATION_TEMPLATE.template)

The template is basically composed of three parts: prefix information, context, and the current chat input. The "Context" section records entity information extracted from the chat history.
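Before chatting we need to construct the chain with this template; a minimal sketch following the standard LangChain entity-memory example:

llm = OpenAI(temperature=0)

conversation = ConversationChain(
    llm=llm,
    verbose=True,
    prompt=ENTITY_MEMORY_CONVERSATION_TEMPLATE,
    memory=ConversationEntityMemory(llm=llm)
)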

conversation.predict(input="你好,我家的小狗生病了怎么办?")

 

conversation.predict(input="我突然想起来我有一个朋友叫小明,他是一位宠物医生。")

conversation.predict(input="可我听说他只会给小猫看病,不会给小狗看病,怎么办?")

conversation.predict(input="我还是去找小王吧,他是一个更专业的兽医。")

 

conversation.predict(input="请问小明会给小狗看病吗?")

Through the above simple rounds of dialogue, we saw that the entity information in the chat history was extracted, and the AI was able to accurately answer my questions based on that entity information. Let's take a look at the complete entity information:

from pprint import pprint
pprint(conversation.memory.entity_store.store)

Summary

Today we learned about six memory components provided by LangChain:

  • ConversationBufferMemory
  • ConversationBufferWindowMemory
  • ConversationSummaryMemory
  • ConversationSummaryBufferMemory
  • ConversationKGMemory
  • Entity Memory

Each has its own functions and characteristics, and we can choose different memory components for different application scenarios. When developing an application that interacts with an AI, choosing the right memory component can greatly improve the AI's effectiveness, making its answers to human questions more accurate and natural, without hallucinations.

References

LangChain Official Documentation

Origin blog.csdn.net/weixin_42608414/article/details/130152780