Hello everyone, I am Student Ambassador Jambo. In the previous series, we introduced how to use the Azure OpenAI API. If you followed along, you probably noticed that simply calling the API is easy; the tedious part is combining the API with your application. Next, I will introduce a library called LangChain, which helps you integrate Azure OpenAI into your application more easily.
I will also make this a series, with the ultimate goal of building a chatbot that can answer questions based on a database.
Why use LangChain
Many developers want to incorporate large language models like GPT into their applications, and these applications rarely just pass user input to GPT and return GPT's output to the user.
Such applications may need to answer questions based on specific data sources, so they must consider how the data is stored and retrieved. They may need to tidy up the user's input, keep a record of previous messages, and extract the key points. If you want the model to output text in a specific format, you need to describe that format in detail in the prompt, and perhaps even provide examples. These prompts are usually managed behind the scenes of the application, and users often don't notice they exist. In some complex applications, a single request may require multiple actions. For example, AutoGPT, which claims to complete assigned tasks automatically, actually generates the required action from the goal and the author's prompt, outputs it in JSON format, and then has the program execute the corresponding action.
LangChain has packaged up most of these commonly needed functions, so you only need to plan the program logic and call them. Moreover, LangChain's functions are independent of the specific model API being used: you don't have to write different code for different language models, you just switch the API.
Basic usage
Before using LangChain, I recommend understanding how the Azure OpenAI API is called directly; otherwise, even with LangChain, the parameters and usage may be hard to follow. For details, see my previous tutorial series: Calling the Azure OpenAI API with Python.
LangChain calls a language model that continues (completes) text an llm, and a language model with a chat interface (whose input is a chat record) a chat model. Below, we will again use the Azure OpenAI API as the example.
Install
Because LangChain uses the SDK provided by OpenAI when calling OpenAI's API, we also need to install openai.
pip install langchain
pip install openai
Generate text
Instantiate the model object
Before using the API, we first need to set environment variables. If you are using OpenAI's native interface, you only need to set api_key; if you are using the Azure OpenAI API, you also need to set api_version and api_base. The specific values are the same as when calling the Azure API with the openai library; you can refer to my previous tutorial: Calling the Azure OpenAI API with Python.
import os
os.environ["OPENAI_API_KEY"] = ""
os.environ["OPENAI_API_VERSION"] = ""
os.environ["OPENAI_API_BASE"] = ""
Of course, these values can also be set with the export command in the terminal (under Linux), or written in a .env file and then loaded into the environment variables with the python-dotenv library.
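Under the hood, python-dotenv's load_dotenv simply reads KEY=VALUE lines into os.environ. As a rough sketch of that behavior using only the standard library (the real library additionally handles quoting, comments inside values, and variable interpolation):

```python
import os

def load_env_file(path: str) -> None:
    """Minimal stand-in for python-dotenv's load_dotenv():
    read KEY=VALUE lines from a file into os.environ."""
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#"):
                continue  # skip blank lines and comments
            key, _, value = line.partition("=")
            os.environ[key.strip()] = value.strip()

# A .env file for this tutorial would contain lines like:
# OPENAI_API_KEY=<your key>
# OPENAI_API_VERSION=<api version>
# OPENAI_API_BASE=<your endpoint>
```

In practice, just use python-dotenv's load_dotenv(); this sketch only shows what it does.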
LangChain's large language model (llm) classes are encapsulated under llms, so we need to import the AzureOpenAI class from there and set the related parameters. The parameter that specifies the model name is deployment_name; the remaining parameters are the OpenAI API parameters. In fact, the API information set in the environment variables above can also be passed in as parameters here, but for convenience and security it is still recommended to keep it in the environment variables.
Note that the prompt and stop parameters are not passed here (stop can be, but triggers a warning); they are passed later, when the text is generated.
from langchain.llms import AzureOpenAI
llm = AzureOpenAI(
    deployment_name="text-davinci-003",
    temperature=0.9,
    max_tokens=265,
)
Also, if you are using the native OpenAI API, the class to import is OpenAI, and the parameter that specifies the model is model_name, for example:
from langchain.llms import OpenAI
llm = OpenAI(model_name="text-davinci-003")
Serialized LLM configuration
If you need different llm configurations for multiple scenarios, hard-coding them is neither simple nor flexible. In that case, it is obviously more convenient to save the llm configuration in a file.
from langchain.llms import OpenAI
from langchain.llms.loading import load_llm
LangChain supports reading and saving llm configurations in json or yaml format. Suppose I have an llm.json file with the following contents:
{
    "model_name": "text-davinci-003",
    "temperature": 0.7,
    "max_tokens": 256,
    "top_p": 1.0,
    "frequency_penalty": 0.0,
    "presence_penalty": 0.0,
    "n": 1,
    "best_of": 1,
    "request_timeout": null,
    "_type": "openai"
}
Then we can use the load_llm function to turn it into an llm object; the specific language model used is defined by the _type field.
llm = load_llm("llm.json")
# llm = load_llm("llm.yaml")
Of course, you can also export configuration files from llm objects.
llm.save("llm.json")
llm.save("llm.yaml")
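To see why the _type field matters, here is a hypothetical sketch of how a loader like load_llm can dispatch on it; the registry and the stand-in constructors are invented for illustration (the real loader supports many more types and validates the remaining fields):

```python
import json

# stand-in constructors; in LangChain these would be OpenAI(**kwargs) etc.
def make_openai(**kwargs):
    return ("OpenAI", kwargs)

def make_azure_openai(**kwargs):
    return ("AzureOpenAI", kwargs)

# hypothetical registry mapping the _type field to a constructor
TYPE_REGISTRY = {"openai": make_openai, "azure": make_azure_openai}

def load_llm_sketch(path: str):
    """Read a JSON config, pop _type, and build the matching model object."""
    with open(path) as f:
        config = json.load(f)
    llm_type = config.pop("_type")  # e.g. "openai"
    return TYPE_REGISTRY[llm_type](**config)
```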
Generate text from text
Next, we use the model object instantiated above to generate text. LangChain's llm class has three ways to generate text from a string: the predict() method, the generate() method, and calling the object directly like a function (__call__).
That may seem like many options, but in fact they all come down to generate(). Specifically, predict() calls __call__ after a simple check, and __call__ calls generate() after a simple check. The biggest difference between generate() and the other two methods lies in the input and return types: generate() takes a list of prompts and returns an LLMResult object, while the other two take the prompt string itself and return the generated text string. This means generate() can generate text for multiple prompts independently, all in one call.
prompt = "1 + 1 = "
stop = ["\n"]
# The three generation methods below are equivalent
res1 = llm(prompt, stop=stop)
res2 = llm.predict(prompt, stop=stop)
res3 = llm.generate([prompt], stop=stop).generations[0][0].text
If you just want to simply continue (generate) text from text, I recommend the predict() method, as it is the most convenient and intuitive.
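The layering described above, predict() wrapping __call__ wrapping generate(), can be illustrated with a toy stand-in class (EchoLLM and its canned output are invented for the demonstration; real LangChain llms do more checking at each layer):

```python
class EchoLLM:
    """Toy model showing how the three entry points stack:
    generate() does the work, the other two are thin wrappers."""

    def generate(self, prompts, stop=None):
        # takes a LIST of prompts; returns one list of generations per prompt
        return [[{"text": f"echo: {p}"}] for p in prompts]

    def __call__(self, prompt, stop=None):
        # single prompt string in, single text string out, via generate()
        return self.generate([prompt], stop=stop)[0][0]["text"]

    def predict(self, prompt, stop=None):
        # convenience wrapper over __call__
        return self(prompt, stop=stop)

llm = EchoLLM()
assert llm.predict("hi") == llm("hi") == llm.generate(["hi"])[0][0]["text"]
```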
Chat model
Instantiate the model object
As with the completion model above, we first need to set the environment variables.
import os
os.environ["OPENAI_API_KEY"] = ""
os.environ["OPENAI_API_VERSION"] = ""
os.environ["OPENAI_API_BASE"] = ""
LangChain's chat models are packaged under langchain.chat_models. Here we again use Azure OpenAI for the demonstration, importing the AzureChatOpenAI class.
If you have read my previous tutorial on calling the API directly, you will know that the prompt we give a chat model is no longer plain text but a message record: the contents of the conversation between the user and the model, turn by turn. LangChain wraps these messages as AIMessage, HumanMessage, and SystemMessage, corresponding respectively to assistant, user, and system in the original API.
from langchain.chat_models import AzureChatOpenAI
from langchain.schema import (
    AIMessage,
    HumanMessage,
    SystemMessage
)
chat = AzureChatOpenAI(deployment_name="gpt-35-turbo", temperature=0)
We first build an initial message record; the SystemMessage is, of course, optional. The chat model generates a message when you call the object directly, and it returns an AIMessage object.
messages = [
    SystemMessage(content="你是一名翻译员,将中文翻译成英文"),  # "You are a translator; translate Chinese into English"
    HumanMessage(content="你好世界")  # "Hello world"
]
chat(messages)
AIMessage(content='Hello world.', additional_kwargs={}, example=False)
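The correspondence between the message classes and the raw API roles can be sketched with a small converter; the stand-in message classes and the helper below are illustrative (LangChain performs this conversion internally):

```python
from dataclasses import dataclass

# stand-ins for the classes in langchain.schema
@dataclass
class SystemMessage:
    content: str

@dataclass
class HumanMessage:
    content: str

@dataclass
class AIMessage:
    content: str

# role names expected by the underlying chat completion API
ROLE_BY_TYPE = {SystemMessage: "system", HumanMessage: "user", AIMessage: "assistant"}

def to_api_messages(messages):
    """Convert message objects into the dict format the raw API expects."""
    return [{"role": ROLE_BY_TYPE[type(m)], "content": m.content} for m in messages]

msgs = [SystemMessage("You are a translator"), HumanMessage("你好世界")]
to_api_messages(msgs)
# -> [{'role': 'system', 'content': 'You are a translator'},
#     {'role': 'user', 'content': '你好世界'}]
```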
As before, the generate() method also supports generating messages for multiple chat records, but the return value here is an LLMResult object.
chat.generate([messages, messages])
LLMResult
As mentioned above, the generate() method returns an LLMResult object, which consists of three parts: generations, which stores the generated content and related information; llm_output, which stores the token usage and the model used; and run, which stores a unique run_id for use by callbacks during generation. Usually we only need to pay attention to generations and llm_output.
To show what LLMResult contains, here we recreate an llm object and set n=2, which means the model will generate two results for each prompt (the default is n=1).
llm = AzureOpenAI(deployment_name="text-davinci-003", temperature=0, n=2)
llm_result = llm.generate([f"{i}**2 =" for i in range(1, 11)], stop="\n")
print(len(llm_result.generations))
# -> 10
print(len(llm_result.generations[0]))
# -> 2
Since LLMResult inherits from Pydantic's BaseModel, it can be formatted as JSON with json():
print(llm_result.json())
{
  "generations": [
    [
      {
        "text": " 1",
        "generation_info": {
          "finish_reason": "stop",
          "logprobs": null
        }
      },
      {
        "text": " 1",
        "generation_info": {
          "finish_reason": "stop",
          "logprobs": null
        }
      }
    ],
    ...
  ],
  "llm_output": {
    "token_usage": {
      "prompt_tokens": 40,
      "total_tokens": 58,
      "completion_tokens": 18
    },
    "model_name": "text-davinci-003"
  },
  "run": {
    "run_id": "cf7fefb2-2e44-474d-918f-b8695a514646"
  }
}
As you can see, generations is a two-dimensional array: each element of the first dimension holds the results for the corresponding prompt, and each element of the second dimension is one generation for that prompt. Because we set n=2, each prompt produces two results.
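Indexing follows directly from that shape: generations[i][j] is the j-th candidate generated for the i-th prompt. A quick illustration with a mocked-up generations structure (the values are invented):

```python
# mocked-up generations: 2 prompts, n=2 candidates each
generations = [
    [{"text": " 1"}, {"text": " 1"}],  # candidates for the first prompt
    [{"text": " 4"}, {"text": " 4"}],  # candidates for the second prompt
]

# take the first candidate for every prompt
best = [per_prompt[0]["text"] for per_prompt in generations]
# -> [' 1', ' 4']
```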
For a completion model, each generated result is a dictionary, with the generated content in the text field. For a chat model, the result is wrapped in a ChatGeneration object, where the text field is the generated text and the message field is the AIMessage object, for example:
LLMResult(
    generations=[[
        ChatGeneration(
            text='Hello world.',
            generation_info=None,
            message=AIMessage(
                content='Hello world.',
                additional_kwargs={},
                example=False)
        )
    ]],
    llm_output={
        'token_usage': {
            'completion_tokens': 6, 'prompt_tokens': 80, 'total_tokens': 86
        },
        'model_name': 'gpt-3.5-turbo'
    },
    run=RunInfo(run_id=UUID('fffa5a38-c738-4eef-bdc4-0071511d1422'))
)
Prompt template
In many cases, we do not pass the user's input directly to the model; we may need to supplement it with contextual information, and this supplementary information is the "template". Here is a simple example where the prompt contains one input variable, product:
template = """
我希望你担任顾问,帮忙为公司想名字。
这个公司生产{product},有什么好名字?
"""
We can wrap this prompt with input variables into a template using PromptTemplate.
from langchain import PromptTemplate
prompt_template = PromptTemplate(
    input_variables=["product"],
    template=template,
)
prompt_template.format(product="运动衬衫")
# -> 我希望你担任顾问,帮忙为公司想名字。
# -> 这个公司生产运动衬衫,有什么好名字?
Of course, a prompt without input variables can also be wrapped with PromptTemplate; just pass an empty list for the input_variables parameter.
If you don't want to specify input_variables manually, you can also use the from_template() method to deduce them automatically.
prompt_template = PromptTemplate.from_template(template)
prompt_template.input_variables
# -> ['product']
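The deduction performed by from_template() amounts to scanning the template for {} placeholders. The standard library's string.Formatter can do the same scan, as this sketch shows (the template here is an invented English example):

```python
from string import Formatter

def infer_input_variables(template: str) -> list:
    """Collect the named {placeholders} in a format string,
    mimicking what PromptTemplate.from_template deduces."""
    return sorted({field for _, field, _, _ in Formatter().parse(template) if field})

template = "Act as a consultant: suggest a {style} name for a company that makes {product}."
infer_input_variables(template)
# -> ['product', 'style']
```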
You can also save the template to a local file. Currently, LangChain only supports the json and yaml formats, and it determines the format automatically from the file extension.
prompt_template.save("awesome_prompt.json")  # save as json
Templates can also be read back from a file:
from langchain.prompts import load_prompt
prompt_template = load_prompt("prompt.json")
Chain
Chain is a very important concept in LangChain (after all, it's in the name). It is similar to a pipeline: it assembles multiple operations into a single unit, making the code more concise and easier to reuse.
For example, a complete task cycle may require generating a prompt, handing the prompt to the llm to generate text, and then further processing the generated text. Going a step further, we might also need to log each stage of the task or update other data. Writing all of these operations out makes the code long and hard to reuse, but with a Chain all of these tasks can be packaged together, and the program logic becomes much clearer. You can also combine multiple Chains into a more complex Chain.
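Stripped of everything LangChain adds, the core idea of a Chain can be sketched as ordinary function composition: fill the prompt, call the model, post-process, all behind one callable. Everything below, including the fake model, is invented for illustration:

```python
def make_chain(template, llm, postprocess):
    """Bundle prompt-filling, the model call, and post-processing
    into a single reusable function."""
    def run(**inputs):
        prompt = template.format(**inputs)  # stage 1: build the prompt
        raw = llm(prompt)                   # stage 2: call the model
        return postprocess(raw)             # stage 3: clean up the output
    return run

# fake llm: pretends the model answers with the prompt upper-cased
fake_llm = lambda prompt: prompt.upper()
chain = make_chain("name a company that makes {product}", fake_llm, str.strip)
chain(product="sport shirts")
# -> 'NAME A COMPANY THAT MAKES SPORT SHIRTS'
```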
We first create an llm object and a prompt template.
from langchain.chat_models import AzureChatOpenAI
from langchain import PromptTemplate
chat = AzureChatOpenAI(deployment_name="gpt-35-turbo", temperature=0)
prompt = PromptTemplate(
    input_variables=["input"],
    template="""
将给定的字符串进行大小写转换。
例如:
输入: ABCdef
输出: abcDEF
输入: AbcDeF
输出: aBCdEf
输入: {input}
输出:
""",
)
Next, we can use LLMChain to combine the model and the prompt into a Chain. This Chain accepts user input, fills it into the prompt, and passes the completed prompt to the model to generate the result. LLMChain works with both llms and chat models; if you want a multi-turn conversation with a chat model, LangChain also provides ConversationChain.
from langchain.chains import LLMChain
chain = LLMChain(llm=chat, prompt=prompt)
print(chain.run("HeLLo"))
# -> hEllO
If there are multiple input variables in the prompt, a dictionary can be used to pass them all at once.
print(chain.run({"input": "HeLLo"}))
# -> hEllO
Debug mode
The examples above are simple and involve only a few input variables, but in real use there may be many input variables, and the llm's output is not fixed, which makes it difficult to work backwards from the final result to the problem. To address this, LangChain provides a verbose mode, which prints the input and output of each stage so you can locate the problem easily.
chain_verbose = LLMChain(llm=chat, prompt=prompt, verbose=True)
print(chain_verbose.run({"input": "HeLLo"}))
> Entering new chain...
Prompt after formatting:
将给定的字符串进行大小写转换。
例如:
输入: ABCdef
输出: abcDEF
输入: AbcDeF
输出: aBCdEf
输入: HeLLo
输出:
> Finished chain.
hEllO
Combining Chains
A single Chain object can only complete a very simple task, but we can combine multiple simple Chains like building blocks to complete more complex tasks. The simplest combination is the sequential chain SimpleSequentialChain, which connects multiple Chains in series: the output of the previous Chain becomes the input of the next. Note that because this is the simplest kind of chain, it does not process the inputs or outputs in any way, so you must ensure that the input and output of adjacent Chains are compatible, and each Chain's prompt may have only one input variable.
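Feeding each Chain's output into the next is essentially a left fold over a list of chains. A minimal sketch with plain functions standing in for Chains (invented for illustration):

```python
def simple_sequential(chains):
    """Run chains in series: the output of one becomes the input of the
    next, which is why each step may take only a single input."""
    def run(value):
        for chain in chains:
            value = chain(value)
        return value
    return run

# toy stand-ins for the squaring and Roman-numeral chains
square = lambda x: x * x
describe = lambda x: f"result: {x}"
pipeline = simple_sequential([square, describe])
pipeline(3)
# -> 'result: 9'
```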
Below, we first square the input number, and then convert the square into a Roman numeral.
from langchain.chains import SimpleSequentialChain
chat = AzureChatOpenAI(deployment_name="gpt-35-turbo", temperature=0)
prompt1 = PromptTemplate(
    input_variables=["base"], template="{base}的平方是: "
)
chain1 = LLMChain(llm=chat, prompt=prompt1)
prompt2 = PromptTemplate(input_variables=["input"], template="将{input}写成罗马数字是:")
chain2 = LLMChain(llm=chat, prompt=prompt2)
overall_chain = SimpleSequentialChain(chains=[chain1, chain2], verbose=True)
overall_chain.run(3)
> Entering new chain...
9
IX
> Finished chain.
'IX'
LangChain comes with many ready-made Chain combinations; see the official documentation for details, which I won't expand on here.