【LangChain】An efficient plug-in for GPT

Function 1: Giving the model memory

As a plug-in layered on top of GPT, LangChain lets the model carry earlier turns of the conversation in a memory variable, so the model appears to have memory without any change to its parameters.

You can inspect what the memory variable contains, and you can limit how much history is kept (how many earlier turns are remembered) through the parameter k, or through a token budget such as max_token_limit.
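A minimal sketch of how this looks with the classic langchain package (the model choice and the sample inputs are assumptions; an OPENAI_API_KEY must be set):

```python
from langchain.chat_models import ChatOpenAI
from langchain.chains import ConversationChain
from langchain.memory import ConversationBufferWindowMemory

llm = ChatOpenAI(temperature=0.0)

# k=1 keeps only the single most recent exchange in memory.
memory = ConversationBufferWindowMemory(k=1)
conversation = ConversationChain(llm=llm, memory=memory, verbose=True)

conversation.predict(input="Hi, my name is Andrew")  # sample inputs assumed
conversation.predict(input="What is 1+1?")
conversation.predict(input="What is my name?")       # forgotten: k=1 dropped it

# Inspect what the memory variable currently contains.
print(memory.load_memory_variables({}))
```

For a token-based limit instead of a turn count, ConversationTokenBufferMemory(llm=llm, max_token_limit=50) can be swapped in for the window memory.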

Why use LangChain

Large language models are stateless (unlike recurrent models such as LSTMs): every API call is independent, and nothing is remembered between calls.
LangChain works around this by storing the conversation history and resending it with each new request.

Function 2: Contextual chaining

That is, a later step of the dialogue needs some of the previous step's output as its input.

Normal chain (LLMChain)

Suppose a CSV file has been read in. An LLMChain takes two arguments, a model and a prompt:
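A hedged sketch of the pattern (the file name, column, and prompt text are assumptions, not taken from the screenshots):

```python
import pandas as pd
from langchain.chat_models import ChatOpenAI
from langchain.prompts import ChatPromptTemplate
from langchain.chains import LLMChain

df = pd.read_csv("Data.csv")  # hypothetical CSV file

llm = ChatOpenAI(temperature=0.9)
prompt = ChatPromptTemplate.from_template(
    "What is the best name to describe a company that makes {product}?"
)

# LLMChain = model + prompt; run() fills the prompt and calls the model.
chain = LLMChain(llm=llm, prompt=prompt)
print(chain.run("Queen Size Sheet Set"))
```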

Sequential Chain

In this type of chain, consecutive steps have an input-output relationship: what one step produces, the next step consumes.

Let's see how to use it:
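A minimal SimpleSequentialChain sketch following the standard LangChain pattern (the two prompt texts here are assumptions):

```python
from langchain.chat_models import ChatOpenAI
from langchain.prompts import ChatPromptTemplate
from langchain.chains import LLMChain, SimpleSequentialChain

llm = ChatOpenAI(temperature=0.9)

# Chain 1: product name -> company name
first_prompt = ChatPromptTemplate.from_template(
    "What is the best name to describe a company that makes {product}?"
)
chain_one = LLMChain(llm=llm, prompt=first_prompt)

# Chain 2: company name -> 20-word description
second_prompt = ChatPromptTemplate.from_template(
    "Write a 20-word description for the following company: {company_name}"
)
chain_two = LLMChain(llm=llm, prompt=second_prompt)

# chain_one's output is piped straight in as chain_two's input.
overall_chain = SimpleSequentialChain(chains=[chain_one, chain_two], verbose=True)
print(overall_chain.run("Queen Size Sheet Set"))
```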

With verbose=True, each intermediate output is printed before the final result. The simple chain is a straight pipeline: one input goes in, each chain's output is fed to the next chain, and one output comes out.

When multiple inputs and outputs are mixed together, use SequentialChain instead and declare the variable names explicitly, as in the sketch below:

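A hedged sketch; the review-processing scenario is the common LangChain tutorial example and the prompt texts are assumptions. The key point is that output_key names wire the chains together:

```python
from langchain.chat_models import ChatOpenAI
from langchain.prompts import ChatPromptTemplate
from langchain.chains import LLMChain, SequentialChain

llm = ChatOpenAI(temperature=0.9)

# Step 1: translate a customer review into English.
chain_one = LLMChain(
    llm=llm,
    prompt=ChatPromptTemplate.from_template(
        "Translate the following review to English:\n\n{Review}"
    ),
    output_key="English_Review",
)

# Step 2: summarize the English review in one sentence.
chain_two = LLMChain(
    llm=llm,
    prompt=ChatPromptTemplate.from_template(
        "Summarize the following review in one sentence:\n\n{English_Review}"
    ),
    output_key="summary",
)

# Step 3: detect the review's original language.
chain_three = LLMChain(
    llm=llm,
    prompt=ChatPromptTemplate.from_template(
        "What language is the following review in?\n\n{Review}"
    ),
    output_key="language",
)

# Step 4: reply to the summary in that language (two earlier outputs mixed).
chain_four = LLMChain(
    llm=llm,
    prompt=ChatPromptTemplate.from_template(
        "Write a follow-up response to the following summary in the given "
        "language:\n\nSummary: {summary}\n\nLanguage: {language}"
    ),
    output_key="followup_message",
)

overall_chain = SequentialChain(
    chains=[chain_one, chain_two, chain_three, chain_four],
    input_variables=["Review"],
    output_variables=["English_Review", "summary", "followup_message"],
    verbose=True,
)
print(overall_chain({"Review": "Je trouve le goût médiocre."}))
```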

Router Chain

A router chain dispatches each question to the prompt for its field; different fields get different prompts, and you must define the set of candidate prompts in advance.
For example, imagine a teacher covering multiple subjects who must answer a question in one particular field. First, define a prompt for each subject:

Then gather all the prompt information (a name, a description, and the template for each subject) into a list, build a destination chain per subject, define a default chain for questions that match no subject, and assemble everything with MultiPromptChain.
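A hedged end-to-end sketch (the subjects and template texts are assumptions; the router prompt reuses the template that ships with langchain):

```python
from langchain.chat_models import ChatOpenAI
from langchain.prompts import ChatPromptTemplate, PromptTemplate
from langchain.chains import LLMChain
from langchain.chains.router import MultiPromptChain
from langchain.chains.router.llm_router import LLMRouterChain, RouterOutputParser
from langchain.chains.router.multi_prompt_prompt import MULTI_PROMPT_ROUTER_TEMPLATE

# 1) One prompt template per subject (texts assumed).
physics_template = "You are a brilliant physics professor. Answer:\n{input}"
math_template = "You are an excellent mathematician. Answer:\n{input}"

prompt_infos = [
    {"name": "physics", "description": "Good for physics questions",
     "prompt_template": physics_template},
    {"name": "math", "description": "Good for math questions",
     "prompt_template": math_template},
]

llm = ChatOpenAI(temperature=0.0)

# 2) A destination chain per subject.
destination_chains = {}
for info in prompt_infos:
    prompt = ChatPromptTemplate.from_template(info["prompt_template"])
    destination_chains[info["name"]] = LLMChain(llm=llm, prompt=prompt)

# 3) A default chain for questions that fit no subject.
default_chain = LLMChain(llm=llm, prompt=ChatPromptTemplate.from_template("{input}"))

# 4) The router itself: an LLM decides which destination chain to use.
destinations = "\n".join(f"{p['name']}: {p['description']}" for p in prompt_infos)
router_prompt = PromptTemplate(
    template=MULTI_PROMPT_ROUTER_TEMPLATE.format(destinations=destinations),
    input_variables=["input"],
    output_parser=RouterOutputParser(),
)
router_chain = LLMRouterChain.from_llm(llm, router_prompt)

chain = MultiPromptChain(
    router_chain=router_chain,
    destination_chains=destination_chains,
    default_chain=default_chain,
    verbose=True,
)
print(chain.run("What is black body radiation?"))
```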

Function 3: Answering questions based on documents

This requires embedding and vector storage (a vector database) to retrieve and match relevant passages within the uploaded documents.

How is this achieved? Suppose a CSV file is uploaded and GPT needs to answer questions based on it.

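A hedged sketch of the quickest way to do this in LangChain, building an index over the file and querying it (the file name and query are assumptions):

```python
from langchain.chat_models import ChatOpenAI
from langchain.document_loaders import CSVLoader
from langchain.indexes import VectorstoreIndexCreator
from langchain.vectorstores import DocArrayInMemorySearch

loader = CSVLoader(file_path="OutdoorClothingCatalog_1000.csv")  # hypothetical file

# Build an in-memory vector index straight from the loader.
index = VectorstoreIndexCreator(
    vectorstore_cls=DocArrayInMemorySearch
).from_loaders([loader])

llm = ChatOpenAI(temperature=0.0)
print(index.query("Please list all shirts with sun protection.", llm=llm))
```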

But when the document is very large, GPT cannot take such long content as input (an LLM's input is limited to a fixed token budget). This is why embedding and vector storage technology must be introduced.

Embedding

An embedding represents a piece of text as a vector. When two sentences are close in meaning, their embedding vectors are also very similar.

Vector Database

Each piece of content is embedded, and the embeddings are stored under an index.
When a query arrives, the query is embedded too; the database compares the query embedding with every stored embedding and picks the few sentences with the highest similarity to answer from.

Code

Load a document:
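A sketch with a LangChain document loader (the file name is an assumption):

```python
from langchain.document_loaders import CSVLoader

loader = CSVLoader(file_path="OutdoorClothingCatalog_1000.csv")  # hypothetical file
docs = loader.load()  # one Document per CSV row
```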
Take a look at how the embedding works; you can see that the vector has 1536 dimensions:
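A sketch with OpenAI embeddings (the sample sentence is an assumption):

```python
from langchain.embeddings import OpenAIEmbeddings

embeddings = OpenAIEmbeddings()
embed = embeddings.embed_query("Hi, my name is Harrison")

print(len(embed))  # 1536 dimensions
print(embed[:5])   # the first five components of the vector
```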
Create a vector database and find sentences with high similarity:
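A sketch using the in-memory DocArrayInMemorySearch store (the query text is an assumption):

```python
from langchain.vectorstores import DocArrayInMemorySearch

# Embed every document and store the vectors.
db = DocArrayInMemorySearch.from_documents(docs, embeddings)

query = "Please suggest a shirt with sun blocking"
found = db.similarity_search(query)
print(len(found))
```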
Four documents are returned; similarity_search looks up the k=4 most similar entries by default.
Retrieve the required (high-similarity) content and let GPT answer from it:
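A hedged sketch with RetrievalQA (chain_type="stuff" simply stuffs the retrieved chunks into one prompt; the query is an assumption):

```python
from langchain.chat_models import ChatOpenAI
from langchain.chains import RetrievalQA

retriever = db.as_retriever()  # wrap the vector store as a retriever
llm = ChatOpenAI(temperature=0.0)

qa = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=retriever,
    verbose=True,
)
print(qa.run("Please list all shirts with sun protection in a table."))
```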

Origin: blog.csdn.net/weixin_42468475/article/details/131461960