Customize your own document Q&A bot

ChatGPT has become enormously popular recently, and its capabilities are impressive: it has strong logical reasoning ability and a broad knowledge base. But if we want to chat with ChatGPT about knowledge it was not trained on, such as our own private data, its answers will be inaccurate, because it has never learned that material.

The following introduces a way to build a private chatbot over data you provide, based on llama-index and the ChatGPT API.

Exploring approaches

1. To customize a bot on your own exclusive data, the first approach that comes to mind is fine-tuning: training the GPT model on a large amount of data so that it understands the documents you provide. However, fine-tuning is expensive and requires a large dataset of examples, and you cannot re-fine-tune every time a document changes. More critically, fine-tuning does not make the model "know" all the information in a document; it teaches the model a new skill. Therefore, fine-tuning is not a good fit here.

2. Put your private text into the prompt as context and ask ChatGPT questions about it. However, the OpenAI API limits the maximum prompt length: ChatGPT 3.5 allows at most 4,096 tokens. If the limit is exceeded, the document is simply truncated and context is lost. Moreover, the cost of an API call is proportional to the token count, so an overly long prompt makes every call expensive.
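To get a feel for the limit, a rough rule of thumb is that English text averages about four characters per token (OpenAI's tiktoken library gives exact counts). The sketch below uses that heuristic only as a quick feasibility check; the helper name is made up for illustration:

```python
# Rough token estimate: English text averages about 4 characters per token.
# (For exact counts, use OpenAI's tiktoken library; this heuristic is
# only a quick feasibility check.)
def rough_token_estimate(text: str) -> int:
    return max(1, len(text) // 4)

# A 4,096-token limit corresponds to roughly 16,000 characters, so a
# 100 KB text file cannot fit into a single ChatGPT 3.5 prompt.
doc = "x" * 100_000                        # stand-in for a 100 KB document
print(rough_token_estimate(doc) > 4096)    # prints True
```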

Since tokens are limited, is there a tool that preprocesses the text so that the token count stays under the limit? llama-index is such a tool. With llama-index, only the parts of the text relevant to the question are extracted and fed into the prompt.
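The retrieval idea can be illustrated with a deliberately simplified sketch: split the document into chunks, score each chunk against the question, and keep only the best match for the prompt. llama-index does this with OpenAI embeddings and vector similarity; plain word overlap stands in here purely to show the flow:

```python
# Toy illustration of llama-index's retrieval step: chunk the document,
# score each chunk against the question, and keep only the best match.
# Word overlap stands in for embedding similarity, purely for illustration.
def split_into_chunks(text, chunk_size=8):
    words = text.split()
    return [" ".join(words[i:i + chunk_size])
            for i in range(0, len(words), chunk_size)]

def word_overlap(question, chunk):
    # Number of question words that also appear in the chunk
    return len(set(question.lower().split()) & set(chunk.lower().split()))

def most_relevant_chunk(question, text):
    return max(split_into_chunks(text),
               key=lambda ch: word_overlap(question, ch))

doc = ("The cat sleeps on the warm windowsill all afternoon. "
       "Our deployment pipeline runs unit tests before every release.")
print(most_relevant_chunk("when does the pipeline run tests", doc))
```

Only the chunk about the pipeline would be sent to the model, not the unrelated text about the cat.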

Next, I will give a step-by-step tutorial on building a Q&A chatbot over your own data using llama-index and the ChatGPT API.

Preparation:

  • An OpenAI API key, which you can view at https://platform.openai.com/account/api-keys. If you don't have one yet, you can apply for one there; the key is used to interact with the various models OpenAI provides.

  • A document database. llama-index supports many different data sources, such as APIs, PDFs, documents, SQL, Google Docs, etc. This tutorial uses a simple text file for demonstration.

  • A local Python environment, or an online Google Colab notebook. This tutorial uses a local Python environment.

Process:

Install the dependencies:

pip install openai
pip install llama-index

from llama_index import SimpleDirectoryReader, GPTSimpleVectorIndex, LLMPredictor, PromptHelper, ServiceContext
from langchain import OpenAI
import gradio as gr
import os

os.environ["OPENAI_API_KEY"] = 'your openai api key'
data_directory_path = 'your txt data directory path'
index_cache_path = 'your index file path'

# Build the index
def construct_index(directory_path):
    max_input_size = 4096
    num_outputs = 2000
    max_chunk_overlap = 20
    chunk_size_limit = 500

    llm_predictor = LLMPredictor(llm=OpenAI(temperature=0, model_name="text-davinci-003", max_tokens=num_outputs))
    prompt_helper = PromptHelper(max_input_size, num_outputs, max_chunk_overlap, chunk_size_limit=chunk_size_limit)
    # Split the source documents into small chunks of at most 500 tokens each
    service_context = ServiceContext.from_defaults(llm_predictor=llm_predictor, prompt_helper=prompt_helper, chunk_size_limit=chunk_size_limit)
    # Read the documents under directory_path
    documents = SimpleDirectoryReader(directory_path).load_data()

    index = GPTSimpleVectorIndex.from_documents(documents, service_context=service_context)
    # Save the index to disk
    index.save_to_disk(index_cache_path)
    return index

def chatbot(input_text):
    # Load the saved index
    index = GPTSimpleVectorIndex.load_from_disk(index_cache_path)
    response = index.query(input_text, response_mode="compact")
    return response.response

if __name__ == "__main__":
    # Create an interactive UI with gradio
    iface = gr.Interface(fn=chatbot,
                         inputs=gr.Textbox(lines=7, label="Enter your text"),
                         outputs="text",
                         title="Text AI Chatbot")
    index = construct_index(data_directory_path)
    iface.launch(share=True)

In the construct_index method, llama_index reads the txt documents under data_directory_path and generates an index file stored at index_cache_path. When this Python file is executed, construct_index runs and the console outputs:

INFO:llama_index.token_counter.token_counter:> [build_index_from_nodes] Total LLM token usage: 0 tokens
INFO:llama_index.token_counter.token_counter:> [build_index_from_nodes] Total embedding token usage: 27740 tokens
Running on local URL:  http://127.0.0.1:7860

You can see that the source document amounts to 27,740 tokens, which is also the cost of the embedding request used to build the index. These tokens are consumed by llama_index's embedding call and do not count against the ChatGPT API's prompt token limit.
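As a back-of-the-envelope check on that one-time indexing cost, assuming text-embedding-ada-002's price of $0.0004 per 1K tokens at the time of writing (prices change; check OpenAI's pricing page for current rates):

```python
# One-time indexing cost. The $0.0004 per 1K tokens figure is the
# text-embedding-ada-002 price at the time of writing; check OpenAI's
# pricing page for current rates.
embedding_tokens = 27_740
price_per_1k_tokens = 0.0004          # USD; assumption: ada-002 pricing
cost = embedding_tokens / 1000 * price_per_1k_tokens
print(f"indexing cost = ${cost:.4f}")  # about one cent
```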

Then open the URL printed in the console, http://127.0.0.1:7860, in a browser. The UI rendered by the gradio framework is displayed, as follows:

[Screenshot: the gradio chatbot UI]

Enter "What did the author do in 9th grade?" on the left, and the answer appears on the right as follows:

[Screenshot: the chatbot's answer]

At the same time, the console outputs:

INFO:llama_index.token_counter.token_counter:> [query] Total LLM token usage: 563 tokens
INFO:llama_index.token_counter.token_counter:> [query] Total embedding token usage: 10 tokens

The query spent 563 tokens of the OpenAI text-davinci-003 model. With this approach, a query that would otherwise require close to 28,000 tokens of context costs only about 500 tokens.
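The per-query savings can be quantified, assuming text-davinci-003's price of $0.02 per 1K tokens at the time of writing (check OpenAI's pricing page for current rates). The "without index" figure supposes the whole 27,740-token document were sent as context on every query, which in fact would not even fit within the 4,096-token limit:

```python
# Per-query cost comparison. The $0.02 per 1K tokens figure is the
# text-davinci-003 price at the time of writing; check OpenAI's pricing
# page for current rates.
price_per_1k = 0.02                   # USD; assumption: davinci pricing
with_index = 563 / 1000 * price_per_1k
without_index = 27_740 / 1000 * price_per_1k
print(f"with index:    ${with_index:.4f} per query")
print(f"without index: ${without_index:.4f} per query")
```

The indexed query costs about a penny; sending the full document every time would cost roughly fifty times more per query.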

llama-index works as follows:

  • Build an index of the document's text chunks

  • Find the chunks most relevant to the question

  • Ask GPT-3 (or another OpenAI model) the question together with the relevant chunks

  • When the query interface is called, llama-index constructs the following prompt by default:

"Context information is below. \n"
    "---------------------\n"
    "{context_str}"
    "\n---------------------\n"
    "Given the context information and not prior knowledge, "
    "answer the question: {query_str}\n"
  • When this prompt is sent to the OpenAI model, the model uses its reasoning ability to derive the answer we want from the context we provide and the question we ask.
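To make the mechanics concrete, here is how that default template gets filled in, sketched with plain Python string formatting; the context string and question below are made-up stand-ins, not text from the actual index:

```python
# How llama-index fills its default QA template, sketched with plain
# string formatting. The context and question are made-up examples.
QA_TEMPLATE = (
    "Context information is below. \n"
    "---------------------\n"
    "{context_str}"
    "\n---------------------\n"
    "Given the context information and not prior knowledge, "
    "answer the question: {query_str}\n"
)

prompt = QA_TEMPLATE.format(
    context_str="In 9th grade the author started programming on a TRS-80.",
    query_str="What did the author do in 9th grade?",
)
print(prompt)
```

Only the retrieved chunk goes into context_str, which is why the final prompt stays well under the token limit regardless of how large the source document is.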

Extensions:

The above shows how to use llama-index to index txt documents and use the ChatGPT model's reasoning ability for question answering. This can be extended to other usage patterns.

The llama-index library can link not only txt documents; it also provides a large number of DataConnectors of various types, covering e-book formats such as PDF and ePub, external data sources such as YouTube, Notion, MongoDB, and API-accessed data, as well as data from local databases. You can see the built-in supported data types in the open-source repository (https://github.com/jerryjliu/llama_index/blob/main/gpt_index/readers/file/base.py), or browse the community-developed DataConnectors for various data source formats on llamahub.ai (https://llamahub.ai/).

By linking llama-index with PDF documents, we can implement functionality similar to chatpdf (https://www.chatpdf.com/). You can also use llama-index's ImageParser to recognize images and chat with ChatGPT about their content... More usage scenarios are waiting for you to discover and expand.

Summary:

In this article, we combined ChatGPT with llama-index to build a chatbot for document question answering. While ChatGPT (and other LLMs) is powerful on its own, its power is greatly amplified when combined with other tools, data, and processes. I hope this article helps you hand your own dataset to AI for indexing, and get an AI bot that is exclusively yours.

References

https://github.com/jerryjliu/llama_index

https://zhuanlan.zhihu.com/p/613155165

https://www.wbolt.com/building-a-chatbot-based-on-documents-with-gpt.html

https://time.geekbang.org/column/article/645305

The document used in this article is from the llama_index examples: https://github.com/jerryjliu/llama_index/blob/c811e2d4775b98f5a7cf82383c876018c4f27ec4/examples/paul_graham_essay/data/paul_graham_essay.txt

- END -

About Qi Wu Troupe

Qi Wu Troupe is the largest front-end team at 360 Group, and participates in W3C and ECMA (TC39) work on behalf of the group. Qi Wu Troupe attaches great importance to talent development, offering career paths such as engineer, lecturer, translator, business liaison, and team leader, along with corresponding technical, professional, general, and leadership training courses. Qi Wu Troupe welcomes outstanding talent of all kinds with an open, talent-seeking attitude.



Origin blog.csdn.net/qiwoo_weekly/article/details/130278688