Continuing from the previous chapter: how do we build a chatbot using LlamaIndex?

LlamaIndex is a leading open-source data retrieval framework that can be used in a wide range of applications. One typical application is building chatbots within enterprises.

For enterprises, document management becomes increasingly difficult as the number of documents grows, so many companies build chatbots on top of internal knowledge bases. During the build process, you need to pay attention to three key points: how to chunk the data, what metadata to save, and how to route queries.

01. Why use LlamaIndex to build a chatbot?

In the previous article, we used Zilliz Cloud (the fully managed Milvus cloud service) to build a basic retrieval-augmented generation (RAG) ( https://zilliz.com/use-cases/llm-retrieval-augmented-generation ) chatbot. In this tutorial, we will continue to use the free tier of Zilliz Cloud. You can also use your own Milvus ( https://milvus.io/ ) instance, or quickly get started with Milvus Lite ( https://milvus.io/docs/milvus_lite.md ) inside the notebook.

In the previous article, we split the articles into many small text chunks. When we ran a simple search with the question "What is a large language model?", the returned text was a chunk that was semantically similar to the question but did not actually answer it. Therefore, in this project, we use the same vector database as the backend but a different retrieval process to get better question-answering results. Here, we will use LlamaIndex to achieve efficient retrieval.

LlamaIndex ( https://zilliz.com/product/integrations/Llamaindex ) is a framework that helps us work with our data on top of large language models. One of the main abstractions it provides is the "index", a structured representation of your data. On this basis, LlamaIndex can also turn these indexes into query engines, which use large language models and embedding models to organize efficient queries and retrieve relevant results.
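To make the idea concrete, here is a minimal sketch of the general pattern, assuming a local ./data folder, the default in-memory vector store, and a configured OpenAI API key (none of which are the setup used later in this tutorial):

from llama_index import VectorStoreIndex, SimpleDirectoryReader

# Load documents, build an index over them, and turn the index into a query engine.
# (Assumes an OpenAI API key is configured, since the default LLM is OpenAI.)
documents = SimpleDirectoryReader("./data").load_data()
index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine()
print(query_engine.query("What is a large language model?"))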

02. The role of LlamaIndex and Milvus in Chat Towards Data Science

So, how does LlamaIndex help us coordinate data retrieval, and how does Milvus help build the chatbot? We can use Milvus as the backend for LlamaIndex's persistent vector store. With a Milvus or Zilliz Cloud instance, you can switch from a Python-native, uncoordinated application to a LlamaIndex-driven retrieval application.

Set up the notebook with Zilliz and LlamaIndex

As mentioned in the previous article of this series, Chat Towards Data Science | How to build a RAG chatbot using a personal data knowledge base? (Part 1), we use Zilliz Cloud. The steps for connecting to Zilliz Cloud are basically the same as for connecting to Milvus. For more on how to connect to Milvus and use it as a local vector store, see the example comparing vector embeddings.
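As a rough sketch of what that connection looks like with pymilvus (using the zilliz_uri and zilliz_token variables defined below; the connection alias and the local host/port shown are just the usual defaults):

from pymilvus import connections

# Zilliz Cloud: connect with the cluster URI and token.
connections.connect("default", uri=zilliz_uri, token=zilliz_token)

# A self-hosted Milvus instance would instead be something like:
# connections.connect("default", host="localhost", port="19530")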

In the notebook, we need to install three libraries. Install them with pip install llama-index python-dotenv openai, and use python-dotenv to manage environment variables.

After the imports, call load_dotenv() to load the .env file. The three environment variables required for this project are the OpenAI API key, the URI of the Zilliz Cloud cluster, and the token of the Zilliz Cloud cluster.
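A .env file for this project might therefore look like the following (the values are placeholders; use your own key, URI, and token):

OPENAI_API_KEY=<your OpenAI API key>
ZILLIZ_URI=<your Zilliz Cloud cluster URI>
ZILLIZ_TOKEN=<your Zilliz Cloud cluster token>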

! pip install llama-index python-dotenv openai
import os
from dotenv import load_dotenv
import openai

load_dotenv()

openai.api_key = os.getenv("OPENAI_API_KEY")

zilliz_uri = os.getenv("ZILLIZ_URI")
zilliz_token = os.getenv("ZILLIZ_TOKEN")

Bring an existing Collection into LlamaIndex

There are some minor challenges in bringing an existing collection into LlamaIndex. LlamaIndex has its own structures for creating and accessing vector database collections, but we do not use them directly here. The main difference between the native LlamaIndex vector store interface and bringing your own collection is how the embedding vectors and metadata are accessed. To make this tutorial work, I also wrote some code and contributed it to the LlamaIndex ( https://github.com/run-llama/llama_index/commit/78ed06c95313e933cc255ac17bcd592e3f4b2be1 ) project!
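For contrast, the native creation path in LlamaIndex (which this tutorial does not use) looks roughly like the sketch below; the collection name, embedding dimension, and data folder are purely illustrative:

from llama_index import VectorStoreIndex, StorageContext, SimpleDirectoryReader
from llama_index.vector_stores import MilvusVectorStore

# In this flow, LlamaIndex creates and manages the collection itself.
documents = SimpleDirectoryReader("./data").load_data()
vector_store = MilvusVectorStore(
    uri=zilliz_uri,
    token=zilliz_token,
    collection_name="llamaindex_managed",  # illustrative name
    dim=384,                               # illustrative embedding dimension
)
storage_context = StorageContext.from_defaults(vector_store=vector_store)
index = VectorStoreIndex.from_documents(documents, storage_context=storage_context)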

LlamaIndex uses OpenAI's embeddings by default, but we generated our embeddings with a Hugging Face model, so the correct embedding model must be passed in. Additionally, a different field is used to store the text this time: we use "paragraph", whereas LlamaIndex uses "_node_content" by default.

This part requires importing four modules from LlamaIndex. First, you need MilvusVectorStore to use Milvus with LlamaIndex. We also need VectorStoreIndex to use Milvus as the vector store index, and ServiceContext to pass in the services we want to use. Finally, import HuggingFaceEmbedding so we can use an open-source embedding model from Hugging Face.

To get the embedding model, we only need to create a HuggingFaceEmbedding object and pass in the model name. The MiniLM L12 model is used in this tutorial. Next, create a ServiceContext object so the embedding model can be passed in.

from llama_index.vector_stores import MilvusVectorStore
from llama_index import VectorStoreIndex, ServiceContext
from llama_index.embeddings import HuggingFaceEmbedding

embed_model = HuggingFaceEmbedding(model_name="sentence-transformers/all-MiniLM-L12-v2")
service_context = ServiceContext.from_defaults(embed_model=embed_model)

Of course, we also need to connect to the Milvus vector store. In this step we pass five parameters: the URI of our Collection, the token to access our Collection, the Collection name (the default is "Llamalection"), the similarity metric to use, and the key of the metadata field that stores the text.

vdb = MilvusVectorStore(
    uri = zilliz_uri,
    token = zilliz_token,
    collection_name = "tds_articles",
    similarity_metric = "L2",
    text_key="paragraph"
)

Query the Milvus Collection using LlamaIndex

Now that we have connected to the existing Milvus Collection and pulled the required models, let's talk about how to query.

First, convert the Milvus Collection into a vector store index by passing in the Milvus vector store object. This is also where the embedding model comes in, via the ServiceContext object created above.

Once you have an initialized vector store index object, you only need to call its as_query_engine() function to convert it into a query engine. In this tutorial, we compare direct semantic search against the LlamaIndex query engine using the same question as before: "What is a large language model?"

vector_index = VectorStoreIndex.from_vector_store(vector_store=vdb, service_context=service_context)
query_engine = vector_index.as_query_engine()
response = query_engine.query("What is a large language model?")

To make the output easier to read, I imported pprint and used it to print the response.

from pprint import pprint
pprint(response)

The response we got from searching with LlamaIndex is much better than what the simple semantic search returned.
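For reference, the "simple semantic search" baseline from the previous article amounts to roughly the following pymilvus query (a sketch; the vector field name and the top-k value are assumptions):

from pymilvus import connections, Collection

connections.connect("default", uri=zilliz_uri, token=zilliz_token)
collection = Collection("tds_articles")

# Embed the question with the same Hugging Face model and search by vector similarity.
query_vector = embed_model.get_query_embedding("What is a large language model?")
results = collection.search(
    data=[query_vector],
    anns_field="embedding",          # assumed name of the vector field
    param={"metric_type": "L2"},
    limit=3,
    output_fields=["paragraph"],
)
for hits in results:
    for hit in hits:
        print(hit.entity.get("paragraph"))

This returns the chunks closest to the question in embedding space, but nothing synthesizes them into an answer, which is exactly the gap the query engine fills.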

03. Summary

This time, we used LlamaIndex and the existing Milvus Collection to improve the chatbot built in the previous article. The previous version used simple semantic similarity to find answers via vector search, but the results were not very good. In comparison, using LlamaIndex to build a query engine returns better results.

The biggest challenge of this project was bringing in the existing Milvus Collection. The existing Collection does not use the default embedding vector dimensions, nor does it use the default metadata field for storing text. The solution to both is to pass the specific embedding model through the ServiceContext and to define the correct text field when creating the Milvus vector store object.

After creating the vector store object, we convert it into an index using the Hugging Face embedding model, and then convert the index into a query engine. The query engine leverages the LLM to understand the question, gather the relevant context, and return a better response.
