Build Your Own Generative Intelligent Question Answering System with LangChain in 5 Minutes

With the emergence of large language models (LLMs), generative artificial intelligence has proven valuable in many fields, such as image generation, document writing, and information retrieval. As LLM scenarios diversify, there is growing demand for LLMs to exercise their capabilities in vertical domains. However, because large models are constrained by what their training data covers and how current it is in specific domains, building vertical-domain products on top of an LLM requires feeding a domain-specific knowledge base to the model for training or inference.

There are two commonly used approaches: Fine-Tuning and Prompt-Tuning. The former further trains an existing model on a new dataset, which is expensive and hard to keep up to date; the latter is more flexible in both training cost and timeliness.

This article describes how to build an exclusive intelligent question answering system based on the Volcano Engine cloud search service and the Ark platform, following the prompt-learning approach. Using embedding technology, the content of the dataset is converted into vectors by an embedding model, and these vectors and the data are stored with the vector search capability of the Volcano Engine cloud search service ESCloud. At query time, a similarity search matches the top-K related results, which are then supplied to the LLM together with prompt words, and the corresponding answer is generated. A large model chosen from the model square of the Volcano Engine Ark platform serves as the LLM that infers the answer, and the open source framework LangChain is used as the application framework for the end-to-end language model pipeline, simplifying the whole chat flow.
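Before diving into the concrete setup, the retrieve-then-prompt mechanics can be illustrated with a self-contained toy in plain Python (a minimal sketch only: the hard-coded 2-dimensional vectors and document strings are stand-ins for the real 768-dimensional embeddings and the ESCloud index built below):

from math import sqrt

# Cosine similarity between two vectors; ESCloud's cosinesimil space plays this role
def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (sqrt(sum(x * x for x in a)) * sqrt(sum(x * x for x in b)))

# Rank documents by similarity to the query vector and keep the top k
def top_k(query_vec, doc_vecs, docs, k=2):
    ranked = sorted(zip(doc_vecs, docs), key=lambda p: cosine(query_vec, p[0]), reverse=True)
    return [d for _, d in ranked[:k]]

# Toy data: in the real system the vectors come from the embedding model
docs = ["doc about agents", "doc about planning", "doc about memory"]
doc_vecs = [[1.0, 0.0], [0.8, 0.6], [0.0, 1.0]]

context = top_k([0.9, 0.4], doc_vecs, docs)
prompt = "Answer using this context:\n" + "\n".join(context) + "\nQuestion: ..."
print(prompt)  # this assembled prompt is what gets sent to the LLM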

Cloud Search VectorStore Preparation

  1. Log in to the Volcano Engine cloud search service, create an instance cluster, and choose 7.10 as the cluster version.
  2. Select an appropriate model in the model square of the Volcano Engine Ark platform and review the API call instructions.

  3. Mapping preparation
PUT langchain_faq
{
  "mappings": {
    "properties": {
      "message": { "type": "text" },
      "message_embedding": { "type": "knn_vector", "dimension": 768 },
      "metadata": { "type": "text" }
    }
  },
  "settings": {
    "index": {
      "refresh_interval": "10s",
      "number_of_shards": "3",
      "knn": true,
      "knn.space_type": "cosinesimil",
      "number_of_replicas": "1"
    }
  }
}
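
After creating the index, you can verify that the mapping took effect (a sketch assuming the opensearch-py client, which the LangChain integration below also relies on; substitute your own instance URL and credentials):

from opensearchpy import OpenSearch

# Hypothetical endpoint and credentials; use your ESCloud instance's values
es = OpenSearch(
    hosts=["https://<ESCloud-instance-URL>"],
    http_auth=("user", "password"),
    verify_certs=False,
    ssl_show_warn=False,
)
print(es.indices.get_mapping(index="langchain_faq"))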

Client Preparation

  1. Dependency installation
pip install volcengine --user
pip install langchain --user
# HuggingFaceEmbeddings and OpenSearchVectorSearch below also need these packages
pip install sentence_transformers --user
pip install opensearch-py --user
  2. Initialization
#Embedding
from langchain.embeddings import HuggingFaceEmbeddings
#VectorStore
from langchain.vectorstores import OpenSearchVectorSearch
#LLM Base
from langchain.llms.base import LLM
#Document loader
from langchain.document_loaders import WebBaseLoader
#LLM Cache
from langchain.cache import InMemoryCache
#Volcengine
from volcengine.ApiInfo import ApiInfo
from volcengine import Credentials
from volcengine.base.Service import Service
from volcengine.ServiceInfo import ServiceInfo

import json
import os
from typing import Optional, List, Dict, Mapping, Any

# Load embeddings; here HuggingFace (sentence-transformers) serves as the embedding model
embeddings = HuggingFaceEmbeddings()

# Enable the LLM cache; LangChain reads the global llm_cache attribute
import langchain
langchain.llm_cache = InMemoryCache()

MaaS Preparation

We select a model from the Volcano Engine Ark large model platform. After choosing a model, the API call instructions can be found in the upper right corner of the page.

maas_host = "maas-api.ml-platform-cn-beijing.volces.com"
api_chat = "chat"
API_INFOS = {api_chat: ApiInfo("POST", "/api/v1/" + api_chat, {}, {}, {})}

class MaaSClient(Service):
    def __init__(self, ak, sk):
        credentials = Credentials.Credentials(ak=ak, sk=sk, service="ml_maas", region="cn-beijing")
        self.service_info = ServiceInfo(maas_host, {"Accept": "application/json"}, credentials, 60, 60, "https")
        self.api_info = API_INFOS
        super().__init__(self.service_info, self.api_info)

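# Note: the AK/SK pair is read from the VOLC_ACCESSKEY / VOLC_SECRETKEY environment
# variables below; set them in your shell before constructing the client.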
client = MaaSClient(os.getenv("VOLC_ACCESSKEY"), os.getenv("VOLC_SECRETKEY"))

# Import LLM Base and build a Volc GLM client for chatting with the LLM
from langchain.llms.base import LLM
class ChatGLM(LLM):
    @property
    def _llm_type(self) -> str:
        return "chatglm"

    def _construct_query(self, prompt: str) -> str:
        # Wrap the user prompt into the message content sent to the model
        query = "human_input is: " + prompt
        return query

    @classmethod
    def _post(cls, query: str) -> Any:
        # Build the chat request body; the model name and parameters follow
        # the Ark platform's API call instructions
        request = {
            "model": {
                "name": "chatglm-130b"
            },
            "parameters": {
                "max_tokens": 2000,
                "temperature": 0.8
            },
            "messages": [{
                "role": "user",
                "content": query
            }]
        }
        print(request)
        resp = client.json(api=api_chat, params={}, body=json.dumps(request))
        return resp

    def _call(self, prompt: str,
              stop: Optional[List[str]] = None) -> str:
        query = self._construct_query(prompt=prompt)
        # The raw JSON response is returned as-is; in practice you may want to
        # parse out just the generated message content before returning it
        resp = self._post(query=query)
        return resp
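
A quick smoke test of the wrapper (a sketch; it assumes the MaaS client above is configured with valid keys, and it prints the raw JSON response):

llm = ChatGLM()
print(llm("Say hello in one sentence."))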

Writing the Dataset

Here we use LangChain's Loader to import a web dataset, generate feature vectors with HuggingFaceEmbeddings (768 dimensions), and write them into the vector index of the cloud search service ESCloud through the VectorStore.

# Document loader
from langchain.document_loaders import WebBaseLoader
loader = WebBaseLoader("https://lilianweng.github.io/posts/2023-06-23-agent/")
data = loader.load()
# Split
from langchain.text_splitter import RecursiveCharacterTextSplitter
text_splitter = RecursiveCharacterTextSplitter(chunk_size = 500, chunk_overlap = 0)
all_splits = text_splitter.split_documents(data)
#Embeddings
from langchain.embeddings import HuggingFaceEmbeddings
embeddings = HuggingFaceEmbeddings()
# VectorStore
# opensearch_url is the access URL of the cloud search VectorStore;
# http_auth is the username/password for accessing the cloud search service
from langchain.vectorstores import OpenSearchVectorSearch
vectorstore = OpenSearchVectorSearch.from_documents(
        documents=all_splits,
        embedding=HuggingFaceEmbeddings(),
        opensearch_url="URL",
        http_auth=("user", "password"),
        verify_certs=False,
        ssl_assert_hostname=False,
        index_name="langchain_faq",
        vector_field="message_embedding",
        text_field="message",
        metadata_field="message_metadata",
        ssl_show_warn=False,
)
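
After the write completes, a quick sanity check compares the number of split chunks with the document count in the index (a sketch; vectorstore.client exposes the underlying OpenSearch client in this integration):

print(len(all_splits))
print(vectorstore.client.count(index="langchain_faq"))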

Query + Retriever

query = "What are the approaches to Task Decomposition?"
docs = vectorstore.similarity_search(
        query,
        vector_field="message_embedding",
        text_field="message",
        metadata_field="message_metadata",)
retriever = vectorstore.as_retriever(search_kwargs={"vector_field": "message_embedding", "text_field":"message", "metadata_field":"message_metadata"})        
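
Before wiring the retriever into a chain, you can inspect what the similarity search returned (the exact content depends on the crawled page):

# Each hit is a LangChain Document whose chunk text is in page_content
print(len(docs))
print(docs[0].page_content[:200])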

LLM Chat

Here we chose ChatG** from the model square of the large model platform.

We then call the Chat API: LangChain's built-in prompt is combined with the query and sent to the LLM.

from langchain.chains import RetrievalQA
llm = ChatGLM()
retriever = vectorstore.as_retriever(search_kwargs={"vector_field": "message_embedding", "text_field":"message", "metadata_field":"message_metadata"})
qa_chain = RetrievalQA.from_chain_type(llm,retriever=retriever)
qa_chain({"query": query})
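
The chain returns a dict whose "result" field holds the generated answer (a typical usage sketch):

result = qa_chain({"query": query})
print(result["result"])  # the answer grounded in the retrieved context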

With debug output enabled, you can see the prompt that the chain assembles:
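One way to surface the assembled prompt is LangChain's global debug flag (a sketch; available in recent LangChain versions):

import langchain
langchain.debug = True  # logs the full prompts and LLM calls to stdout
qa_chain({"query": query})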

The model then responds with an answer generated from the retrieved context.

That concludes this walkthrough of building an exclusive intelligent question answering system on the Volcano Engine cloud search service and the Ark platform. You are welcome to log in to the Volcano Engine console and try it yourself!


The cloud search service ESCloud is compatible with Elasticsearch, Kibana, and other common open source software and plugins. It provides multi-condition retrieval, statistics, and reporting over structured and unstructured text, and supports one-click deployment, elastic scaling, and simplified operations, making it easy to build capabilities such as log analysis and information retrieval.

Learn more about the product: www.volcengine.com/product/es

Source: juejin.im/post/7257705589140783165