実践シリーズ！ Zilliz Cloud と AWS Bedrock を使用して RAG アプリケーションを構築する

オープンソースの中国コミュニティチームは、共有の名のもとに、オープンソースの中国コミュニティの背後にあるストーリーを伝える初のライブブロードキャストを行いました。」

検索拡張生成 (RAG) は、情報検索と自然言語処理 (NLP) 機能を組み合わせてテキスト生成を強化する AI フレームワークです。具体的には、RAG システムの言語モデルは、生成された応答に最新の情報を組み込む検索メカニズムを通じて、ナレッジベースまたは外部データベースにクエリと検索を実行し、最終出力をより正確にし、より多くのコンテキストを含むようにします。

Zilliz Cloud ( https://zilliz.com.cn/cloud) は、 Milvus ( https://milvus.io/) ベクトルデータベース上に構築されており、大規模なベクトル化データを保存および処理するためのソリューションを提供します。効率的な管理と分析、データの取得。開発者は、 Zilliz Cloud のベクトルデータベース機能を使用して、大規模な埋め込みベクトルを保存および検索し、RAG アプリケーションの検索モジュール機能をさらに強化できます。

AWS Bedrock クラウドサービス ( https://aws.amazon.com/cn/bedrock/) は、 NLP ソリューションのデプロイと拡張に使用できるさまざまな事前トレーニング済みの基本モデルを提供します。開発者は、AWS Bedrock を通じて、言語の生成、理解、翻訳のモデルを AI アプリケーションに統合できます。さらに、AWS Bedrock は、テキストに対して関連性が高くコンテキストに富んだ応答を生成できるため、RAG アプリケーションの機能がさらに向上します。

01. Zilliz Cloud と AWS Bedrock を使用して RAG アプリケーションを構築する

AWS Bedrock で Zilliz Cloud を使用して RAG アプリケーションを構築する方法を示します。基本的なプロセスを図 1 に示します。

図 1. Zilliz Cloud と AWS Bedrock を使用して RAG アプリケーションを構築する基本プロセス

#download the packages then import them
! pip install --upgrade --quiet  langchain langchain-core langchain-text-splitters langchain-community langchain-aws bs4 boto3

# For example
import bs4
import boto3

AWS Bedrock と Zilliz Cloud に接続する

次に、AWS および Zilliz Cloud サービスへの接続に必要な環境変数を設定します。 AWS Bedrock および Zilliz Cloud サービスに接続するには、AWS サービスリージョン、アクセスキー、Zilliz Cloud のエンドポイント URI と API キーを指定する必要があります。

# Set the AWS region and access key environment variables
REGION_NAME = "us-east-1"
AWS_ACCESS_KEY_ID = os.getenv("AWS_ACCESS_KEY_ID")
AWS_SECRET_ACCESS_KEY = os.getenv("AWS_SECRET_ACCESS_KEY")

# Set ZILLIZ cloud environment variables
ZILLIZ_CLOUD_URI = os.getenv("ZILLIZ_CLOUD_URI")
ZILLIZ_CLOUD_API_KEY = os.getenv("ZILLIZ_CLOUD_API_KEY")

上記で提供されたアクセス認証情報を使用して、 AWS Bedrock Runtime サービスに接続し、AWS Bedrock 言語モデルを統合するためのboto3 クライアント ( https://boto3.amazonaws.com/v1/documentation/api/latest/index.html)を作成しました。。次に、ChatBedrock インスタンス ( https://python.langchain.com/v0.1/docs/integrations/chat/bedrock/) を初期化し、クライアントに接続し、使用する言語モデルを指定します。このチュートリアルで使用する anthropic.claude-3-sonnet-20240229-v1:0モデル。このステップは、テキスト応答を生成するためのインフラストラクチャをセットアップするのに役立ち、また、生成される応答の多様性を制御するためにモデルの温度パラメーターを構成します。 BedrockEmbeddings インスタンスを使用して、テキストなどの非構造化データを変換できます ( https://zilliz.com.cn/glossary/%E9%9D%9E%E7%BB%93%E6%9E%84%E5%8C%96% E6 %95%B0%E6%8D%AE) をベクトルに変換します。

# Create a boto3 client with the specified credentials
client = boto3.client(
    "bedrock-runtime",
    region_name=REGION_NAME,
    aws_access_key_id=AWS_ACCESS_KEY_ID,
    aws_secret_access_key=AWS_SECRET_ACCESS_KEY,
)

# Initialize the ChatBedrock instance for language model operations
llm = ChatBedrock(
    client=client,
    model_id="anthropic.claude-3-sonnet-20240229-v1:0",
    region_name=REGION_NAME,
    model_kwargs={"temperature": 0.1},
)

# Initialize the BedrockEmbeddings instance for handling text embeddings
embeddings = BedrockEmbeddings(client=client, region_name=REGION_NAME)

情報の収集と処理

埋め込みモデルが正常に初期化されたら、次のステップは外部ソースからデータをロードすることです。 WebBaseLoader インスタンス ( https://python.langchain.com/v0.1/docs/integrations/document_loaders/web_base/) を作成して、指定された Web ソースからコンテンツをクロールします。

このチュートリアルでは、AI エージェント関連の記事からコンテンツを読み込みます。ローダーは、BeautifulSoup (https://www.crummy.com/software/BeautifulSoup/bs4/doc/) のSoupStrainer を使用して、Web ページの特定の部分、つまり「post-content」、「post-title」、「」を解析します。 post" -header" セクションを使用して、関連するコンテンツのみが取得されるようにします。次に、ローダーは指定されたネットワークソースからドキュメントを取得し、後続の処理のために関連コンテンツのリストを提供します。次に、RecursiveCharacterTextSplitter インスタンス ( https://python.langchain.com/v0.1/docs/modules/data_connection/document_transformers/recursive_text_splitter/) を使用して、取得したドキュメントを小さなテキストの塊に分割します。これにより、コンテンツがより管理しやすくなり、これらのテキストブロックをテキスト埋め込みや言語生成モジュールなどの他のコンポーネントに渡すこともできます。

# Create a WebBaseLoader instance to load documents from web sources
loader = WebBaseLoader(
    web_paths=("https://lilianweng.github.io/posts/2023-06-23-agent/",),
    bs_kwargs=dict(
        parse_only=bs4.SoupStrainer(
            class_=("post-content", "post-title", "post-header")
        )
    ),
)
# Load documents from web sources using the loader
documents = loader.load()
# Initialize a RecursiveCharacterTextSplitter for splitting text into chunks
text_splitter = RecursiveCharacterTextSplitter(chunk_size=2000, chunk_overlap=200)


# Split the documents into chunks using the text_splitter
docs = text_splitter.split_documents(documents)

応答を生成する

プロンプトテンプレートは各応答の構造を事前に定義しており、AI が可能な場合は統計と数値を使用し、関連する知識が不足している場合は回答をでっち上げることを避けることができます。

# Define the prompt template for generating AI responses
PROMPT_TEMPLATE = """
Human: You are a financial advisor AI system, and provides answers to questions by using fact based and statistical information when possible.
Use the following pieces of information to provide a concise answer to the question enclosed in <question> tags.
If you don't know the answer, just say that you don't know, don't try to make up an answer.
<context>
{context}
</context>

<question>
{question}
</question>

The response should be specific and use statistics or numbers when possible.

Assistant:"""
# Create a PromptTemplate instance with the defined template and input variables
prompt = PromptTemplate(
    template=PROMPT_TEMPLATE, input_variables=["context", "question"]
)

Zilliz ベクターストアを初期化し、Zilliz クラウドプラットフォームに接続します。ベクトルストアは、後でドキュメントを迅速かつ効率的に取得できるように、ドキュメントをベクトルに変換する役割を果たします。取得された文書はフォーマットされて一貫したテキストに整理され、AI が関連情報を回答に統合して、最終的には非常に正確で関連性の高い回答を提供します。

# Initialize Zilliz vector store from the loaded documents and embeddings
vectorstore = Zilliz.from_documents(
    documents=docs,
    embedding=embeddings,
    connection_args={
        "uri": ZILLIZ_CLOUD_URI,
        "token": ZILLIZ_CLOUD_API_KEY,
        "secure": True,
    },
    auto_id=True,
    drop_old=True,
)

# Create a retriever for document retrieval and generation
retriever = vectorstore.as_retriever()

# Define a function to format the retrieved documents
def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)

最後に、AI 応答を生成するための完全な RAG リンクを作成します。このリンクは、まずユーザークエリに関連するドキュメントをベクターストアから取得し、取得してフォーマットしてから、プロンプトテンプレート ( https://python.langchain.com/v0.1/docs/modules/model_io/prompts)に渡します。 /) 応答構造を生成します。この構造化された入力は言語モデルに渡されて一貫した応答が生成され、最終的に文字列形式に解析されてユーザーに提示され、正確でコンテキストに富んだ応答が提供されます。

# Define the RAG (Retrieval-Augmented Generation) chain for AI response generation
rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)

# rag_chain.get_graph().print_ascii()

# Invoke the RAG chain with a specific question and retrieve the response
res = rag_chain.invoke("What is self-reflection of an AI Agent?")
print(res)

以下は応答結果の例です。

Self-reflection is a vital capability that allows autonomous AI agents to improve iteratively by analyzing and refining their past actions, decisions, and mistakes. Some key aspects of self-reflection for AI agents include:
1. Evaluating the efficiency and effectiveness of past reasoning trajectories and action sequences to identify potential issues like inefficient planning or hallucinations (generating consecutive identical actions without progress).
2. Synthesizing observations and memories from past experiences into higher-level inferences or summaries to guide future behavior.

02. Zilliz CloudとAWS Bedrockを利用するメリット

表 1 に示すように、Zilliz Cloud は AWS Bedrock とシームレスに統合して、RAG アプリケーションの効率、スケーラビリティ、精度を向上させることができます。開発者は、これら 2 つのサービスを使用して、大量のデータセットを処理し、RAG アプリケーションプロセスを簡素化し、RAG によって生成される応答の精度を向上させる包括的なソリューションを開発できます。

表 1. Zilliz Cloud と AWS Bedrock を使用する利点

03. 概要

この記事では主に、Zilliz Cloud と AWS Bedrock を使用して RAG アプリケーションを構築する方法を紹介します。

Milvus 上に構築されたベクトルデータベースである Zilliz Cloud は、埋め込みベクトル用のスケーラブルなストレージおよび取得ソリューションを提供し、AWS Bedrock は言語生成用の強力な事前トレーニング済みモデルを提供します。サンプルコードを通じて、Zilliz Cloud と AWS Bedrock に接続し、外部ソースからデータをロードし、データを処理して分割し、最後に完全な RAG リンクを構築する方法を示します。この記事で構築された RAG アプリケーションは、LLM が幻覚を引き起こし、不正確な応答を提供する可能性を最小限に抑え、最新の NLP モデルとベクトルデータベース間の相乗効果を最大限に発揮します。このチュートリアルが、他の人が RAG アプリケーションを構築する際に同様のテクニックを使用するきっかけとなることを願っています。