ChatGPT Plugins Inside Story, Source Code and Case Practice

6.1 How ChatGPT Plugins Work

This section covers ChatGPT Plugins, an important topic. Many enterprise-level systems now wrap services behind a ChatGPT plugin: in effect they build an agent that encapsulates services or APIs, and callers invoke the agent rather than the service directly, with the agent handling the details internally. This section focuses on several aspects. The first, and the most crucial, is how the ChatGPT plugin works internally; for a developer or a technical decision-maker, little else can be discussed meaningfully without understanding this.

The author (Gavin, WeChat: NLP_Matrix_Space) mainly uses GPT-4 on OpenAI's official website and has installed several plugins. Clicking the model selector shows the installed plugins, as in Figure 6-1.

Figure 6-1 Schematic diagram of ChatGPT plugins

As shown in Figure 6-2, clicking Plugin store opens the plugin store. Many industry observers rate ChatGPT plugins very highly. Their assessments start from different angles, but share a common underlying logic: the ChatGPT plugin ecosystem will become a new generation of application store, driven by large models. Compare it with existing application stores such as Apple's App Store or Google Play and its core significance becomes clear; most importantly, you can develop plugins yourself, and doing so is very easy.

Figure 6-2 Plugin store

Returning to the OpenAI official website: OpenAI has implemented preliminary support for plugins in ChatGPT. Plugins are tools designed for language models with safety as a core principle; they help ChatGPT access up-to-date information, run computations, or use third-party services. Today's language models, while useful for a variety of tasks, are still limited: the only information they can learn from is their training data, which can become outdated and is one-size-fits-all. Plugins can act as the "eyes and ears" of language models, giving them access to recent or personalized information that was not included in the training data. In response to explicit user requests, plugins can also let language models perform safe, constrained actions on the user's behalf, increasing the overall usefulness of the system.

Because plugins are designed specifically for language models, you need not pay much attention to the underlying security design; through a plugin, you can access your own deployed services, third-party libraries, and so on. Plugins open OpenAI's system to external services and are a milestone. There is a classic saying here: ChatGPT itself is magic, but plugins are purer magic. ChatGPT's training data ends in September 2021; to access newer data, especially an enterprise's private data, the ChatGPT plugin makes this possible. Furthermore, if a user wants ChatGPT to perform actions through the applications it is connected to, plugins implement that as well. This all sounds exciting, but from an architectural perspective it is actually very simple.
The agent process encapsulates the API service or tool, together with the OpenAPI definition (OpenAPI Definition) and the plugin manifest (Plugin Manifest), as shown in Figure 6-3.
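To make the plugin manifest concrete, the sketch below builds a minimal ai-plugin.json as a Python dict, following the manifest fields used by ChatGPT plugins (schema version, human- and model-facing names and descriptions, auth, and a pointer to the OpenAPI spec). All names and URLs here are placeholders, not a real deployment.

```python
import json

# A minimal sketch of the ai-plugin.json manifest served from
# /.well-known/ai-plugin.json; names and URLs are placeholders.
manifest = {
    "schema_version": "v1",
    "name_for_human": "TODO Plugin",
    "name_for_model": "todo",
    "description_for_human": "Manage your TODO list.",
    "description_for_model": "Plugin for managing a user's TODO list.",
    "auth": {"type": "none"},
    "api": {
        "type": "openapi",
        "url": "https://example.com/openapi.yaml",
    },
    "logo_url": "https://example.com/logo.png",
    "contact_email": "support@example.com",
    "legal_info_url": "https://example.com/legal",
}

print(json.dumps(manifest, indent=2))
```

ChatGPT fetches this manifest, reads the `api.url` field, and downloads the OpenAPI definition to learn which endpoints the model may call.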

Figure 6-3 ChatGPT plugin architecture

6.2 ChatGPT Retrieval Plugin Source Code Analysis: Services and Local Server

Figure 6-4 shows the directory layout of the ChatGPT retrieval plugin code; please take a look at it.

Figure 6-4 ChatGPT retrieval plugin code directory

The ChatGPT retrieval plugin repository provides a flexible solution for semantic search and retrieval of documents using natural language queries. The directories of the retrieval plugin code are described in Table 6-1.

| Directory | Description |
| --- | --- |
| datastore | Contains the core logic for storing and querying document embeddings using various vector databases. |
| docs | Includes documentation on setting up and using each vector database provider, webhooks, and removing unused dependencies. |
| examples | Provides example configurations, authentication methods, and provider-specific examples. |
| local_server | Contains an implementation of the retrieval plugin configured for localhost testing. |
| models | Contains the data models used by the plugin, such as the document and metadata models. |
| scripts | Provides scripts for processing and uploading documents from different data sources. |
| server | Contains the main FastAPI server implementation. |
| services | Contains utility services for tasks such as chunking, metadata extraction, and instrumentation. |
| tests | Includes integration tests for the various vector database providers. |
| .well-known | Stores the plugin manifest file and OpenAPI schema, which define the plugin configuration and API specification. |

Table 6-1 Directory description of the ChatGPT retrieval plugin

The ChatGPT plugin is a chat extension specifically designed for large language models, enabling them to access up-to-date information, run computations, or interact with third-party services in response to user requests. Developers can expose an API through a website and provide a standardized manifest file describing the API to create plugins. ChatGPT uses these files and allows AI models to call developer-defined APIs.

A plugin consists of the following:

    • An API

    • An API schema (in OpenAPI YAML or JSON format)

    • A manifest (a JSON file) that defines plugin-related metadata

The ChatGPT retrieval plugin described in this section supports semantic search and retrieval of personal or organizational documents, allowing users to obtain the most relevant documents from data sources (such as files, notes, or emails) by asking questions or expressing needs in natural language. Enterprises can use this plugin to make internal documents available to employees through ChatGPT. The plugin uses OpenAI's text-embedding-ada-002 embedding model to generate embedding vectors for document chunks, and uses a vector database on the back end to store and query those vectors. As an open-source, self-hosted solution, developers can deploy their own retrieval plugin and register it with ChatGPT; the plugin supports multiple vector database providers, letting developers choose one from the supported list.
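To illustrate the store-and-query pattern behind the plugin, here is a toy in-memory vector store with cosine-similarity search. It is a teaching sketch only: real deployments use one of the supported vector database providers, and the embeddings come from text-embedding-ada-002 rather than the hand-written vectors used here.

```python
import math

class ToyVectorStore:
    """In-memory stand-in for the plugin's datastore layer."""

    def __init__(self):
        self._docs = {}  # id -> (embedding, text)

    def upsert(self, doc_id, embedding, text):
        # Insert or overwrite a document chunk and its embedding.
        self._docs[doc_id] = (embedding, text)

    def delete(self, doc_id):
        # Returns True if the document existed and was removed.
        return self._docs.pop(doc_id, None) is not None

    def query(self, embedding, top_k=3):
        # Rank stored chunks by cosine similarity to the query embedding.
        def cosine(a, b):
            dot = sum(x * y for x, y in zip(a, b))
            na = math.sqrt(sum(x * x for x in a))
            nb = math.sqrt(sum(x * x for x in b))
            return dot / (na * nb)

        scored = [
            (cosine(embedding, emb), doc_id, text)
            for doc_id, (emb, text) in self._docs.items()
        ]
        return sorted(scored, reverse=True)[:top_k]
```

A production vector database adds persistence, metadata filtering, and approximate nearest-neighbor indexing, but the upsert/query/delete interface mirrors the plugin's endpoints.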

The FastAPI server exposes the retrieval plugin's endpoints for upserting, querying, and deleting documents. Users can refine search results with metadata filters by source, date, author, or other criteria. The plugin can be hosted on any cloud platform that supports Docker containers, such as Fly.io, Heroku, Render, or Azure Container Apps. To keep the vector database up to date with the latest documents, the plugin can continuously process and store documents from various data sources, using incoming webhooks to the upsert and delete endpoints; tools like Zapier or Make can help configure webhooks based on events or schedules. A notable feature of the retrieval plugin is its ability to give ChatGPT memory. By using the plugin's upsert endpoint, ChatGPT can save snippets of conversation to the vector database for later reference, allowing it to remember and retrieve information from previous conversations; this helps enable a context-aware chat experience. The retrieval plugin lets ChatGPT search a vector database of content and add the results to the session, and different authentication methods can be chosen to secure the plugin.

The retrieval plugin is built with FastAPI, a Python web framework for building APIs. FastAPI makes it easy to develop, validate, and document API endpoints, and one of its benefits is automatically generated interactive API documentation via Swagger UI. When the API runs locally, you can use the Swagger UI at <local_host_url, e.g. http://0.0.0.0:8000>/docs to interact with the endpoints, test their functionality, and inspect the expected request and response schemas.

The retrieval plugin exposes the following endpoints for upserting, querying, and deleting documents in the vector database. All requests and responses are in JSON format and require a valid bearer token in the Authorization header.

  • /upsert: Uploads one or more documents and stores their text and metadata in the vector database. Documents are split into chunks of about 200 tokens, each chunk with a unique ID. The request body contains a list of documents, each with a text field and an optional ID and metadata. Metadata can contain the following optional fields: source, source_id, url, created_at, and author. The endpoint returns a list of IDs for the inserted documents, generated if none were initially provided.

  • /upsert-file: Uploads a single file (for example, PDF, TXT, DOCX, PPTX, or MD) and stores its text and metadata in the vector database. The file is converted to plain text and split into chunks of about 200 tokens, each with a unique ID. The endpoint returns a list containing the IDs generated for the inserted file.

  • /query: Queries the vector database with one or more natural language queries and optional metadata filters. The request body contains a list of queries, each with a query string and optional filter and top_k fields. Filters include source, source_id, document_id, url, created_at, and author. The top_k field specifies how many results to return for a given query; the default is 3. The endpoint returns a list of objects, each containing a list of the most relevant document chunks for the given query, along with their text, metadata, and similarity scores.

  • /delete: Deletes one or more documents from the vector database using IDs, a metadata filter, or a delete_all flag. The request body must contain at least one of the following parameters: ids, filter, or delete_all. The ids parameter is a list of document IDs; all chunks belonging to those documents will be deleted. The filter parameter can contain source, source_id, document_id, url, created_at, and author. The delete_all parameter is a boolean indicating whether to delete all documents from the vector database. The endpoint returns a boolean indicating whether the deletion succeeded.
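The roughly 200-token chunking mentioned for /upsert and /upsert-file can be sketched as follows. The real plugin counts model tokens (e.g. with a tokenizer such as tiktoken); this approximation splits on whitespace words instead, so the boundaries will differ from the actual implementation.

```python
def chunk_text(text: str, chunk_size: int = 200) -> list:
    """Split text into chunks of roughly chunk_size tokens.

    Whitespace-separated words stand in for model tokens here;
    the real plugin uses a proper tokenizer to count tokens.
    """
    words = text.split()
    return [
        " ".join(words[i:i + chunk_size])
        for i in range(0, len(words), chunk_size)
    ]

# A 450-word document becomes chunks of 200, 200, and 50 words.
chunks = chunk_text("word " * 450)
print([len(c.split()) for c in chunks])  # → [200, 200, 50]
```

Each chunk is then embedded and stored separately, which is why /query returns document chunks rather than whole documents.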

The OpenAPI schema only contains the /query endpoint because that is the only function ChatGPT needs to access. This way, ChatGPT can only use the plugin to retrieve relevant documents based on natural language queries. However, if developers also want ChatGPT to remember things for later, the /upsert endpoint can be used to save snippets of the conversation into the vector database.


Origin blog.csdn.net/duan_zhihua/article/details/131507396