GPT's Open API: just read this article (it teaches you how to build with it)

AI technology is developing explosively, and the most eye-catching result is the GPT (Generative Pre-trained Transformer) family of models, which impresses with its generation ability. The arrival of ChatGPT marks a big step from clumsy, barely usable AI toward genuinely capable systems. With the opening of the GPT-3.5 interface, more and more developers can build innovative natural language processing and text generation applications on top of it. This article introduces the basic features, application scenarios, and related technical details of the GPT-3.5 interface, so that readers can better understand what it can do. This article was written with the assistance of ChatGPT.

What are GPT and ChatGPT?

GPT (Generative Pre-trained Transformer) is a natural language processing model based on the Transformer architecture, developed by OpenAI. Its core idea is unsupervised pre-training on a large-scale text corpus to build a general language model. The model can then be used for various downstream tasks such as text classification, summarization, and dialogue systems.

ChatGPT is a specific application based on the GPT model that focuses on dialogue. Given the conversation text as input, it generates replies that fit the conversation. Unlike traditional dialogue systems, ChatGPT does not rely on hand-designed rules and logic: the underlying model is trained on massive amounts of data and learns the patterns and logical relationships of language from that data. Within a conversation, it uses the preceding turns as context, which helps it interpret the user's intent and answer questions more accurately.

In general, GPT and ChatGPT are new natural language processing technologies that use big data and advanced algorithms to communicate in natural, fluent language. Both have broad application prospects in many fields, including customer service robots, smart assistants, and smart homes.

GPT and Open API

Two common core parameters (important)
temperature

Adjusts the diversity and randomness of the generated text. At each step, the model's scores for the candidate tokens are divided by the temperature before being turned into a probability distribution. A lower temperature sharpens the distribution, so likely tokens are chosen even more often and the output is more conservative and repeatable. A higher temperature flattens it, so less common tokens appear more often and the output is bolder and more creative. The value normally ranges from 0 to 2.
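The effect of temperature can be illustrated with a small sketch. This is not OpenAI's implementation; it only shows the standard idea of dividing scores by the temperature before a softmax. The function name and the three example scores are made up for illustration.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities after dividing by the temperature."""
    if temperature <= 0:
        # The real API accepts 0; this sketch only handles positive values.
        raise ValueError("temperature must be positive in this sketch")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for three candidate tokens.
logits = [2.0, 1.0, 0.1]
low = softmax_with_temperature(logits, 0.2)   # sharp: the top token dominates
high = softmax_with_temperature(logits, 2.0)  # flat: rarer tokens gain probability
```

With these numbers the top token receives almost all of the probability at temperature 0.2 and only about half of it at temperature 2.0.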

top_p

Used to prune the candidate tokens in the generative model. Tokens are sorted by probability and their probabilities are accumulated; once the cumulative probability reaches the top_p value, the remaining tokens are discarded. For example, a top_p of 0.3 means that only the most likely tokens that together make up the top 30% of the probability mass are considered. This avoids overly random or inappropriate text and makes the output more reasonable and controllable.
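The pruning step described for top_p can be sketched as follows. This is a simplified illustration under the assumption that the token probabilities are already known; the function name and the toy vocabulary are hypothetical.

```python
def top_p_filter(probs, top_p):
    """Keep the most likely tokens until their cumulative probability
    reaches top_p, then renormalize the survivors."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    kept = []
    cumulative = 0.0
    for token, p in ranked:
        kept.append((token, p))
        cumulative += p
        if cumulative >= top_p:
            break
    total = sum(p for _, p in kept)
    return {token: p / total for token, p in kept}

# Hypothetical next-token distribution.
probs = {"the": 0.5, "a": 0.3, "cat": 0.15, "zebra": 0.05}
narrow = top_p_filter(probs, 0.3)  # only "the" survives
wide = top_p_filter(probs, 0.9)    # "the", "a", and "cat" survive
```

Sampling then happens only among the surviving tokens, which is why a small top_p gives more predictable output.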

1、chat

Address: api.openai.com/v1/chat/com…[1]

Method: POST

This API is mainly used for conversations. For well-known reasons, we cannot use the ChatGPT service on the official OpenAI website, but many domestic websites offer their own ChatGPT-style services, and these are in fact built on top of this API. The request parameters and response are listed below; the complete parameter list is in the official reference at https://platform.openai.com/docs/api-reference/chat

Request parameters

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| model | string | yes | Name of the model to use. As of this writing, the supported models are gpt-4, gpt-4-0314, gpt-4-32k, gpt-4-32k-0314, gpt-3.5-turbo, and gpt-3.5-turbo-0301. Note that the GPT-4 models require applying for access. |
| messages | array | yes | The conversation so far, as an array of message objects (explained below). |
| temperature | float | no | Controls the randomness of the generated text. The higher the value, the more random the result. |
| max_tokens | int | no | Maximum number of tokens to generate. |
| n | int | no | Number of replies to generate. |
| stop | string | no | Stop sequence. When it appears in the generated text, generation stops automatically. |

The messages parameter is the most important one. Let's look at the example from the official website.

[
    {
        "role":"system",
        "content":"You are a helpful assistant."
    },
    {
        "role":"user",
        "content":"Who won the world series in 2020?"
    },
    {
        "role":"assistant",
        "content":"The Los Angeles Dodgers won the World Series in 2020."
    }
]


From the example, we can see that messages is essentially an array of objects with role and content fields. The role field takes one of three values: system, user, and assistant. system sets the role the AI plays in the current conversation; in the example, the AI acts as a helpful assistant. user is the user's question, and assistant is the content of the assistant's answer.

You may wonder why the messages array has to carry so much content. Recall that when we use ChatGPT, it remembers the context of the conversation. This is the secret: the server does not store your earlier questions and answers. Instead, every request carries the conversation so far, and that is why the model appears to remember the context. There is a pitfall here: the context also counts toward the number of tokens, so a context that is too long drives up the cost. Fees are covered later in the article.
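Because the caller has to resend the context every time, a client usually keeps the conversation in a list and trims it when it grows. The sketch below shows one way to do that. The helper names and the "keep the system message plus the most recent turns" policy are assumptions for illustration, not part of the API.

```python
def new_conversation(system_prompt):
    """Start a history with a single system message."""
    return [{"role": "system", "content": system_prompt}]

def add_turn(messages, role, content):
    """Append one message, allowing only the three roles described above."""
    if role not in ("system", "user", "assistant"):
        raise ValueError(f"unknown role: {role}")
    messages.append({"role": role, "content": content})
    return messages

def trim_history(messages, max_messages):
    """Keep the system message plus the most recent messages (hypothetical policy)."""
    system = [m for m in messages if m["role"] == "system"][:1]
    rest = [m for m in messages if m["role"] != "system"]
    keep = max(max_messages - len(system), 0)
    return system + (rest[-keep:] if keep else [])

history = new_conversation("You are a helpful assistant.")
add_turn(history, "user", "Who won the world series in 2020?")
add_turn(history, "assistant", "The Los Angeles Dodgers won the World Series in 2020.")
add_turn(history, "user", "Where was it played?")

# Send `trimmed` as the messages parameter to cap the token cost.
trimmed = trim_history(history, 3)
```

Trimming by message count is crude; a real client would more likely trim by token count, since tokens are what is billed.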

response

{
  "id": "chatcmpl-123",
  "object": "chat.completion",
  "created": 1677652288,
  "choices": [{
    "index": 0,
    "message": {
      "role": "assistant",
      "content": "\n\nHello there, how may I assist you today?"
    },
    "finish_reason": "stop"
  }],
  "usage": {
    "prompt_tokens": 9,
    "completion_tokens": 12,
    "total_tokens": 21
  }
}
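Putting the request and response together, a minimal sketch using only the Python standard library might look like this. The helper names are hypothetical, the URL is passed in by the caller (use the address listed above), and the Bearer authorization header is how the official documentation describes passing the API key. Nothing is sent over the network here.

```python
import json
import urllib.request

def build_chat_request(url, api_key, messages, model="gpt-3.5-turbo", temperature=1.0):
    """Build (but do not send) a POST request for the chat API."""
    payload = {"model": model, "messages": messages, "temperature": temperature}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )

def extract_reply(response_body):
    """Return (reply text, total tokens) from a chat response body."""
    data = json.loads(response_body)
    message = data["choices"][0]["message"]
    return message["content"].strip(), data["usage"]["total_tokens"]

# Offline check against a body shaped like the sample response above.
sample_body = json.dumps({
    "choices": [{
        "index": 0,
        "message": {"role": "assistant",
                    "content": "\n\nHello there, how may I assist you today?"},
        "finish_reason": "stop",
    }],
    "usage": {"prompt_tokens": 9, "completion_tokens": 12, "total_tokens": 21},
})
reply, total_tokens = extract_reply(sample_body)

request = build_chat_request(
    "https://example.invalid/chat",  # placeholder; use the address listed above
    "sk-placeholder",                # placeholder API key
    [{"role": "user", "content": "Hello"}],
)
# Sending it would be: urllib.request.urlopen(request)
```

The reply text starts with two newlines in the sample, which is why the sketch strips whitespace before returning it.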

2、completion

Address: api.openai.com/v1/completi…[2]

Method: POST

Completes tasks according to the user's prompt, mainly text tasks. The difference from the chat API above is the set of models it can use, and different models have different capabilities. The request parameters are listed below.

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| model | string | yes | The model to use. The models supported here are text-davinci-003, text-davinci-002, text-curie-001, text-babbage-001, and text-ada-001. You can also use your own fine-tuned models. |
| prompt | string | yes | The starting text to generate from. |
| max_tokens | int | no | The maximum length of the generated text, in tokens. Default is 16. The prompt plus max_tokens cannot exceed the model's context length. |
| temperature | float | no | Controls how random the generated text is. Default is 1; the higher the value, the more random the text. |
| top_p | float | no | Nucleus sampling threshold, an alternative to temperature for controlling randomness. Default is 1; lower values restrict generation to the most likely tokens. |
| n | int | no | The number of completions to generate. Default is 1. |
| stream | bool | no | Whether to stream the generated text back. Default is False. |
| stop | list of string | no | A list of sequences at which generation stops. |
| presence_penalty | float | no | Penalizes tokens that have already appeared in the text, encouraging the model to move on to new topics. Default is 0. |
| frequency_penalty | float | no | Penalizes tokens in proportion to how often they have already appeared, reducing verbatim repetition. Default is 0. |
| best_of | int | no | Generates best_of candidate completions on the server and returns the best n of them. Default is 1. |
| logprobs | int | no | The number of most likely tokens for which log probabilities are returned at each position. Default is None. |
| echo | bool | no | Whether to include the prompt in the returned text. Default is False. |

response

{
  "id": "cmpl-uqkvlQyYK7bGYrRHQ0eXlWi7",
  "object": "text_completion",
  "created": 1589478378,
  "model": "text-davinci-003",
  "choices": [
    {
      "text": "\n\nThis is indeed a test",  
      "index": 0,
      "logprobs": null,
      "finish_reason": "length"
    }
  ],
  "usage": {
    "prompt_tokens": 5, 
    "completion_tokens": 7,
    "total_tokens": 12
  }
}
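In the sample above, finish_reason is "length", which indicates that generation stopped because the token limit was reached rather than at a natural end. A small sketch for flagging such truncated completions follows; the function name and output shape are made up for illustration.

```python
def summarize_completion(response):
    """Return the text of each choice and whether it was cut off by max_tokens."""
    results = []
    for choice in response["choices"]:
        results.append({
            "index": choice["index"],
            "text": choice["text"].strip(),
            "truncated": choice["finish_reason"] == "length",
        })
    return results

# A response shaped like the sample above.
sample = {
    "model": "text-davinci-003",
    "choices": [
        {"text": "\n\nThis is indeed a test", "index": 0,
         "logprobs": None, "finish_reason": "length"}
    ],
    "usage": {"prompt_tokens": 5, "completion_tokens": 7, "total_tokens": 12},
}
results = summarize_completion(sample)
```

When a choice is flagged as truncated, raising max_tokens (within the model's context length) is the usual remedy.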

3、edits

Address: api.openai.com/v1/edits[3]

Method: POST

This API is usually used for text error correction and polishing. The parameter list is as follows:

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| model | string | yes | The models supported here are text-davinci-edit-001 and code-davinci-edit-001. |
| input | string | no | The text to be edited. |
| instruction | string | yes | The instruction, that is, what you ask the AI to do with the input. |
| temperature | float | no | Controls the randomness of the generated text. Higher temperatures produce more unexpected text. Default is 1. |
| top_p | float | no | Nucleus sampling threshold, an alternative to temperature for controlling randomness. |
| n | int | no | The number of alternative texts to return. Default is 1. |

response

{
  "object": "edit",
  "created": 1589478378,
  "choices": [
    {
      "text": "What day of the week is it?",
      "index": 0
    }
  ],
  "usage": {
    "prompt_tokens": 25,
    "completion_tokens": 32,
    "total_tokens": 57
  }
}

4、images

(1) Generate Image

Address: api.openai.com/v1/images/g…[4]

Method: POST

parameter list

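| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| prompt | string | yes | The prompt that guides the API toward a specific kind of image. |
| n | int | no | The number of images to generate (maximum 10). Default is 1. |
| size | string | no | The size of the image to generate: "256x256", "512x512", or "1024x1024". The official reference lists "1024x1024" as the default. |
| response_format | string | no | The format in which the server returns the image: "url" (returned as a URL) or "b64_json" (returned as base64-encoded data). Default is "url". |
| user | string | no | A unique identifier for the user; rarely used. |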

response

{
  "created": 1589478378,
  "data": [
    {
      "url": "https://..." 
    },
    {
      "url": "https://..."
    }
  ]
}
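Reading the image response can be sketched as below. The "b64_json" key is the field name the official API reference uses for base64 output; treat it as an assumption if your service differs. The function name and sample data are hypothetical.

```python
import base64

def collect_images(response):
    """Split an image response into a list of URLs and a list of decoded image bytes."""
    urls, blobs = [], []
    for item in response["data"]:
        if "url" in item:
            urls.append(item["url"])
        elif "b64_json" in item:
            blobs.append(base64.b64decode(item["b64_json"]))
    return urls, blobs

# Made-up sample: one URL entry and one base64 entry.
sample = {
    "created": 1589478378,
    "data": [
        {"url": "https://example.com/a.png"},
        {"b64_json": base64.b64encode(b"fake-png-bytes").decode("ascii")},
    ],
}
urls, blobs = collect_images(sample)
```

In practice a single response contains only one of the two formats, depending on response_format; the mixed sample here is just to exercise both branches.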

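(2) Image edit

Edits an image according to a prompt.

Address: api.openai.com/v1/images/e…[5]

Method: POST

Parameter list

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| image | string | yes | The image to edit. Currently only PNG is supported, and the image must be smaller than 4 MB. |
| mask | string | no | A second image. According to the official explanation, it specifies which areas of image should be edited. (The author does not do image processing and does not fully understand this; readers who do are welcome to discuss it in the comments.) |
| prompt | string | yes | The prompt that guides the API toward a specific kind of image. |
| n | int | no | The number of images to generate (maximum 10). Default is 1. |
| size | string | no | The size of the image to generate: "256x256", "512x512", or "1024x1024". The official reference lists "1024x1024" as the default. |
| response_format | string | no | The format in which the server returns the image: "url" (returned as a URL) or "b64_json" (returned as base64-encoded data). Default is "url". |
| user | string | no | A unique identifier for the user; rarely used. |

The response is the same as above.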

5、embeddings

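Computes a vector from the given input. It is commonly used for search, clustering, classification, and similar tasks.

Address: api.openai.com/v1/embeddin…[6]

Method: POST

Parameter list:

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| model | string | yes | The embedding model to use, for example text-embedding-ada-002. |
| input | string | yes | The text to compute the vector for. |

response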

{
  "object": "list",
  "data": [
    {
      "object": "embedding",
      "embedding": [
        0.0023064255,
        -0.009327292,
        .... (1536 floats total for ada-002)
        -0.0028842222
      ], // the vector
      "index": 0
    }
  ],
  "model": "text-embedding-ada-002",
  "usage": {
    "prompt_tokens": 8,
    "total_tokens": 8
  }
}

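The officially recommended similarity function is cosine similarity. For search, for example, we first compute the vector above for all the data to be searched, then compute the vector for the keywords, and compare them using cosine similarity. For professional search, a vector database is recommended.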
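A minimal sketch of cosine-similarity search over precomputed vectors follows. The three-dimensional vectors and document names are made up to keep the example short; real embeddings from the API are much longer (1536 floats for ada-002).

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)

def rank_documents(query_vec, doc_vecs):
    """Return (name, score) pairs sorted from most to least similar."""
    scored = [(name, cosine_similarity(query_vec, vec)) for name, vec in doc_vecs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Hypothetical precomputed document vectors and a query vector.
docs = {
    "pricing": [0.9, 0.1, 0.0],
    "weather": [0.0, 0.2, 0.9],
    "billing": [0.8, 0.3, 0.1],
}
query = [1.0, 0.2, 0.0]
ranking = rank_documents(query, docs)
```

This linear scan is fine for a few thousand documents; beyond that, a vector database does the same comparison with an index.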

6、fine-tunes

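Fine-tuning is a deep learning term: taking an already trained model and continuing to train it on a new training set so that it adapts to a new task or dataset. The process usually adjusts the model's parameters (such as weights and biases) so that it fits the new data better and performs better overall. There are quite a few APIs here, so only the two most important are listed.

(1) Upload a file

Address: api.openai.com/v1/files[7]

Method: POST

This method has only two parameters: file, the file to upload, and purpose, a string describing what the upload is for. Let's look at an example request and response.

request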

{
  "file": "mydata.jsonl",
  "purpose": "fine-tune"
}

response

{
  "id": "file-XjGxS3KTG0uNmNOK362iJua3", 
  "object": "file",
  "bytes": 140,
  "created_at": 1613779121,
  "filename": "mydata.jsonl",
  "purpose": "fine-tune"
}
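The uploaded file is in JSONL format, one JSON object per line. The sketch below assumes the prompt/completion pair layout that the official fine-tuning guide describes for these base models; the helper name and the sample pairs are hypothetical.

```python
import json

def to_jsonl(examples):
    """Serialize prompt/completion pairs as JSONL text, one object per line."""
    lines = []
    for ex in examples:
        if not ex.get("prompt") or not ex.get("completion"):
            raise ValueError("each example needs a non-empty prompt and completion")
        record = {"prompt": ex["prompt"], "completion": ex["completion"]}
        lines.append(json.dumps(record, ensure_ascii=False))
    return "\n".join(lines) + "\n"

# Made-up training pairs.
examples = [
    {"prompt": "Translate to French: cat ->", "completion": " chat"},
    {"prompt": "Translate to French: dog ->", "completion": " chien"},
]
jsonl_text = to_jsonl(examples)

# Writing it out would be:
# with open("mydata.jsonl", "w", encoding="utf-8") as f:
#     f.write(jsonl_text)
```

Validating every line before upload is worthwhile, because a malformed file is only rejected after the upload round trip.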

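(2) Create fine-tunes

Address: api.openai.com/v1/fine-tun…[8]

Method: POST

Parameter list (only the common and important parameters are listed)

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| model | string | no | The base model to fine-tune. Only "ada", "babbage", "curie", "davinci", and models fine-tuned after 2022-04-21 are supported. |
| training_file | string | yes | The file ID, that is, the id returned by the previous API. |

Note that creating a fine-tuned model takes some time, generally more than half an hour.

response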

{
  "id": "ft-AF1WoRqd3aJAHsqc9NY7iL8F",
  "object": "fine-tune",
  "model": "curie",
  "created_at": 1614807352,
  "events": [
    {
      "object": "fine-tune-event",
      "created_at": 1614807352,
      "level": "info",
      "message": "Job enqueued. Waiting for jobs ahead to complete. Queue number: 0."
    }
  ],
  "fine_tuned_model": null,
  "hyperparams": {
    "batch_size": 4,
    "learning_rate_multiplier": 0.1,
    "n_epochs": 4,
    "prompt_loss_weight": 0.1
  },
  "organization_id": "org-...",
  "result_files": [],
  "status": "pending",
  "validation_files": [],
  "training_files": [
    {
      "id": "file-XGinujblHPwGLSztz8cPS8XY",
      "object": "file",
      "bytes": 1547276,
      "created_at": 1610062281,
      "filename": "my-data-train.jsonl",
      "purpose": "fine-tune-train"
    }
  ],
  "updated_at": 1614807352
}
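Since the job takes a while, a client typically fetches the job object repeatedly and checks whether it has finished. The sketch below only interprets a job object shaped like the response above. The function name is hypothetical, and treating a non-null fine_tuned_model as "ready" is an assumption based on that field being null while the job is pending.

```python
def job_summary(job):
    """Condense a fine-tune job object into the fields a caller usually checks."""
    events = job.get("events", [])
    return {
        "id": job["id"],
        "status": job["status"],
        # Assumption: the field is null until the fine-tuned model exists.
        "ready": job["fine_tuned_model"] is not None,
        "model_to_call": job["fine_tuned_model"],
        "latest_event": events[-1]["message"] if events else None,
    }

# A trimmed-down job object shaped like the response above.
pending_job = {
    "id": "ft-AF1WoRqd3aJAHsqc9NY7iL8F",
    "status": "pending",
    "fine_tuned_model": None,
    "events": [
        {"message": "Job enqueued. Waiting for jobs ahead to complete. Queue number: 0."}
    ],
}
summary = job_summary(pending_job)
```

Once the summary reports ready, the value in model_to_call is what goes into the model parameter of the completion API.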

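Currently supported models

We can call api.openai.com/v1/models[9] to see which models we are able to use. The models are described in detail below.

GPT3.5

| Model | Description | Max tokens | Training data |
| --- | --- | --- | --- |
| gpt-3.5-turbo | The most capable and most cost-effective model in the GPT-3.5 family. Optimized for chat, and also suitable for traditional language generation tasks. | 4,096 | Up to September 2021 |
| gpt-3.5-turbo-0301 | A snapshot of gpt-3.5-turbo. It is supported for a three-month period and will not be updated. | 4,096 | Up to September 2021 |
| text-davinci-003 | Can do any language task, with better quality, longer output, and more consistent instruction following than the curie, babbage, or ada models. Also supports inserting completions within text. | 4,097 | Up to June 2021 |
| text-davinci-002 | Similar capabilities to text-davinci-003, but trained with supervised fine-tuning instead of reinforcement learning. | 4,097 | Up to June 2021 |
| code-davinci-002 | Optimized for code-completion tasks. | 8,001 | Up to June 2021 |

All of the models above are GPT-3.5 models that can understand and generate natural language or code. gpt-3.5-turbo is the most capable and most cost-effective of them, suitable for chat and traditional text generation tasks. text-davinci-003 can do any language task with better quality, longer output, and more consistent instruction following, and it also supports inserting completions within text. code-davinci-002 is optimized for code-completion tasks.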

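GPT4

| Model | Description | Max tokens | Training data |
| --- | --- | --- | --- |
| gpt-4 | Currently the newest GPT model, supporting text generation and processing. Compared with the GPT-3.5 models, GPT-4 is more accurate in complex reasoning scenarios and is also better optimized for chat. | 8,192 | Up to September 2021 |
| gpt-4-0314 | A snapshot of GPT-4. It will not be updated and is supported for only three months, ending on June 14, 2023. | 8,192 | Up to September 2021 |
| gpt-4-32k | Same capabilities as the base GPT-4 model, but with 4 times the context length. It will be updated with the latest model iteration. | 32,768 | Up to September 2021 |
| gpt-4-32k-0314 | A snapshot of gpt-4-32k. It will not be updated and is supported for only three months, ending on June 14, 2023. | 32,768 | Up to September 2021 |

For some basic tasks, the difference between GPT-4 and the GPT-3.5 models is not significant, but in more complex reasoning scenarios GPT-4 is more capable. Note that GPT-4 is currently in limited beta, and only authorized users can use it.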

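Applying for an API key, and pricing

Every API above must be called with an API key, which you can apply for at platform.openai.com/account/api…[10]. To count tokens, use the tokenizer on the OpenAI website at platform.openai.com/tokenizer[11]. For token pricing, see the official page openai.com/pricing[12].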
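Since billing is per token, the usage block returned with each response can be turned into a cost estimate. The sketch below is a simple illustration; the function name is hypothetical and the per-1,000-token prices are placeholders, not real prices, so take the actual values from the pricing page.

```python
def estimate_cost(usage, price_per_1k_prompt, price_per_1k_completion):
    """Estimate the cost of one call from its usage block.

    Prices are per 1,000 tokens, in whatever currency the pricing page uses.
    """
    prompt_cost = usage["prompt_tokens"] / 1000 * price_per_1k_prompt
    completion_cost = usage["completion_tokens"] / 1000 * price_per_1k_completion
    return prompt_cost + completion_cost

# Usage block taken from the chat response sample earlier in the article.
usage = {"prompt_tokens": 9, "completion_tokens": 12, "total_tokens": 21}

# Placeholder prices for illustration only.
cost = estimate_cost(usage, price_per_1k_prompt=0.002, price_per_1k_completion=0.002)
```

Because the whole conversation is resent on every chat request, prompt_tokens grows with each turn, which is where most of the cost of long conversations comes from.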

Origin blog.csdn.net/weixin_54542328/article/details/134930864