AI technology has entered a period of explosive growth, and the most eye-catching development is the GPT (Generative Pre-trained Transformer) family of models, which stands out for its powerful text generation and apparent intelligence. The arrival of ChatGPT marks a big step forward, from AI that was often mocked as clumsy to AI that is genuinely useful. With the GPT-3.5 API now open, more and more developers can build innovative applications for natural language processing and text generation on top of it. This article introduces the basic features, application scenarios, and technical details of the GPT-3.5 API so that readers can better understand what it can do. This article was written with the assistance of ChatGPT.
What are GPT and ChatGPT?
GPT (Generative Pre-trained Transformer) is a natural language processing model developed by OpenAI and based on the Transformer architecture. Its core idea is unsupervised pre-training on a large-scale text corpus to build a general-purpose language model, which can then be used for downstream tasks such as text classification, summarization, and dialogue systems.
ChatGPT is an application built on the GPT model that focuses on dialogue. Thanks to large-scale pre-training, ChatGPT can generate replies that fit the conversation it is given. Unlike traditional dialogue systems, it does not require developers to hand-write rules or label data for each scenario, because the model has learned the regularities and logical relationships of language from massive amounts of data. Within a conversation, each additional exchange gives ChatGPT more context about the user's intent and needs, so its answers can become more accurate.
In general, GPT and ChatGPT are natural language processing technologies that combine large datasets with advanced algorithms to produce natural, fluent communication. Both have broad application prospects, including customer service bots, smart assistants, smart homes, and more.
GPT and the OpenAI API
Two common core parameters (important)
temperature
Adjusts the diversity and randomness of the generated text. At each step, the model's scores for the candidate next tokens are divided by the temperature before being turned into a probability distribution. A low temperature sharpens the distribution, so the most likely tokens are chosen even more often; a high temperature flattens it, so less common tokens get a better chance of appearing. Higher values therefore tend to produce bolder, more creative text, while lower values produce more conservative, predictable text. The value of temperature normally ranges from 0 to 2.
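To make the effect of temperature concrete, here is a minimal sketch of temperature scaling applied to a softmax. The function name and the three example scores are hypothetical, and the sketch does not handle a temperature of exactly 0, which in practice amounts to always picking the most likely token.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities after dividing by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive in this sketch")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for three candidate tokens.
logits = [2.0, 1.0, 0.1]
for t in (0.2, 1.0, 2.0):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

At the low temperature almost all of the probability lands on the first token, while at the high temperature the three tokens end up much closer together.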
top_p
Prunes the candidate tokens by cumulative probability (also known as nucleus sampling). Tokens are sorted from most to least likely and their probabilities are added up; as soon as the running total reaches the top_p value, the remaining tokens are discarded. For example, a top_p of 0.3 means that only the most likely tokens that together make up the top 30% of the probability mass are considered. This avoids overly random or inappropriate output and makes the result more reasonable and controllable.
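The pruning step can be sketched in a few lines. This is an illustration of the idea rather than OpenAI's implementation; the function name and the toy distribution are hypothetical.

```python
def top_p_filter(probs, top_p):
    """Keep the most likely tokens until their cumulative probability
    reaches top_p, then renormalize the tokens that were kept."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    kept = {}
    cumulative = 0.0
    for token, p in ranked:
        kept[token] = p
        cumulative += p
        if cumulative >= top_p:
            break
    total = sum(kept.values())
    return {token: p / total for token, p in kept.items()}

# Hypothetical next-token distribution.
probs = {"the": 0.5, "a": 0.3, "cat": 0.15, "zebra": 0.05}
print(top_p_filter(probs, 0.3))
print(top_p_filter(probs, 0.9))
```

With top_p set to 0.3 only the single most likely token survives; with 0.9 the unlikely "zebra" is the only token removed.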
1. chat
Address: api.openai.com/v1/chat/com…[1]
Method: POST
This API is mainly used for conversations. The ChatGPT service on the official OpenAI website is not available in every region, but many local websites offer their own ChatGPT-style service, and those services are built on this API. The request parameters and response are listed below; the full parameter list is in the official reference at https://platform.openai.com/docs/api-reference/chat
Request parameters
Parameter | Type | Required | Description |
---|---|---|---|
model | string | yes | The name of the model to use. At the time of writing, the supported models are gpt-4, gpt-4-0314, gpt-4-32k, gpt-4-32k-0314, gpt-3.5-turbo, and gpt-3.5-turbo-0301. Note that access to the GPT-4 models must be applied for. |
messages | array | yes | The messages of the conversation so far |
temperature | float | no | Controls the randomness of the generated text. The higher the value, the more random the result. |
max_tokens | int | no | The maximum number of tokens to generate |
n | int | no | The number of replies to generate |
stop | string | no | A stop sequence. When it appears in the generated text, generation stops automatically. |
The messages parameter is the most important one. Let's look at the example from the official documentation.
[
{
"role":"system",
"content":"You are a helpful assistant."
},
{
"role":"user",
"content":"Who won the world series in 2020?"
},
{
"role":"assistant",
"content":"The Los Angeles Dodgers won the World Series in 2020."
}
]
From the example, we can see that messages is an array of objects, each with a role field and a content field. The role field takes one of three values: system, user, or assistant. system sets the role the AI plays in the current conversation; in the example, the AI is told to act as a helpful assistant. user marks a question from the user, and assistant marks an answer from the assistant.
You may wonder why a request has to carry so much content. Recall that ChatGPT remembers context when we chat with it. This is the secret: the API itself does not store the conversation on the server. Instead, the client sends the earlier questions and answers along with every new request, and that is how the model appears to remember the context. There is a pitfall here: the context also counts toward the number of tokens, so a long context drives up the cost. For fees, see the section on pricing later in the article.
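The pattern of resending the history with every request can be sketched as a small helper. This is a minimal sketch, not the official client: the Conversation class and its method names are hypothetical, the follow-up question is made up for illustration, and the code only builds the JSON body without sending it.

```python
import json

class Conversation:
    """Keeps the running message list that must be resent with each request."""

    def __init__(self, system_prompt, model="gpt-3.5-turbo"):
        self.model = model
        self.messages = [{"role": "system", "content": system_prompt}]

    def build_request(self, user_text):
        """Append the user's message and return the JSON body to POST."""
        self.messages.append({"role": "user", "content": user_text})
        return {"model": self.model, "messages": list(self.messages)}

    def record_reply(self, assistant_text):
        """Store the assistant's answer so the next request includes it."""
        self.messages.append({"role": "assistant", "content": assistant_text})

chat = Conversation("You are a helpful assistant.")
first = chat.build_request("Who won the world series in 2020?")
chat.record_reply("The Los Angeles Dodgers won the World Series in 2020.")
second = chat.build_request("Where was it played?")  # hypothetical follow-up
print(json.dumps(second, indent=2))
```

The second request body contains four messages, not one, which is why long conversations consume more tokens on every turn.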
response
{
"id": "chatcmpl-123",
"object": "chat.completion",
"created": 1677652288,
"choices": [{
"index": 0,
"message": {
"role": "assistant",
"content": "\n\nHello there, how may I assist you today?"
},
"finish_reason": "stop"
}],
"usage": {
"prompt_tokens": 9,
"completion_tokens": 12,
"total_tokens": 21
}
}
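Reading a response shaped like the sample above comes down to two lookups: the reply text sits in choices[0].message.content, and the token count that is billed sits in usage. The helper name below is hypothetical, and the sketch assumes exactly the fields shown in the sample.

```python
import json

# A response body with the same shape as the sample above.
raw = r"""
{
  "id": "chatcmpl-123",
  "object": "chat.completion",
  "created": 1677652288,
  "choices": [{
    "index": 0,
    "message": {
      "role": "assistant",
      "content": "\n\nHello there, how may I assist you today?"
    },
    "finish_reason": "stop"
  }],
  "usage": {"prompt_tokens": 9, "completion_tokens": 12, "total_tokens": 21}
}
"""

def extract_reply(response):
    """Return the first reply's text and the total number of tokens used."""
    choice = response["choices"][0]
    text = choice["message"]["content"].strip()
    return text, response["usage"]["total_tokens"]

reply, total_tokens = extract_reply(json.loads(raw))
print(reply)
print("tokens used:", total_tokens)
```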
2. completion
Address: api.openai.com/v1/completi…[2]
Method: POST
Completes a task according to the user's prompt, mainly text tasks. The difference from the chat API above is the set of models it accepts, and different models have different capabilities. The request parameters are listed below.
Parameter | Type | Required | Description |
---|---|---|---|
model | string | yes | The model to use. The models supported here are text-davinci-003, text-davinci-002, text-curie-001, text-babbage-001, and text-ada-001. You can also use your own fine-tuned model. |
prompt | string | yes | The starting text to complete |
max_tokens | int | no | The maximum number of tokens to generate. Default is 16. The prompt's tokens plus max_tokens cannot exceed the model's context length. |
temperature | float | no | Controls how random the generated text is. Default is 1; the higher the value, the more random the text. |
top_p | float | no | Nucleus sampling: only the most likely tokens whose cumulative probability reaches top_p are considered. Default is 1, which keeps all tokens; lower values make the text more conservative. |
n | int | no | The number of completions to generate. Default is 1. |
stream | bool | no | Whether to stream the generated text back as it is produced. Default is false. |
stop | list of string | no | A list of stop sequences; generation stops when one of them is produced. |
presence_penalty | float | no | Penalizes tokens that have already appeared in the text, which encourages the model to move on to new topics. Default is 0. |
frequency_penalty | float | no | Penalizes tokens in proportion to how often they have already appeared, which reduces verbatim repetition. Default is 0. |
best_of | int | no | Generates best_of candidate completions on the server and returns the best n of them. Default is 1. |
logprobs | int | no | Returns the log probabilities of the given number of most likely tokens at each position. Default is null. |
echo | bool | no | Whether to include the prompt in the returned text. Default is false. |
response
{
"id": "cmpl-uqkvlQyYK7bGYrRHQ0eXlWi7",
"object": "text_completion",
"created": 1589478378,
"model": "text-davinci-003",
"choices": [
{
"text": "\n\nThis is indeed a test",
"index": 0,
"logprobs": null,
"finish_reason": "length"
}
],
"usage": {
"prompt_tokens": 5,
"completion_tokens": 7,
"total_tokens": 12
}
}
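In the sample response, finish_reason is "length", which indicates that generation stopped because it ran into the max_tokens limit; a value of "stop" indicates a natural end or a stop sequence. A minimal sketch of checking for this, using a hypothetical helper name and a trimmed copy of the sample response:

```python
def was_truncated(response):
    """Return True if any choice stopped because it hit the token limit."""
    return any(
        choice["finish_reason"] == "length" for choice in response["choices"]
    )

# Trimmed copy of the sample response, written as a Python dict.
response = {
    "choices": [
        {
            "text": "\n\nThis is indeed a test",
            "index": 0,
            "logprobs": None,
            "finish_reason": "length",
        }
    ],
    "usage": {"prompt_tokens": 5, "completion_tokens": 7, "total_tokens": 12},
}

if was_truncated(response):
    print("Output was cut off; consider raising max_tokens.")
```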
3. edits
Address: api.openai.com/v1/edits[3]
Method: POST
This API is usually used for text error correction and polishing. The parameter list is as follows:
Parameter | Type | Required | Description |
---|---|---|---|
model | string | yes | The models supported here are text-davinci-edit-001 and code-davinci-edit-001 |
input | string | no | The text to be edited. Defaults to an empty string. |
instruction | string | yes | Tells the AI how to edit the input |
temperature | float | no | Controls the randomness of the generated text. Higher temperatures produce more unexpected text. Default is 1. |
top_p | float | no | Nucleus sampling threshold: only the most likely tokens whose cumulative probability reaches top_p are considered. Default is 1. |
n | int | no | The number of edits to return. Default is 1. |
response
{
"object": "edit",
"created": 1589478378,
"choices": [
{
"text": "What day of the week is it?",
"index": 0
}
],
"usage": {
"prompt_tokens": 25,
"completion_tokens": 32,
"total_tokens": 57
}
}
4. images
(1) Generate image
Address: api.openai.com/v1/images/g…[4]
Method: POST
Parameter list
Parameter | Type | Required | Description |
---|---|---|---|
prompt | string | yes | A text description that guides the API toward the image to generate. |
n | int | no | The number of images to generate (maximum 10). Default is 1. |
size | string | no | The size of the generated image: "256x256", "512x512", or "1024x1024". Default is "1024x1024". |
response_format | string | no | The format in which the image is returned: "url" (a link to the image) or "b64_json" (base64-encoded data). Default is "url". |
user | string | no | A unique identifier for the end user; rarely used. |
response
{
"created": 1589478378,
"data": [
{
"url": "https://..."
},
{
"url": "https://..."
}
]
}
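A request body for image generation can be validated before it is sent, and the links can be pulled out of a response shaped like the sample above. This is a minimal sketch with hypothetical helper names; the set of allowed sizes and the 1024x1024 default are my reading of the official reference and should be checked against the current documentation.

```python
# Assumption: the sizes accepted by the endpoint at the time of writing.
ALLOWED_SIZES = {"256x256", "512x512", "1024x1024"}

def build_image_request(prompt, n=1, size="1024x1024"):
    """Validate the arguments and return the JSON body for image generation."""
    if not prompt:
        raise ValueError("prompt is required")
    if not 1 <= n <= 10:
        raise ValueError("n must be between 1 and 10")
    if size not in ALLOWED_SIZES:
        raise ValueError(f"unsupported size: {size}")
    return {"prompt": prompt, "n": n, "size": size}

def image_urls(response):
    """Collect the image links from a response with response_format 'url'."""
    return [item["url"] for item in response["data"]]

body = build_image_request("a watercolor painting of a lighthouse", n=2)
print(body)

sample = {"created": 1589478378, "data": [{"url": "https://..."}, {"url": "https://..."}]}
print(image_urls(sample))
```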
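(2) Edit image
Edits an image according to a prompt.
Address: api.openai.com/v1/images/e…[5]
Method: POST
Parameter list
Parameter | Type | Required | Description |
---|---|---|---|
image | string | yes | The image to edit. Currently only PNG is supported, and the file must be smaller than 4 MB. |
mask | string | no | A second image. According to the official documentation, it marks which areas of image should be edited. (The author does not work with image processing and does not fully understand this; readers who do are welcome to discuss it in the comments.) |
prompt | string | yes | A text description that guides the API toward the image to generate. |
n | int | no | The number of images to generate (maximum 10). Default is 1. |
size | string | no | The size of the generated image: "256x256", "512x512", or "1024x1024". Default is "1024x1024". |
response_format | string | no | The format in which the image is returned: "url" (a link to the image) or "b64_json" (base64-encoded data). Default is "url". |
user | string | no | A unique identifier for the end user; rarely used. |
The response is the same as above.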
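5. embeddings
Computes a vector from the given input; commonly used for search, clustering, classification, and similar tasks.
Address: api.openai.com/v1/embeddin…[6]
Method: POST
Parameter list:
Parameter | Type | Required | Description |
---|---|---|---|
model | string | yes | The embedding model to use, for example text-embedding-ada-002 |
input | string | yes | The text to compute the embedding for |
Response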
{
"object": "list",
"data": [
{
"object": "embedding",
"embedding": [
0.0023064255,
-0.009327292,
.... (1536 floats total for ada-002)
-0.0028842222
], // the embedding vector
"index": 0
}
],
"model": "text-embedding-ada-002",
"usage": {
"prompt_tokens": 8,
"total_tokens": 8
}
}
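The similarity function officially recommended is cosine similarity. For search, for example, we first compute the vectors for all the data to be searched, then compute the vector for the query keywords, and compare them using cosine similarity. For production-grade search, a vector database is recommended.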
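The search flow described above can be sketched with plain Python. The vectors here are tiny hypothetical stand-ins for real embeddings (which have 1536 dimensions for ada-002), and the function and document names are made up for illustration.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

def rank_documents(query_vec, doc_vecs):
    """Return document names ordered from most to least similar to the query."""
    scored = [(cosine_similarity(query_vec, vec), name) for name, vec in doc_vecs.items()]
    return [name for _, name in sorted(scored, reverse=True)]

# Hypothetical precomputed embeddings for three documents.
docs = {
    "doc_a": [0.9, 0.1, 0.0],
    "doc_b": [0.0, 1.0, 0.0],
    "doc_c": [0.5, 0.5, 0.0],
}
query = [1.0, 0.0, 0.0]
print(rank_documents(query, docs))
```

Scanning every document like this is fine for small collections; a vector database replaces the linear scan with an index once the data grows.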
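6. fine-tunes
Fine-tuning is a deep learning term: it means taking an already trained model and training it further on a new dataset so that it adapts to a new task or dataset. The process adjusts the model's parameters (such as weights and biases) so that it fits the new data better and performs better overall. There are quite a few APIs in this group; only the two most important ones are listed here.
(1) Upload file
Address: api.openai.com/v1/files[7]
Method: POST
This method has only two parameters: file, the file to upload, and purpose, a string that states what the upload is for. Here is a sample request and response. The request is shown as JSON for readability; the file itself is sent as a form upload.
request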
{
"file": "mydata.jsonl",
"purpose": "fine-tune"
}
response
{
"id": "file-XjGxS3KTG0uNmNOK362iJua3",
"object": "file",
"bytes": 140,
"created_at": 1613779121,
"filename": "mydata.jsonl",
"purpose": "fine-tune"
}
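The uploaded file is a JSONL file, that is, one JSON object per line. The sketch below assumes the prompt/completion record format that the fine-tunes endpoint used at the time of writing; the helper name and the example pairs are hypothetical.

```python
import json

def to_jsonl(examples):
    """Serialize (prompt, completion) pairs as one JSON object per line."""
    lines = [
        json.dumps({"prompt": prompt, "completion": completion}, ensure_ascii=False)
        for prompt, completion in examples
    ]
    return "\n".join(lines) + "\n"

# Hypothetical training pairs.
examples = [
    ("Translate to French: cheese ->", " fromage"),
    ("Translate to French: bread ->", " pain"),
]
data = to_jsonl(examples)
print(data, end="")
```

Writing the returned string to a file such as mydata.jsonl gives something that can be passed as the file parameter.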
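(2) Create a fine-tune
Address: api.openai.com/v1/fine-tun…[8]
Method: POST
Parameter list (only the common and important ones)
Parameter | Type | Required | Description |
---|---|---|---|
model | string | no | The base model to fine-tune. Only "ada", "babbage", "curie", "davinci", or a model fine-tuned after 2022-04-21 is supported. Defaults to "curie". |
training_file | string | yes | The file ID, that is, the id returned by the previous API |
Note that creating a fine-tuned model takes time, generally more than half an hour.
Response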
{
"id": "ft-AF1WoRqd3aJAHsqc9NY7iL8F",
"object": "fine-tune",
"model": "curie",
"created_at": 1614807352,
"events": [
{
"object": "fine-tune-event",
"created_at": 1614807352,
"level": "info",
"message": "Job enqueued. Waiting for jobs ahead to complete. Queue number: 0."
}
],
"fine_tuned_model": null,
"hyperparams": {
"batch_size": 4,
"learning_rate_multiplier": 0.1,
"n_epochs": 4,
"prompt_loss_weight": 0.1
},
"organization_id": "org-...",
"result_files": [],
"status": "pending",
"validation_files": [],
"training_files": [
{
"id": "file-XGinujblHPwGLSztz8cPS8XY",
"object": "file",
"bytes": 1547276,
"created_at": 1610062281,
"filename": "my-data-train.jsonl",
"purpose": "fine-tune-train"
}
],
"updated_at": 1614807352
}
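Currently supported models
We can call api.openai.com/v1/models[9] to see which models are available to us. Each model is described in detail below.
GPT-3.5
Model | Description | Max tokens | Training data |
---|---|---|---|
gpt-3.5-turbo | The most capable and most cost-effective model in the GPT-3.5 family, optimized for chat and also suitable for traditional text generation tasks. | 4,096 | Up to September 2021 |
gpt-3.5-turbo-0301 | A snapshot of gpt-3.5-turbo. It is supported for three months and will not receive updates. | 4,096 | Up to September 2021 |
text-davinci-003 | Can handle any language task, with better quality, longer output, and more consistent instruction following than the curie, babbage, or ada models. Also supports inserting completions within text. | 4,097 | Up to June 2021 |
text-davinci-002 | Similar capabilities to text-davinci-003, but trained with supervised fine-tuning instead of reinforcement learning. | 4,097 | Up to June 2021 |
code-davinci-002 | Optimized for code completion tasks. | 8,001 | Up to June 2021 |
All of the models above are GPT-3.5 models and can understand and generate natural language or code. gpt-3.5-turbo is the most capable and most cost-effective of them and suits both chat and traditional text generation. text-davinci-003 can handle any language task with better quality, longer output, and more consistent instruction following, and it supports inserting completions within text. code-davinci-002 is optimized for code completion.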
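GPT-4
Model | Description | Max tokens | Training data |
---|---|---|---|
gpt-4 | The newest GPT model at the time of writing, supporting text generation and processing. Compared with the GPT-3.5 models, GPT-4 is more accurate in complex reasoning scenarios and has also been improved for chat. | 8,192 | Up to September 2021 |
gpt-4-0314 | A snapshot of gpt-4. It will not receive updates and is supported for three months only, ending on June 14, 2023. | 8,192 | Up to September 2021 |
gpt-4-32k | Same capabilities as the base gpt-4 model, but with 4 times the context length. It will be updated with the latest model iteration. | 32,768 | Up to September 2021 |
gpt-4-32k-0314 | A snapshot of gpt-4-32k. It will not receive updates and is supported for three months only, ending on June 14, 2023. | 32,768 | Up to September 2021 |
For some basic tasks the difference between GPT-4 and the GPT-3.5 models is not obvious, but GPT-4 is noticeably stronger in more complex reasoning scenarios. Note that GPT-4 is currently in limited beta, and only authorized users can use it.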
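Applying for an API key and pricing
Every API above must be called with an API key, which can be created at platform.openai.com/account/api…[10]. To count tokens, use the tokenizer on the OpenAI website at platform.openai.com/tokenizer[11]. For token pricing, see the official page openai.com/pricing[12].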
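The API key travels in the Authorization header as a bearer token. The sketch below builds, but does not send, a request to the models endpoint mentioned earlier using only the standard library. The helper name, the OPENAI_API_KEY environment variable, and the placeholder key are assumptions made for illustration.

```python
import os
import urllib.request

def build_models_request(api_key):
    """Build (but do not send) a GET request that lists the available models."""
    return urllib.request.Request(
        "https://api.openai.com/v1/models",
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )

# Assumption: the key is kept in an environment variable, not in the source code.
api_key = os.environ.get("OPENAI_API_KEY", "sk-placeholder")
request = build_models_request(api_key)
print(request.get_method(), request.full_url)
# To actually send it: urllib.request.urlopen(request)
```

Keeping the key out of the source file makes it harder to leak it by accident, for example when the code is committed to a public repository.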