OpenAI GPT-4 Code Interpreter test

An analysis of the OpenAI GPT-4 Code Interpreter beta

OpenAI recently launched a beta of the Code Interpreter feature for GPT-4, a ChatGPT mode that can write and execute Python code and accept file uploads. This post gives a basic analysis of how it behaves.

GPT-4 Code Interpreter

Main functions

  1. File information acquisition: Code Interpreter picks up information from the file name and uses generated Python code to process the file according to its type. For example, PDF files are parsed as text, while PNG images are compressed before being passed in (the exact input format is currently unclear).
  2. Python code generation: Code Interpreter generates code suited to the type of the input file. The output includes STDOUT and STDERR as well as the processing result, RESULT. These are displayed collapsed.
  3. Handling content beyond the token limit: Code Interpreter uses a generated external tool to retrieve and extract only the content the user asks for. That content is used as input; the rest stays cached as a file and is not read directly.
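
The STDOUT / STDERR / RESULT split described in item 2 can be illustrated with a minimal sketch. This is not how OpenAI implements the sandbox; the function name `run_snippet` and the convention that a snippet stores its final value in a variable called `result` are assumptions made only for this example.

```python
import contextlib
import io
import traceback


def run_snippet(source: str) -> dict:
    """Run a code string and collect STDOUT, STDERR and a RESULT value.

    Assumption: the snippet leaves its final value in a variable named
    `result`. The real tool's convention is not documented in this post.
    """
    stdout, stderr = io.StringIO(), io.StringIO()
    namespace: dict = {}
    with contextlib.redirect_stdout(stdout), contextlib.redirect_stderr(stderr):
        try:
            exec(source, namespace)
        except Exception:
            # Like an interpreter, send the traceback to STDERR.
            traceback.print_exc()
    return {
        "STDOUT": stdout.getvalue(),
        "STDERR": stderr.getvalue(),
        "RESULT": namespace.get("result"),
    }


if __name__ == "__main__":
    print(run_snippet("print('rows: 3')\nresult = 1 + 2"))
```

Running untrusted code with `exec` is only acceptable inside an isolated sandbox, which is presumably what the hosted feature provides.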

Function tests

Code Interpreter was tested on several different file types.

Chart (PNG)

Code Interpreter obtains relevant information from the file name, and there may be a system prompt that helps the model notice it. However, the image is compressed during processing, so the content of visually complex tables may not be read correctly.
PNG Test

Long text (PDF)

For PDF files, Code Interpreter generates and executes simple code, for example using PyPDF2 to read the file, and outputs the processed content as STDOUT, STDERR and RESULT.
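
The generated code is not reproduced in this post, but a PyPDF2-based extraction of the kind described might look roughly like the sketch below. It assumes PyPDF2 2.0 or later (for `PdfReader`); the helper `describe_pages` and the file name in the usage note are hypothetical.

```python
from typing import List


def extract_pdf_text(path: str) -> List[str]:
    """Return the extracted text of each page in the PDF at `path`."""
    # Imported lazily so the helper below works without PyPDF2 installed.
    from PyPDF2 import PdfReader  # third-party: pip install PyPDF2

    reader = PdfReader(path)
    # extract_text() may yield nothing for image-only pages.
    return [page.extract_text() or "" for page in reader.pages]


def describe_pages(pages: List[str]) -> str:
    """Build a short per-page summary, similar to what might go to STDOUT."""
    lines = [f"page {i}: {len(text)} chars" for i, text in enumerate(pages, start=1)]
    return "\n".join(lines)
```

A call such as `print(describe_pages(extract_pdf_text("paper.pdf")))` would print one line per page, where `paper.pdf` stands in for the uploaded file.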

A paragraph late in the document (beyond the token limit) was chosen to check how GPT-4 allocates tokens. The test shows that GPT-4 does not read the entire file directly but guides the user to select a portion of it. GPT-4 then generates an external tool, uses it to retrieve and extract the content the user specified, passes the result to the model as context, and processes it together with the user's prompt.
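
The retrieve-then-prompt flow above can be sketched as follows. The actual tool GPT-4 generated is not shown in the tests, so this is only an illustration: the word-overlap scoring and the names `retrieve` and `build_context` are assumptions, not the observed implementation.

```python
import re
from typing import List


def split_paragraphs(text: str) -> List[str]:
    """Split on blank lines and drop empty pieces."""
    return [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]


def retrieve(text: str, query: str, top_k: int = 1) -> List[str]:
    """Return the `top_k` paragraphs sharing the most words with `query`."""
    query_words = set(re.findall(r"\w+", query.lower()))

    def score(paragraph: str) -> int:
        return len(query_words & set(re.findall(r"\w+", paragraph.lower())))

    # sorted() is stable, so ties keep their original document order.
    ranked = sorted(split_paragraphs(text), key=score, reverse=True)
    return ranked[:top_k]


def build_context(text: str, query: str, prompt: str) -> str:
    """Combine the retrieved excerpt with the user's prompt."""
    excerpt = "\n\n".join(retrieve(text, query))
    return f"Context:\n{excerpt}\n\nPrompt:\n{prompt}"
```

Only the excerpt reaches the model in this scheme; the rest of the document stays on disk, which matches the caching behaviour described earlier.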

PDF Test
PDF Test - Detailed

Short code (ipynb)

For short code files, Code Interpreter can generate a simple parsing tool to obtain the text and pass it to the model as RESULT.
ipynb Test
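
A simple parser of the kind mentioned for notebooks could be as small as the sketch below. It relies on the fact that an `.ipynb` file is JSON with a top-level `cells` list; the function name `notebook_code_cells` is made up for this example and is not taken from the generated code.

```python
import json
from typing import List


def notebook_code_cells(notebook_json: str) -> List[str]:
    """Return the source of every code cell in a notebook JSON string."""
    notebook = json.loads(notebook_json)
    cells = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", "")
        # nbformat stores `source` as a string or a list of line strings.
        cells.append("".join(source) if isinstance(source, list) else source)
    return cells
```

For a short notebook, the joined cell sources fit comfortably in the context window, which is consistent with the test result above.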

Long code (C++)

However, for code files longer than the maximum number of tokens, Code Interpreter fails to output the complete code. It outputs only part of it and loads that part into the model as context.

This shows that Code Interpreter is still limited when handling text that exceeds the token limit.
Long Code Test
Long Code Test - Result
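
The truncation seen in the long-code test behaves roughly like the following sketch, which keeps whole lines until a budget is exhausted. Counting whitespace-separated words is a crude stand-in and an assumption here; GPT-4's real tokenizer counts differently, and the function name `truncate_to_budget` is hypothetical.

```python
from typing import Tuple


def truncate_to_budget(text: str, max_tokens: int) -> Tuple[str, bool]:
    """Keep whole lines until a rough token budget is used up.

    Assumption: tokens are approximated by whitespace-separated words,
    which is not how GPT-4's tokenizer counts them.
    Returns the kept text and a flag saying whether anything was cut.
    """
    kept, used = [], 0
    for line in text.splitlines():
        cost = len(line.split())
        if used + cost > max_tokens:
            return "\n".join(kept), True  # the rest never reaches the model
        kept.append(line)
        used += cost
    return "\n".join(kept), False
```

Returning the flag makes it possible to tell the user that the model only saw part of the file, which the tested version did not appear to do clearly.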

Origin blog.csdn.net/m0_56661101/article/details/131654361