Technology Sharing | ChatGPT API calls keep timing out? Here is one way to solve the problem

Reproducing the problem

When calling the ChatGPT API with streaming output, we often run into timeouts caused by network problems. Interestingly, the author found that a timeout hit during local debugging recovers on its own after 10 minutes (why 10 minutes is explained later), whereas on the server the request fails after a short wait and reports a timeout exception (error code 502).

The author believes local recovery is due to automatic retry, although the retry interval is rather long (the ChatGPT API itself has no retry function; the retry was added by the project). The server returns 502 because content sent from the back end to the front end has to pass through a gateway layer, and the gateway's timeout check is shorter than the automatic retry interval (10 minutes), so the gateway reports a timeout before the retry can take place.

Based on the above scenario, this article sets out to solve the ChatGPT API call timeout problem.

Optimization goals

Do not display timeout error messages to users.
Shorten the interval between retries after a timeout.

Solutions

The author considered two options.

The first is to eliminate the network problem entirely, but that is hard to do: the problem lies on the OpenAI server side, and even servers deployed overseas still time out.

The second is to rely on automatic retry. By adjusting the timeout so that a retry is triggered sooner, the response time improves, which makes this option feasible.
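The retry idea can be illustrated with a minimal sketch that uses only the standard library. The helper name `call_with_retry` and its parameters are hypothetical and are not part of the openai package or of the project described here; the sketch only shows how a short deadline per attempt lets a retry start quickly instead of waiting for one long timeout.

```python
import asyncio


async def call_with_retry(make_call, *, attempts=3, per_attempt_timeout=15.0, backoff=0.5):
    """Run make_call() up to `attempts` times, giving each try its own deadline.

    make_call is a zero-argument callable that returns a fresh awaitable,
    for example a lambda wrapping the real API call (hypothetical usage).
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            # wait_for cancels the pending call once the deadline passes.
            return await asyncio.wait_for(make_call(), timeout=per_attempt_timeout)
        except asyncio.TimeoutError as exc:
            last_error = exc
            if attempt < attempts:
                # Short, growing pause before the next try.
                await asyncio.sleep(backoff * attempt)
    raise TimeoutError(f"gave up after {attempts} attempts") from last_error


async def demo():
    """Simulate a call that stalls twice and then answers."""
    calls = {"n": 0}

    async def flaky():
        calls["n"] += 1
        if calls["n"] < 3:
            await asyncio.sleep(1.0)  # stands in for a stalled request
        return "ok"

    result = await call_with_retry(flaky, per_attempt_timeout=0.05, backoff=0.01)
    return result, calls["n"]
```

Because every attempt is bounded by `per_attempt_timeout`, the caller sees either a result or one final error, and the intermediate timeouts never reach the user.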

Implement the solution

While working on the solution, the author adjusted the timeout in two steps, from shallow to deep. If you only want the final solution, skip ahead to "Solution 2".

Operating environment:

Python: 3.10.7

openai: 0.27.6

Calling method:

openai.api_resources.chat_completion.ChatCompletion.acreate

(This is how ChatGPT is called asynchronously.)

Method call chain:

The timeout parameter ClientTimeout has four attributes in total: total, connect, sock_read and sock_connect.

# Method -> timeout-related parameter
openai.api_resources.chat_completion.ChatCompletion.acreate -> kwargs
openai.api_resources.abstract.engine_api_resource.EngineAPIResource.acreate -> params
openai.api_requestor.APIRequestor.arequest -> request_timeout
# request_timeout is converted to timeout at this step, so passing request_timeout is all that is needed
openai.api_requestor.APIRequestor.arequest_raw -> request_timeout
aiohttp.client.ClientSession.request -> kwargs
aiohttp.client.ClientSession._request -> timeout
    tm = TimeoutHandle(self._loop, real_timeout.total) -> ClientTimeout.total
    async with ceil_timeout(real_timeout.connect): -> ClientTimeout.connect
# Sub-branch 1
aiohttp.connector.BaseConnector.connect -> timeout
aiohttp.connector.TCPConnector._create_connection -> timeout
aiohttp.connector.TCPConnector._create_direct_connection -> timeout
aiohttp.connector.TCPConnector._wrap_create_connection -> timeout
    async with ceil_timeout(timeout.sock_connect): -> ClientTimeout.sock_connect
# Sub-branch 2
aiohttp.client_reqrep.ClientRequest.send -> timeout
aiohttp.client_proto.ResponseHandler.set_response_params -> read_timeout
aiohttp.client_proto.ResponseHandler._reschedule_timeout -> self._read_timeout
    if timeout:
        self._read_timeout_handle = self._loop.call_later(
            timeout, self._on_read_timeout
        ) -> ClientTimeout.sock_read

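Solution 1

The request_timeout parameter of the openai.api_requestor.APIRequestor.arequest_raw method accepts a tuple carrying the connect and total values. When nothing is passed, total falls back to TIMEOUT_SECS, which would account for the 10 minutes observed locally if that constant is 600 seconds, as it appears to be in this version of the library.

So request_timeout=(10, 300) can be set when calling openai.api_resources.chat_completion.ChatCompletion.acreate.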

# openai.api_requestor.APIRequestor.arequest_raw (excerpt)
async def arequest_raw(
    self,
    method,
    url,
    session,
    *,
    params=None,
    supplied_headers: Optional[Dict[str, str]] = None,
    files=None,
    request_id: Optional[str] = None,
    request_timeout: Optional[Union[float, Tuple[float, float]]] = None,
) -> aiohttp.ClientResponse:
    abs_url, headers, data = self._prepare_request_raw(
        url, supplied_headers, method, params, files, request_id
    )
    
    if isinstance(request_timeout, tuple):
        # A tuple is interpreted as (connect, total).
        timeout = aiohttp.ClientTimeout(
            connect=request_timeout[0],
            total=request_timeout[1],
        )
    else:
        # A single number (or None) only sets the total timeout.
        timeout = aiohttp.ClientTimeout(
            total=request_timeout if request_timeout else TIMEOUT_SECS
        )
    ...

This solution works, but only partially: it controls the connection time and the overall request time, yet it does not eliminate the timeout exception, because "request connection time" and "first character read time" are two different things. With only connect and total available, a request that connects but then receives no data is retried only after the total time (300 s) has elapsed, and the gateway timeout is not set that long.
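The difference between an overall deadline and a per-read deadline can be shown with a small simulation that needs no network. This is only a sketch: `read_stream` and `stalled_stream` are hypothetical names, and the per-chunk deadline merely imitates what the call chain above suggests sock_read does (a timer rescheduled whenever data arrives).

```python
import asyncio


async def read_stream(chunks, *, idle_timeout, total_timeout):
    """Collect chunks, failing fast if the gap before the next chunk exceeds idle_timeout."""

    async def consume():
        received = []
        iterator = chunks.__aiter__()
        while True:
            try:
                # Per-chunk deadline: restarts for every chunk, in the spirit of sock_read.
                chunk = await asyncio.wait_for(iterator.__anext__(), timeout=idle_timeout)
            except StopAsyncIteration:
                return received
            received.append(chunk)

    # Overall deadline: covers the whole exchange, in the spirit of total.
    return await asyncio.wait_for(consume(), timeout=total_timeout)


async def stalled_stream():
    """Simulated response that goes silent after the first chunk."""
    yield "Hello"
    await asyncio.sleep(1.0)  # the server stops sending for a while
    yield "world"
```

With `idle_timeout=0.05` and `total_timeout=5.0`, the stalled stream is abandoned after roughly 0.05 s of silence rather than after the full 5 s, which is the behavior Solution 1 cannot provide.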

Therefore, the author goes on to propose "Solution 2".

Solution 2

Use a monkey patch to override the openai.api_requestor.APIRequestor.arequest_raw method, mainly rewriting the handling of the request_timeout parameter so that it supports the native aiohttp.client.ClientTimeout parameters.

1. Create a new file api_requestor_mp.py and write the following code.

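# Imports assumed to match those used by openai/api_requestor.py in openai 0.27.6.
import asyncio
from typing import Dict, Optional, Union

import aiohttp
import requests

import openai
from openai import error, util
from openai.api_requestor import TIMEOUT_SECS, APIRequestor, _aiohttp_proxies_arg


# Note: the type of request_timeout has changed,
# Optional[Union[float, Tuple[float, float]]] -> Optional[Union[float, tuple]]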
async def arequest_raw(
        self,
        method,
        url,
        session,
        *,
        params=None,
        supplied_headers: Optional[Dict[str, str]] = None,
        files=None,
        request_id: Optional[str] = None,
        request_timeout: Optional[Union[float, tuple]] = None,
) -> aiohttp.ClientResponse:
    abs_url, headers, data = self._prepare_request_raw(
        url, supplied_headers, method, params, files, request_id
    )
    # Check the type of request_timeout and set sock_read and sock_connect when they are supplied
    if isinstance(request_timeout, tuple):
        timeout = aiohttp.ClientTimeout(
            connect=request_timeout[0],
            total=request_timeout[1],
            sock_read=None if len(request_timeout) < 3 else request_timeout[2],
            sock_connect=None if len(request_timeout) < 4 else request_timeout[3],
        )
    else:
        timeout = aiohttp.ClientTimeout(
            total=request_timeout if request_timeout else TIMEOUT_SECS
        )
    if files:
        # TODO: Use aiohttp.MultipartWriter to create the multipart form data here.
        # For now we use the private requests method that is known to have worked so far.
        data, content_type = requests.models.RequestEncodingMixin._encode_files(  # type: ignore
            files, data
        )
        headers["Content-Type"] = content_type
    request_kwargs = {
        "method": method,
        "url": abs_url,
        "headers": headers,
        "data": data,
        "proxy": _aiohttp_proxies_arg(openai.proxy),
        "timeout": timeout,
    }
    try:
        result = await session.request(**request_kwargs)
        util.log_info(
            "OpenAI API response",
            path=abs_url,
            response_code=result.status,
            processing_ms=result.headers.get("OpenAI-Processing-Ms"),
            request_id=result.headers.get("X-Request-Id"),
        )
        # Don't read the whole stream for debug logging unless necessary.
        if openai.log == "debug":
            util.log_debug(
                "API response body", body=result.content, headers=result.headers
            )
        return result
    except (aiohttp.ServerTimeoutError, asyncio.TimeoutError) as e:
        raise error.Timeout("Request timed out") from e
    except aiohttp.ClientError as e:
        raise error.APIConnectionError("Error communicating with OpenAI") from e

def monkey_patch():
    APIRequestor.arequest_raw = arequest_raw

2. Add the following to the header of the file that initializes the ChatGPT API:

from *.*.api_requestor_mp import monkey_patch

# Call the function so the patch is actually applied before any request is made.
monkey_patch()

With request_timeout=(10, 300, 15, 10), that is connect, total, sock_read and sock_connect in that order, the problem no longer appears during debugging.

The change was then handed over for testing and passed.
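To make the meaning of each tuple position explicit, the branch logic of the patched method can be mirrored in a small pure-Python helper. This is an illustrative sketch only: `timeout_kwargs` and `DEFAULT_TOTAL_SECS` are hypothetical names, and the default of 600 seconds is an assumption standing in for the library's TIMEOUT_SECS.

```python
from typing import Tuple, Union

# Hypothetical stand-in for the library default (TIMEOUT_SECS in the patched code).
DEFAULT_TOTAL_SECS = 600.0


def timeout_kwargs(request_timeout: Union[None, float, Tuple[float, ...]]) -> dict:
    """Mirror the patched branch logic and return keyword arguments for a timeout object."""
    if isinstance(request_timeout, tuple):
        # Positions: (connect, total[, sock_read[, sock_connect]])
        return {
            "connect": request_timeout[0],
            "total": request_timeout[1],
            "sock_read": None if len(request_timeout) < 3 else request_timeout[2],
            "sock_connect": None if len(request_timeout) < 4 else request_timeout[3],
        }
    # A single number (or None) only sets the total timeout.
    return {"total": request_timeout if request_timeout else DEFAULT_TOTAL_SECS}
```

The returned dictionary uses the same keyword names as aiohttp.ClientTimeout, so it could be expanded as `aiohttp.ClientTimeout(**timeout_kwargs((10, 300, 15, 10)))`. A two-element tuple keeps the Solution 1 behavior, leaving sock_read and sock_connect unset.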

Experience summary

Reading the code and tracing the method call chain directly is hard going. Finding the call chain through the exception stack is more convenient.
The request_timeout parameter exposed by the ChatGPT API is not flexible enough and has to be rewritten; while searching for ways to do this, the author found monkey patching to be very practical.
During the project, the author found that changing the code itself is not difficult. What is difficult is knowing "where to change", "how to change" and "why".
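The first point, reading the call chain off the exception stack, can be sketched with the standard traceback module. The functions `outer`, `middle` and `inner` are hypothetical placeholders for the real library frames; the helper `call_chain` is likewise an illustrative name.

```python
import traceback


def call_chain(exc: BaseException) -> list:
    """Return the function names on the stack, from the catch site down to the raise site."""
    return [frame.name for frame in traceback.extract_tb(exc.__traceback__)]


def inner():
    raise TimeoutError("simulated timeout")


def middle():
    inner()


def outer():
    middle()


def demo():
    try:
        outer()
    except TimeoutError as exc:
        return call_chain(exc)
```

Calling `demo()` returns the names in call order, starting with the function that caught the exception and ending with the one that raised it. Applied to a real timeout, the same listing points to the methods worth inspecting or patching.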
