A Brief Look at How the Python 3.5 Global Interpreter Lock (GIL) Is Implemented

Notes on the Python 3 global interpreter lock

The environment for this article is Python 3.5.2.

The Python global interpreter lock

In CPython, the global interpreter lock, or GIL, is a mutex that prevents multiple
native threads from executing Python bytecodes at once. This lock is necessary mainly
because CPython’s memory management is not thread-safe. (However, since the GIL
exists, other features have grown to depend on the guarantees that it enforces.)

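That is the definition of the global interpreter lock. Within one CPython process, even with several threads running, only one thread executes Python bytecode at any given moment, because CPython's memory management is not thread-safe. As a result, a single Python process cannot make full use of the cores of a modern multi-core processor. In a multithreaded program every thread must first compete for the GIL and can continue only after acquiring it, and the way threads compete for the lock has changed as Python has evolved.

In Python 2, the virtual machine counted bytecode instructions ("ticks") as it executed them. After 100 of them (the default check interval) the running thread released the GIL to give other threads a chance, and which waiting thread woke up next was left entirely to the operating system scheduler. The thread that woke up ran another 100 instructions, released the lock, and so on. From Python 3.2 onward the contention logic was reworked: threads now synchronize on shared global variables to acquire the GIL. Taking the pthread library on Linux as an example, a mutex and a condition variable guard those shared variables, and acquiring and releasing the GIL is built on top of them. By default a waiting thread waits 5000 microseconds. If the running thread has not voluntarily released the GIL by then, the waiter sets a shared global flag, the running thread notices that it has been asked to give up the GIL and releases it, and a waiting thread gets to run.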
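The 5000 microsecond default mentioned above can be read back from a running interpreter. This is a minimal sketch, assuming Python 3.2 or later, where sys.getswitchinterval() exists; the helper name switch_interval_us is made up for this example.

```python
import sys


def switch_interval_us():
    """Return the interpreter's GIL switch interval in microseconds."""
    # sys.getswitchinterval() reports the interval in seconds as a float.
    return round(sys.getswitchinterval() * 1000000)


if __name__ == "__main__":
    print(switch_interval_us())
```

On an interpreter whose interval has not been changed, this would be expected to print the default described in the text.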

A look at the Python 3 GIL implementation

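The multithreading example here is based on pthread. When a Python script that starts multiple threads runs, the main thread ends up in the bytecode interpreter function PyEval_EvalFrameEx, which contains a for loop that keeps executing the compiled bytecode: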

PyObject *
PyEval_EvalFrameEx(PyFrameObject *f, int throwflag)
{
    ...
    for (;;) {
        ...
        if (_Py_atomic_load_relaxed(&eval_breaker)) {
            if (*next_instr == SETUP_FINALLY) {
                /* Make the last opcode before
                   a try: finally: block uninterruptible. */
                goto fast_next_opcode;
            }
#ifdef WITH_TSC
            ticked = 1;
#endif
            if (_Py_atomic_load_relaxed(&pendingcalls_to_do)) {
                if (Py_MakePendingCalls() < 0)
                    goto error;
            }
#ifdef WITH_THREAD
            if (_Py_atomic_load_relaxed(&gil_drop_request)) {
                /* Give another thread a chance */
                if (PyThreadState_Swap(NULL) != tstate)
                    Py_FatalError("ceval: tstate mix-up");
                drop_gil(tstate);

                /* Other threads may run now */

                take_gil(tstate);

                /* Check if we should make a quick exit. */
                if (_Py_Finalizing && _Py_Finalizing != tstate) {
                    drop_gil(tstate);
                    PyThread_exit_thread();
                }

                if (PyThreadState_Swap(tstate) != NULL)
                    Py_FatalError("ceval: orphan tstate");
            }
#endif
            /* Check for asynchronous exceptions. */
            if (tstate->async_exc != NULL) {
                PyObject *exc = tstate->async_exc;
                tstate->async_exc = NULL;
                UNSIGNAL_ASYNC_EXC();
                PyErr_SetNone(exc);
                Py_DECREF(exc);
                goto error;
            }
        }
        ...
    }
}

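While the for loop executes bytecode, it checks eval_breaker at the top of each iteration (opcodes that jump straight to fast_next_opcode skip the check), and it looks at gil_drop_request only when eval_breaker is set. These two global variables are how the GIL is requested and released.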
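To make the two-level check concrete, here is a toy Python model of the flags. It is a sketch, not CPython code: the flag names come from the excerpt, while the EvalFlags class, its methods and run_opcodes are hypothetical. The rule that eval_breaker is recomputed from the other flags on reset is my reading of how the macros behave and should be treated as an assumption.

```python
class EvalFlags:
    """Toy model of the flags polled by the bytecode loop (not CPython code)."""

    def __init__(self):
        self.gil_drop_request = 0
        self.pendingcalls_to_do = 0
        self.eval_breaker = 0

    def _recompute(self):
        # Assumption: eval_breaker is a summary bit, set while any
        # slow-path request is still pending.
        self.eval_breaker = self.gil_drop_request | self.pendingcalls_to_do

    def set_gil_drop_request(self):
        # Models SET_GIL_DROP_REQUEST: both flags become 1.
        self.gil_drop_request = 1
        self.eval_breaker = 1

    def reset_gil_drop_request(self):
        # Models RESET_GIL_DROP_REQUEST.
        self.gil_drop_request = 0
        self._recompute()


def run_opcodes(flags, n):
    """Run n loop iterations and count how many took the slow path."""
    slow = 0
    for _ in range(n):
        if flags.eval_breaker:  # one cheap check per iteration
            slow += 1
            if flags.pendingcalls_to_do:
                # Stands in for Py_MakePendingCalls().
                flags.pendingcalls_to_do = 0
                flags._recompute()
            if flags.gil_drop_request:
                # The real loop calls drop_gil() and then take_gil() here.
                flags.reset_gil_drop_request()
    return slow
```

With no request pending, every iteration stays on the fast path; a single drop request costs exactly one slow-path iteration in this model.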

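Before any child thread has started, the check fails every time the main thread reaches it: eval_breaker and gil_drop_request both default to 0, so execution simply continues. When a new thread is started, thread_PyThread_start_new_thread in Modules/_threadmodule.c is called: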

static PyObject *
thread_PyThread_start_new_thread(PyObject *self, PyObject *fargs)
{
    ...
    PyEval_InitThreads(); /* Start the interpreter's thread-awareness */
    ident = PyThread_start_new_thread(t_bootstrap, (void*) boot);
    ...
}

Starting the new thread calls PyEval_InitThreads:

void
PyEval_InitThreads(void)
{
    if (gil_created())
        return;
    create_gil();
    take_gil(PyThreadState_GET());
    main_thread = PyThread_get_thread_ident();
    if (!pending_lock)
        pending_lock = PyThread_allocate_lock();
}

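It first calls gil_created to check whether the GIL already exists, and if not, create_gil initializes it. Threads are implemented on the pthread library and synchronize through pthread condition variables, which must be paired with a mutex, so create_gil initializes the mutexes and condition variables involved. take_gil is then called to acquire the GIL: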

static void take_gil(PyThreadState *tstate)
{
    int err;
    if (tstate == NULL)
        Py_FatalError("take_gil: NULL tstate");

    err = errno;
    MUTEX_LOCK(gil_mutex);                                  // lock the mutex guarding the GIL state

    if (!_Py_atomic_load_relaxed(&gil_locked))              // if the GIL is currently free, take it right away
        goto _ready;

    while (_Py_atomic_load_relaxed(&gil_locked)) {          // loop while another thread holds the GIL
        int timed_out = 0;
        unsigned long saved_switchnum;

        saved_switchnum = gil_switch_number;
        COND_TIMED_WAIT(gil_cond, gil_mutex, INTERVAL, timed_out);          // wait on the condition variable for up to INTERVAL microseconds
        /* If we timed out and no switch occurred in the meantime, it is time
           to ask the GIL-holding thread to drop it. */
        if (timed_out &&
            _Py_atomic_load_relaxed(&gil_locked) &&
            gil_switch_number == saved_switchnum) {                         // timed out, GIL still held, no switch occurred
            SET_GIL_DROP_REQUEST();                                         // ask the running thread to release the GIL
        }
    }
_ready:
#ifdef FORCE_SWITCHING
    /* This mutex must be taken before modifying gil_last_holder (see drop_gil()). */
    MUTEX_LOCK(switch_mutex);
#endif
    /* We now hold the GIL */
    _Py_atomic_store_relaxed(&gil_locked, 1);                               // mark the GIL as held
    _Py_ANNOTATE_RWLOCK_ACQUIRED(&gil_locked, /*is_write=*/1);              

    if (tstate != (PyThreadState*)_Py_atomic_load_relaxed(&gil_last_holder)) {
        _Py_atomic_store_relaxed(&gil_last_holder, (Py_uintptr_t)tstate);
        ++gil_switch_number;
    }

#ifdef FORCE_SWITCHING
    COND_SIGNAL(switch_cond);
    MUTEX_UNLOCK(switch_mutex);
#endif
    if (_Py_atomic_load_relaxed(&gil_drop_request)) {                       // we now hold the GIL, so a pending drop request is stale
        RESET_GIL_DROP_REQUEST();                                           // clear gil_drop_request if it was set
    }
    if (tstate->async_exc != NULL) {
        _PyEval_SignalAsyncExc();
    }

    MUTEX_UNLOCK(gil_mutex);                                                // unlock the mutex
    errno = err;
}

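The function shows that the while loop checks whether the GIL is held. If it is, COND_TIMED_WAIT makes the thread wait for a while. If the condition variable is signaled before the timeout, the waiting thread wakes up, and if gil_locked has been cleared by then the loop exits and the thread continues. If the wait times out instead, and the GIL is still held with no switch in the meantime, SET_GIL_DROP_REQUEST sets eval_breaker and gil_drop_request to 1, so the running thread sees in its for loop that it must give up the GIL. The thread that timed out can then acquire the GIL and get its turn to run.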

COND_TIMED_WAIT is defined as follows:

#define COND_TIMED_WAIT(cond, mut, microseconds, timeout_result) \
{ \
    int r = PyCOND_TIMEDWAIT(&(cond), &(mut), (microseconds)); \
    if (r < 0) \
        Py_FatalError("PyCOND_WAIT(" #cond ") failed"); \
    if (r) /* 1 == timeout, 2 == impl. can't say, so assume timeout */ \
        timeout_result = 1; \
    else \
        timeout_result = 0; \
} \

It calls PyCOND_TIMEDWAIT, which in the pthread build is an inline function:

/* return 0 for success, 1 on timeout, -1 on error */
Py_LOCAL_INLINE(int)
PyCOND_TIMEDWAIT(PyCOND_T *cond, PyMUTEX_T *mut, PY_LONG_LONG us)
{
    int r;
    struct timespec ts;
    struct timeval deadline;

    PyCOND_GETTIMEOFDAY(&deadline);
    PyCOND_ADD_MICROSECONDS(deadline, us);
    ts.tv_sec = deadline.tv_sec;
    ts.tv_nsec = deadline.tv_usec * 1000;

    r = pthread_cond_timedwait((cond), (mut), &ts);
    if (r == ETIMEDOUT)
        return 1;
    else if (r)
        return -1;
    else
        return 0;
}

This shows that pthread_cond_timedwait is what makes the thread wait: the waiting thread wakes up either when the deadline passes or earlier, when another thread signals cond.
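One detail worth spelling out is that pthread_cond_timedwait takes an absolute deadline, not a duration, which is why PyCOND_TIMEDWAIT first reads the current time and then adds the wait. The arithmetic can be sketched in Python as follows; deadline_timespec is a hypothetical helper written for this illustration, and it assumes the carry from microseconds into seconds is what PyCOND_ADD_MICROSECONDS takes care of.

```python
def deadline_timespec(now_sec, now_usec, wait_us):
    """Turn 'now + wait_us' into an absolute (tv_sec, tv_nsec) pair."""
    total_usec = now_usec + wait_us
    tv_sec = now_sec + total_usec // 1000000  # carry whole seconds
    tv_usec = total_usec % 1000000            # keep 0 <= usec < 1000000
    return tv_sec, tv_usec * 1000             # timespec stores nanoseconds


if __name__ == "__main__":
    # A 5000 us wait that crosses a second boundary.
    print(deadline_timespec(10, 999000, 5000))
```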


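This clears up the whole picture: everything revolves around pthread_cond_timedwait. If the thread holding the GIL has not released it within the waiting period, it is asked to give it up. In Python 3.5.2 the time each waiting thread waits is INTERVAL, which is 5000 microseconds by default.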

#define DEFAULT_INTERVAL 5000
static unsigned long gil_interval = DEFAULT_INTERVAL;
#define INTERVAL (gil_interval >= 1 ? gil_interval : 1)
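From Python code the same interval is exposed through sys.getswitchinterval() and sys.setswitchinterval(), both of which work in seconds rather than microseconds. The sketch below mirrors the clamping done by the INTERVAL macro and wraps a temporary change in a context manager; effective_interval and switch_interval are names invented for this example, not part of any library.

```python
import sys
from contextlib import contextmanager


def effective_interval(gil_interval_us):
    """Mirror of the INTERVAL macro: never wait less than 1 microsecond."""
    return gil_interval_us if gil_interval_us >= 1 else 1


@contextmanager
def switch_interval(seconds):
    """Temporarily change the switch interval and restore it afterwards."""
    old = sys.getswitchinterval()
    sys.setswitchinterval(seconds)
    try:
        yield
    finally:
        sys.setswitchinterval(old)


if __name__ == "__main__":
    with switch_interval(0.001):
        print(sys.getswitchinterval())
```

A shorter interval makes waiting threads ask for the GIL sooner at the cost of more switching; whether that helps depends on the workload, so treat any value other than the default as something to measure.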

For the details of pthread_cond_timedwait, consult its documentation. Everything from create_gil to take_gil exists to set up the conditions this function needs.

When the wait times out, timeout_result is 1 and the SET_GIL_DROP_REQUEST macro sets the flags. The thread currently running the for loop then calls drop_gil(tstate) to release the GIL:

static void drop_gil(PyThreadState *tstate)
{
    if (!_Py_atomic_load_relaxed(&gil_locked))
        Py_FatalError("drop_gil: GIL is not locked");
    /* tstate is allowed to be NULL (early interpreter init) */
    if (tstate != NULL) {
        /* Sub-interpreter support: threads might have been switched
           under our feet using PyThreadState_Swap(). Fix the GIL last
           holder variable so that our heuristics work. */
        _Py_atomic_store_relaxed(&gil_last_holder, (Py_uintptr_t)tstate);
    }

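    MUTEX_LOCK(gil_mutex);                                              // lock the mutex guarding the GIL state
    _Py_ANNOTATE_RWLOCK_RELEASED(&gil_locked, /*is_write=*/1);
    _Py_atomic_store_relaxed(&gil_locked, 0);                           // mark the GIL as free
    COND_SIGNAL(gil_cond);                                              // signal the condition variable to wake a waiter
    MUTEX_UNLOCK(gil_mutex);                                            // unlock the mutex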

#ifdef FORCE_SWITCHING
    if (_Py_atomic_load_relaxed(&gil_drop_request) && tstate != NULL) {
        MUTEX_LOCK(switch_mutex);
        /* Not switched yet => wait */
        if ((PyThreadState*)_Py_atomic_load_relaxed(&gil_last_holder) == tstate) {
            RESET_GIL_DROP_REQUEST();
            /* NOTE: if COND_WAIT does not atomically start waiting when
               releasing the mutex, another thread can run through, take
               the GIL and drop it again, and reset the condition
               before we even had a chance to wait for it. */
            COND_WAIT(switch_cond, switch_mutex);
        }
        MUTEX_UNLOCK(switch_mutex);
    }
#endif

}

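At this point the lock has been released, and the thread goes straight back into take_gil to wait for its next wake-up. That is how multithreaded execution is scheduled: it comes down to which thread acquires the lock first, and a waiting thread is woken once the lock becomes available. This is the basic idea behind thread scheduling in Python; corrections are welcome if anything is missing.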
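The whole handshake can be imitated in a few lines of Python with threading.Condition, whose timed wait plays the role of pthread_cond_timedwait. This is only a toy model under simplifying assumptions: ToyGIL and run_workers are hypothetical names, the FORCE_SWITCHING path is left out, and the switch counter is bumped on every acquisition rather than only when the holder changes.

```python
import threading


class ToyGIL:
    """Toy re-creation of the take_gil/drop_gil handshake (not CPython code)."""

    def __init__(self, interval=0.005):
        self.interval = interval           # seconds, like the 5000 us default
        self.cond = threading.Condition()  # stands in for gil_mutex + gil_cond
        self.locked = False                # gil_locked
        self.drop_request = False          # gil_drop_request
        self.switch_number = 0             # gil_switch_number

    def take(self):
        with self.cond:
            while self.locked:
                saved = self.switch_number
                notified = self.cond.wait(self.interval)
                # Timed out and nobody switched meanwhile: ask the holder to yield.
                if not notified and self.locked and self.switch_number == saved:
                    self.drop_request = True
            self.locked = True
            self.switch_number += 1
            self.drop_request = False      # a request aimed at the old holder is stale

    def drop(self):
        with self.cond:
            self.locked = False
            self.cond.notify()


def run_workers(n_threads=2, steps=20000):
    """Run n_threads counters that share one ToyGIL and return their counts."""
    gil = ToyGIL(interval=0.001)
    counts = [0] * n_threads

    def worker(i):
        gil.take()
        for _ in range(steps):
            counts[i] += 1                 # the "bytecode" being executed
            if gil.drop_request:           # polled like eval_breaker
                gil.drop()
                gil.take()
        gil.drop()

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counts
```

Calling run_workers() should return one full count per thread. As in the C code without FORCE_SWITCHING, nothing stops the thread that just dropped the lock from retaking it before a waiter wakes up, so the model guarantees progress but not fairness.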

Summary

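Since Python 3.2, the biggest change to the GIL is that it relies on the thread synchronization and wake-up mechanisms provided by the operating system to schedule threads. When other threads are waiting, each thread holds the GIL for roughly 5000 microseconds at a time, which replaces the earlier scheme of switching threads after a fixed count of executed bytecodes. To understand the mechanism in the Python source, it helps to look at example code for pthread_cond_timedwait, since all the locking and signaling is organized around that function. This article is only a brief analysis of the GIL's flow and principles; corrections are welcome if there are any mistakes.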
