Python Source Code Analysis: The Runtime Environment

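Copyright notice: this is an original article. Please credit the source when reposting, and contact the author for commercial use. https://blog.csdn.net/wenxueliu/article/details/80919911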

The Python runtime environment

The runtime environment is a global concept, whereas the execution environment refers to a single stack frame.

Once the runtime environment is ready, the function that executes the first line of code is
PyEval_EvalFrame.

PyObject *
PyEval_EvalFrame(PyFrameObject *f) {
    /* This is for backward compatibility with extension modules that
       used this API; core interpreter code should call
       PyEval_EvalFrameEx() */
    return PyEval_EvalFrameEx(f, 0);
}

PyObject *
PyEval_EvalFrameEx(PyFrameObject *f, int throwflag)
{
    PyThreadState *tstate = PyThreadState_GET();
    return tstate->interp->eval_frame(f, throwflag);
}

PyInterpreterState *
PyInterpreterState_New(void)
{
    //...
    interp->eval_frame = _PyEval_EvalFrameDefault;
    //...
}

PyObject *
_PyEval_EvalFrameDefault(PyFrameObject *f, int throwflag)
{
    //...
    co = f->f_code;
    names = co->co_names;
    consts = co->co_consts;
    fastlocals = f->f_localsplus;
    freevars = f->f_localsplus + co->co_nlocals;
    first_instr = (_Py_CODEUNIT *) PyBytes_AS_STRING(co->co_code);
    //...
    next_instr = first_instr;
    if (f->f_lasti >= 0) {
        next_instr += f->f_lasti / sizeof(_Py_CODEUNIT) + 1;
    }
    stack_pointer = f->f_stacktop;
    f->f_stacktop = NULL;       /* remains NULL unless yield suspends frame */
}

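The co_code field of a PyCodeObject holds the bytecode instructions together with their arguments.
In the Python 3 source shown here co_code is a PyBytesObject (note the PyBytes_AS_STRING call above; in Python 2 it was a PyStringObject), and its underlying character array holds the actual instructions.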
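
The fields that _PyEval_EvalFrameDefault reads from the code object can also be inspected from Python. The following is a minimal sketch, assuming CPython 3.6 or later (where every instruction occupies two bytes); the function scale_add is a made-up example and not part of the CPython source.

```python
import dis


def scale_add(a, b):
    """Hypothetical example function whose code object we inspect."""
    return a * b + 0.5


code = scale_add.__code__

# co_code is a bytes object holding the raw instruction stream.
raw = code.co_code
print(type(raw).__name__, len(raw))

# co_consts and co_varnames are the tuples the interpreter indexes into
# with an instruction's argument (co_names plays the same role for
# global and attribute names).
print(code.co_consts)
print(code.co_varnames)

# dis decodes the same bytes into (offset, opcode name, argument) triples.
instructions = [(ins.offset, ins.opname, ins.arg)
                for ins in dis.get_instructions(scale_add)]
for offset, opname, arg in instructions:
    print(offset, opname, arg)
```

The exact opcode names and offsets printed depend on the interpreter version, so they are not listed here.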

The Python virtual machine runs by repeating the following steps on co_code:

  1. Fetch an instruction
  2. Execute it
  3. Go back to step 1

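Here first_instr points to the first instruction, next_instr points to the next instruction to be fetched,
and f_lasti records the offset of the most recently executed instruction.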
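
The fetch/execute cycle and the roles of first_instr, next_instr and f_lasti can be sketched with a toy stack machine. This is only an illustration under simplifying assumptions: the opcode numbers, the eval_toy function and the two-byte opcode/argument layout are invented for this example and are not CPython's real values or API.

```python
# Hypothetical opcodes for a toy stack machine; these are NOT CPython's values.
NOP, LOAD_CONST, ADD, RETURN = 0, 1, 2, 3


def eval_toy(code, consts):
    """Run `code` (bytes of opcode/argument pairs); return (result, lasti)."""
    stack = []
    first_instr = 0            # index of the first instruction
    next_instr = first_instr   # index of the instruction to fetch next
    lasti = -1                 # offset of the last instruction that was started
    while True:
        # 1. fetch the instruction and its argument
        lasti = next_instr
        opcode, oparg = code[next_instr], code[next_instr + 1]
        next_instr += 2
        # 2. execute it
        if opcode == NOP:
            pass
        elif opcode == LOAD_CONST:
            stack.append(consts[oparg])
        elif opcode == ADD:
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif opcode == RETURN:
            return stack.pop(), lasti
        else:
            raise ValueError("unknown opcode %d" % opcode)
        # 3. fall through to the top of the loop, i.e. back to step 1


program = bytes([LOAD_CONST, 0, LOAD_CONST, 1, ADD, 0, NOP, 0, RETURN, 0])
result, lasti = eval_toy(program, consts=(40, 2))
print(result, lasti)  # → 42 8
```

The real loop differs in many ways (reference counting, error handling, computed gotos), but the shape of the cycle is the same.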

    for (;;) {
        if (_Py_atomic_load_relaxed(&eval_breaker)) {
            if (_Py_OPCODE(*next_instr) == SETUP_FINALLY ||
                _Py_OPCODE(*next_instr) == YIELD_FROM) {
                goto fast_next_opcode;
            }
            if (_Py_atomic_load_relaxed(&pendingcalls_to_do)) {
                if (Py_MakePendingCalls() < 0)
                    goto error;
            }
#ifdef WITH_THREAD
            if (_Py_atomic_load_relaxed(&gil_drop_request)) {
                /* Give another thread a chance */
                if (PyThreadState_Swap(NULL) != tstate)
                    Py_FatalError("ceval: tstate mix-up");
                drop_gil(tstate);

                /* Other threads may run now */

                take_gil(tstate);

                /* Check if we should make a quick exit. */
                if (_Py_Finalizing && _Py_Finalizing != tstate) {
                    drop_gil(tstate);
                    PyThread_exit_thread();
                }

                if (PyThreadState_Swap(tstate) != NULL)
                    Py_FatalError("ceval: orphan tstate");
            }
#endif
            /* Check for asynchronous exceptions. */
            if (tstate->async_exc != NULL) {
                PyObject *exc = tstate->async_exc;
                tstate->async_exc = NULL;
                UNSIGNAL_ASYNC_EXC();
                PyErr_SetNone(exc);
                Py_DECREF(exc);
                goto error;
            }
        }

    fast_next_opcode:
        f->f_lasti = INSTR_OFFSET();

        NEXTOPARG();

        switch (opcode) {
        TARGET(NOP)
            FAST_DISPATCH();

        TARGET(LOAD_FAST) {
            PyObject *value = GETLOCAL(oparg);
            if (value == NULL) {
                format_exc_check_arg(PyExc_UnboundLocalError,
                                     UNBOUNDLOCAL_ERROR_MSG,
                                     PyTuple_GetItem(co->co_varnames, oparg));
                goto error;
            }
            Py_INCREF(value);
            PUSH(value);
            FAST_DISPATCH();
        }

        PREDICTED(LOAD_CONST);
        TARGET(LOAD_CONST) {
            PyObject *value = GETITEM(consts, oparg);
            Py_INCREF(value);
            PUSH(value);
            FAST_DISPATCH();
        }
        //...

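More precisely, once Python has fetched an instruction and the argument it needs, it finds the matching case in the switch.
Each case is the implementation of one instruction. The interpreter executes the code in that case and then continues the loop.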
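
The value that the switch dispatches on is a small integer opcode. As a rough illustration, the standard dis module exposes the mapping between opcode names and numbers; the sketch below looks up the three opcodes whose TARGET() cases appear in the excerpt above.

```python
import dis

# The three opcodes whose TARGET() cases appear in the excerpt above.
numbers = {}
for name in ("NOP", "LOAD_FAST", "LOAD_CONST"):
    number = dis.opmap[name]       # opcode name -> opcode number
    numbers[name] = number
    # dis.opname is the reverse table: opcode number -> opcode name.
    print(name, number, dis.opname[number])
```

The numeric values are an implementation detail and change between CPython releases, so code should refer to opcodes by name.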

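Whether execution succeeds or fails, the reason for leaving the loop is recorded in why (in the version shown here, an enum why_code value such as WHY_RETURN or WHY_EXCEPTION); the value actually returned is held in retval.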

The whole execution process is a for loop wrapped around a switch/case, and every instruction is executed inside _PyEval_EvalFrameDefault.

Note: why is part of Python's exception-handling mechanism.

Threads and processes

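Through PyFrameObject we learned about the stack frame in which a function executes, and through PyCodeObject
we learned about the code itself; the entry point that actually runs the code is _PyEval_EvalFrameDefault.
But what lies outside the stack frame? Anyone familiar with how an operating system runs programs will guess the answer: a thread.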

Python uses a thread to model a physical CPU that executes instructions.

A Python thread is in fact a native operating-system thread with a thin wrapper layered on top.

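A thread normally lives inside a process, and in Python the counterpart of a process is the PyInterpreterState object.
Typically there is one PyInterpreterState with several PyThreadState objects under it. The threads
share some resources, and the PyThreadState objects take turns using the single bytecode execution engine.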
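
The "one interpreter, several thread states, each with its own frame" picture can be observed from Python. The sketch below relies on sys._current_frames(), a CPython-specific helper that returns the topmost frame of every thread; the worker function and the two Event objects are made up for the example.

```python
import sys
import threading

started = threading.Event()
release = threading.Event()


def worker():
    started.set()
    release.wait()   # block here until the main thread has taken a snapshot


thread = threading.Thread(target=worker)
thread.start()
started.wait()

# One entry per thread: thread id -> that thread's topmost frame.
frames = sys._current_frames()
thread_ids = set(frames)
worker_id = thread.ident
main_id = threading.get_ident()

# Walk the worker's own frame chain to see which functions it is inside.
worker_functions = []
frame = frames[worker_id]
while frame is not None:
    worker_functions.append(frame.f_code.co_name)
    frame = frame.f_back

release.set()
thread.join()

print(sorted(thread_ids))
print(worker_functions)
```

Both threads appear in the same snapshot because they belong to the same interpreter, yet each has a separate chain of frames.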

PyThreadState and PyInterpreterState

One interpreter maintains multiple PyThreadState objects.

Threads in Python are synchronized through the GIL (Global Interpreter Lock).
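
How often a running thread is asked to give up the GIL is governed by the switch interval, which the sys module exposes. The snippet below is a small sketch of reading and tuning it; the 0.001 value is an arbitrary choice for illustration. The gil_drop_request check in the evaluation loop shown earlier is, as far as the excerpt indicates, where a running thread notices such a request.

```python
import sys

# Interval (in seconds) after which a running thread is asked to
# release the GIL so that another waiting thread can run.
default_interval = sys.getswitchinterval()
print(default_interval)

# The interval can be tuned; restore the original value afterwards.
sys.setswitchinterval(0.001)
tuned_interval = sys.getswitchinterval()
sys.setswitchinterval(default_interval)
print(tuned_interval)
```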

typedef struct _is {

    struct _is *next;
    struct _ts *tstate_head;     // head of the list of thread states

    PyObject *modules;
    PyObject *modules_by_index;
    PyObject *sysdict;
    PyObject *builtins;
    PyObject *importlib;

    PyObject *codec_search_path;
    PyObject *codec_search_cache;
    PyObject *codec_error_registry;
    int codecs_initialized;
    int fscodec_initialized;

#ifdef HAVE_DLOPEN
    int dlopenflags;
#endif

    PyObject *builtins_copy;
    PyObject *import_func;
    /* Initialized to PyEval_EvalFrameDefault(). */
    _PyFrameEvalFunction eval_frame;  // execution engine
} PyInterpreterState;

typedef struct _ts {
    /* See Python/ceval.c for comments explaining most fields */

    struct _ts *prev;            // previous thread state
    struct _ts *next;            // next thread state
    PyInterpreterState *interp;  // owning interpreter (the "process")

    struct _frame *frame;        // current stack frame
    int recursion_depth;
    char overflowed; /* The stack has overflowed. Allow 50 more calls
                        to handle the runtime error. */
    char recursion_critical; /* The current calls must not cause
                                a stack overflow. */
    /* 'tracing' keeps track of the execution depth when tracing/profiling.
       This is to prevent the actual trace/profile code from being recorded in
       the trace/profile. */
    int tracing;
    int use_tracing;

    Py_tracefunc c_profilefunc;
    Py_tracefunc c_tracefunc;
    PyObject *c_profileobj;
    PyObject *c_traceobj;

    PyObject *curexc_type;
    PyObject *curexc_value;
    PyObject *curexc_traceback;

    PyObject *exc_type;
    PyObject *exc_value;
    PyObject *exc_traceback;

    PyObject *dict;  /* Stores per-thread state */

    int gilstate_counter;

    PyObject *async_exc; /* Asynchronous exception to raise */
    long thread_id; /* Thread id where this tstate was created */

    int trash_delete_nesting;
    PyObject *trash_delete_later;

    /* Called when a thread state is deleted normally, but not when it
     * is destroyed after fork().
     * Pain:  to prevent rare but fatal shutdown errors (issue 18808),
     * Thread.join() must wait for the join'ed thread's tstate to be unlinked
     * from the tstate chain.  That happens at the end of a thread's life,
     * in pystate.c.
     * The obvious way doesn't quite work:  create a lock which the tstate
     * unlinking code releases, and have Thread.join() wait to acquire that
     * lock.  The problem is that we _are_ at the end of the thread's life:
     * if the thread holds the last reference to the lock, decref'ing the
     * lock will delete the lock, and that may trigger arbitrary Python code
     * if there's a weakref, with a callback, to the lock.  But by this time
     * _PyThreadState_Current is already NULL, so only the simplest of C code
     * can be allowed to run (in particular it must not be possible to
     * release the GIL).
     * So instead of holding the lock directly, the tstate holds a weakref to
     * the lock:  that's the value of on_delete_data below.  Decref'ing a
     * weakref is harmless.
     * on_delete points to _threadmodule.c's static release_sentinel() function.
     * After the tstate is unlinked, release_sentinel is called with the
     * weakref-to-lock (on_delete_data) argument, and release_sentinel releases
     * the indirectly held lock.
     */
    void (*on_delete)(void *);
    void *on_delete_data;

    PyObject *coroutine_wrapper;
    int in_coroutine_wrapper;

    /* Now used from PyInterpreterState, kept here for ABI
       compatibility with PyThreadState */
    Py_ssize_t _preserve_36_ABI_1;
    freefunc _preserve_36_ABI_2[MAX_CO_EXTRA_USERS];

    PyObject *async_gen_firstiter;
    PyObject *async_gen_finalizer;

    /* XXX signal handlers should also be here */

} PyThreadState;

If you can draw the relationships among PyInterpreterState, PyThreadState and PyFrameObject,
you have a high-level picture of Python's execution engine.
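
One way to draw that picture is to model the three structures as plain Python classes. This is a deliberately simplified sketch: the class names InterpreterState, ThreadState and Frame and the methods new_thread, push_frame and pop_frame are invented for illustration, and only the link fields (tstate_head, prev, next, interp, frame, f_back) correspond to fields in the C structs.

```python
class Frame:
    """Stands in for PyFrameObject: the execution state of one call."""
    def __init__(self, name, back):
        self.name = name
        self.f_back = back        # previous frame in the same thread


class ThreadState:
    """Stands in for PyThreadState: one thread of execution."""
    def __init__(self, interp):
        self.interp = interp      # owning interpreter
        self.prev = None          # neighbours in the interpreter's
        self.next = None          # doubly linked list of threads
        self.frame = None         # frame currently being executed

    def push_frame(self, name):
        # Mirrors `back = tstate->frame`: the new frame remembers the old one.
        self.frame = Frame(name, back=self.frame)
        return self.frame

    def pop_frame(self):
        self.frame = self.frame.f_back


class InterpreterState:
    """Stands in for PyInterpreterState: owns the list of thread states."""
    def __init__(self):
        self.tstate_head = None

    def new_thread(self):
        tstate = ThreadState(self)
        tstate.next = self.tstate_head      # insert at the head of the list
        if self.tstate_head is not None:
            self.tstate_head.prev = tstate
        self.tstate_head = tstate
        return tstate


interp = InterpreterState()
t1 = interp.new_thread()
t2 = interp.new_thread()
t1.push_frame("<module>")
t1.push_frame("f")
print(t1.frame.name, t1.frame.f_back.name)  # → f <module>
```

The interpreter points at a list of threads, each thread points at its current frame, and each frame points back at its caller.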

Summary

First, when a stack frame is created in a thread, the thread's current frame is recorded as the new frame's predecessor:

back = tstate->frame

The frames are organized as a linked list, and a new frame reaches the previous frame through back.
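
From Python, the same chain is reachable through a frame's f_back attribute. The following sketch uses sys._getframe(), a CPython-specific function, to walk from the current frame back toward the outermost caller; the functions outer, inner and call_chain are made up for the example.

```python
import sys


def call_chain():
    """Return the names of the functions on the current call stack."""
    names = []
    frame = sys._getframe()       # the frame of call_chain itself
    while frame is not None:
        names.append(frame.f_code.co_name)
        frame = frame.f_back      # step to the caller's frame
    return names


def inner():
    return call_chain()


def outer():
    return inner()


names = outer()
print(names[:3])  # → ['call_chain', 'inner', 'outer']
```

Entries after the first three depend on how the script itself was started.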

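One process (PyInterpreterState) contains multiple threads (PyThreadState). The threads are organized as a doubly linked list, and they take turns executing instructions.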

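When execution starts, the interpreter locates the corresponding stack frame and calls eval_frame to execute
the instructions in that frame.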

