Python source code analysis: Chapters 0 to 2

Chapter 0: Preparation

Preface: the book is fairly old, so the source code analyzed is that of Python 2.5.

1. Python's overall architecture

Python's architecture is divided into three parts:

  • From left to right: Python files, the Python interpreter, and the runtime environment
    • Python interpreter:
      • Scanner: performs lexical analysis, splitting the source code into tokens
      • Parser: performs syntax analysis and builds the AST
      • Compiler: generates Python bytecode from the AST
      • Code evaluator (the virtual machine): executes the bytecode
    • Runtime environment:
      • Built-in objects: list, dict, and so on
      • Memory allocator: an interface layer on top of malloc
      • Runtime state information: maintains the different states the interpreter passes through while executing bytecode (I am not sure of the details)
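
The last stage of the pipeline above, the code evaluator, can be pictured as a loop that reads one instruction at a time and updates an operand stack. The following is a minimal sketch of that idea only: the opcodes `OP_PUSH`, `OP_ADD`, `OP_MUL`, `OP_HALT` and the function `toy_eval` are hypothetical names invented for illustration and have nothing to do with CPython's real opcode set.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical opcodes for illustration; these are not CPython's opcodes. */
enum { OP_PUSH, OP_ADD, OP_MUL, OP_HALT };

/* Execute a bytecode sequence on a small operand stack and return the top value. */
long toy_eval(const long *code)
{
    long stack[16];
    size_t sp = 0;   /* number of values currently on the stack */
    size_t pc = 0;   /* index of the next instruction */

    for (;;) {
        switch (code[pc++]) {
        case OP_PUSH:
            assert(sp < 16);
            stack[sp++] = code[pc++];   /* the operand follows the opcode */
            break;
        case OP_ADD:
            assert(sp >= 2);
            sp--;
            stack[sp - 1] += stack[sp];
            break;
        case OP_MUL:
            assert(sp >= 2);
            sp--;
            stack[sp - 1] *= stack[sp];
            break;
        case OP_HALT:
        default:                        /* unknown opcodes also stop the loop */
            assert(sp >= 1);
            return stack[sp - 1];
        }
    }
}
```

Feeding it the sequence `OP_PUSH 2, OP_PUSH 3, OP_ADD, OP_PUSH 4, OP_MUL, OP_HALT` evaluates (2 + 3) * 4.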

2. Python source organization

  • Include: all the header files Python provides, for users writing custom modules in C or C++
  • Lib: the standard library that ships with Python, written in Python
  • Modules: modules written in C
  • Parser: the scanner and parser parts of the interpreter
  • Objects: built-in objects such as list, dict, and integer
  • Python: the compiler and the execution engine

A tip: APIs whose names end in New behave like new in C++, and those ending in Malloc behave like malloc in C.

Chapter 1: Objects in Python

In Python, classes are objects too; they are implemented by means of objects.

1. Objects inside Python

  • From a human perspective: an object is data together with the set of operations defined on that data
  • From the computer's perspective: an object is a piece of allocated memory (contiguous or not) that can be treated as a whole at a higher level; that whole is the object
  • All of Python's built-in type objects are statically initialized
  • Once a Python object is created, its size in memory never changes (variable-length data is reached through a pointer held inside the object)

1.1 PyObject

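// object.h
#define PyObject_HEAD                                                          \
    _PyObject_HEAD_EXTRA          /* normally expands to nothing */            \
    int ob_refcnt;                /* reference count (Py_ssize_t in the actual 2.5 header) */ \
    struct _typeobject *ob_type;  /* the type object that gives this object's type */

typedef struct _object {
    PyObject_HEAD
} PyObject;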
  • Every Python object needs memory beyond what PyObject holds; PyObject defines only the data that every object must have. PyIntObject is one example:
// intobject.h
typedef struct {
    PyObject_HEAD
    long ob_ival;   /* the integer value */
} PyIntObject;
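  • Variable-length Python objects get one more layer of abstraction:
// object.h
#define PyObject_VAR_HEAD     \
    PyObject_HEAD             \
    long ob_size;   /* usually the number of elements in the container (Py_ssize_t in the actual 2.5 header) */

typedef struct {
    PyObject_VAR_HEAD
} PyVarObject;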
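
The point of the shared header is that generic code can treat any object as a PyObject and touch only the reference count and type pointer. Here is a small sketch of that idea with mock types. `MockObject`, `MockIntObject`, `MockVarObject`, `mock_incref`, and `mock_refcnt` are hypothetical names, not CPython definitions. As an assumption made for strict standard-C correctness, the header is embedded as a first member instead of being pasted in by a macro as in the 2.5 listings above.

```c
#include <assert.h>
#include <stddef.h>

struct mock_type;                 /* stands in for struct _typeobject */

/* The common header: every mock object starts with these two fields. */
typedef struct {
    int ob_refcnt;
    struct mock_type *ob_type;
} MockObject;

/* Fixed-size object: header plus one long, in the spirit of PyIntObject. */
typedef struct {
    MockObject ob_base;           /* header first, so &obj == &obj.ob_base */
    long ob_ival;
} MockIntObject;

/* Variable-size object: header plus an element count, like PyVarObject. */
typedef struct {
    MockObject ob_base;
    long ob_size;
} MockVarObject;

/* Generic helpers see only the header, whatever the concrete type is. */
void mock_incref(MockObject *op)
{
    op->ob_refcnt++;
}

int mock_refcnt(const MockObject *op)
{
    return op->ob_refcnt;
}
```

A caller passes `(MockObject *)&some_int`, much as CPython code casts concrete objects to `PyObject *`.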

1.2 Type objects

How much memory an object occupies is part of the object's meta-information, and that meta-information is closely tied to the type the object belongs to, so it is kept in the type object (described below).

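// object.h
typedef struct _typeobject {
    PyObject_VAR_HEAD
    char *tp_name;                  /* name used when printing, in the form "<module>.<name>" */
    int tp_basicsize, tp_itemsize;  /* used to work out how much memory to allocate */
    destructor tp_dealloc;
    printfunc tp_print;
    hashfunc tp_hash;
    ternaryfunc tp_call;
    /* ... many more fields omitted ... */
} PyTypeObject;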
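
To see why the type object carries two size fields, here is a rough sketch of how an allocation size can be derived from them. `MockType` and `mock_alloc_size` are hypothetical names. The real allocator also rounds the result up for alignment, which this sketch leaves out.

```c
#include <assert.h>
#include <stddef.h>

/* Mock of the size-related fields only; not the real PyTypeObject. */
typedef struct {
    const char *tp_name;
    size_t tp_basicsize;   /* bytes for the fixed part of an instance */
    size_t tp_itemsize;    /* bytes per element for variable-length types, else 0 */
} MockType;

/* Bytes to request for an instance that holds nitems elements. */
size_t mock_alloc_size(const MockType *tp, size_t nitems)
{
    return tp->tp_basicsize + nitems * tp->tp_itemsize;
}
```

For a fixed-size type the item size is 0, so every instance costs exactly `tp_basicsize` bytes; for a container-like type the cost grows with the element count.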

1.2.1 Creating objects

There are two main ways to create a Python object:

  • Through the C API (for Python's built-in objects; memory is allocated directly)
    • Generic API: AOL (Abstract Object Layer)
    • Type-specific API: COL (Concrete Object Layer)
  • Through the type object (for user-defined classes, since no dedicated C function can be provided for them in advance). The general process is:
    • Call the tp_new of the PyTypeObject that ob_type points to. If tp_new is NULL, follow tp_base to the base type and look there, repeating until a tp_new is found. One always is, because every class ultimately inherits from object, which supplies tp_new as the fallback. tp_new is responsible for requesting memory (similar to new in C++)
    • Then tp_init initializes the object (similar to a C++ constructor)
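
The two-step process described above (locate tp_new along the tp_base chain, then run tp_init) can be sketched with mock types. Everything here is hypothetical: `MockType`, `MockObject`, `base_new`, `user_init`, and `mock_create` are illustrative names, and the slot signatures are simplified. CPython itself resolves inherited slots when a type is set up, so treat the loop as a model of the lookup described in the text, not as the interpreter's own code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct MockType MockType;

typedef struct {
    int ob_refcnt;
    MockType *ob_type;
    long value;
} MockObject;

struct MockType {
    const char *tp_name;
    MockType *tp_base;                     /* parent type, NULL at the root */
    MockObject *(*tp_new)(MockType *);     /* allocates; may be NULL */
    int (*tp_init)(MockObject *, long);    /* initializes; may be NULL */
};

/* Root allocator, playing the role of object's tp_new. */
MockObject *base_new(MockType *tp)
{
    MockObject *op = calloc(1, sizeof *op);
    if (op != NULL) {
        op->ob_refcnt = 1;
        op->ob_type = tp;
    }
    return op;
}

/* Initializer of the derived type, playing the role of a constructor. */
int user_init(MockObject *op, long arg)
{
    op->value = arg;
    return 0;
}

/* Create an instance: find tp_new along tp_base, allocate, then run tp_init. */
MockObject *mock_create(MockType *tp, long arg)
{
    MockType *t = tp;
    while (t != NULL && t->tp_new == NULL)
        t = t->tp_base;
    if (t == NULL)
        return NULL;                       /* no allocator anywhere in the chain */
    MockObject *op = t->tp_new(tp);        /* allocate for the requested type */
    if (op != NULL && tp->tp_init != NULL && tp->tp_init(op, arg) < 0) {
        free(op);
        return NULL;
    }
    return op;
}

MockType mock_base_type = { "object", NULL, base_new, NULL };
MockType mock_user_type = { "User", &mock_base_type, NULL, user_init };
```

`mock_user_type` has no allocator of its own, so `mock_create(&mock_user_type, 7)` falls back to `base_new` and then lets `user_init` store the argument.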

1.2.2 The type of a type

A PyTypeObject is itself an object, so it has an ob_type field too. That field points to PyType_Type (that is, type, the metaclass responsible for creating PyTypeObjects).

The figure for this section uses int as an example to show how classes, objects, and type relate to one another inside Python.
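
The same relationships can be written down as a sketch in C with mock structures. `MockType`, `MockInt`, `mock_metaclass_of`, and the three `mock_*_type` variables are hypothetical stand-ins for PyTypeObject, PyIntObject, and the type objects of type, object, and int.

```c
#include <assert.h>
#include <stddef.h>

typedef struct MockType MockType;

struct MockType {
    MockType *ob_type;      /* the type of this type object (its metaclass) */
    MockType *tp_base;      /* base class, NULL for the root */
    const char *tp_name;
};

typedef struct {
    MockType *ob_type;
    long ob_ival;
} MockInt;

extern MockType mock_object_type;

/* type is its own type; object and int are both instances of type. */
MockType mock_type_type   = { &mock_type_type, &mock_object_type, "type" };
MockType mock_object_type = { &mock_type_type, NULL,              "object" };
MockType mock_int_type    = { &mock_type_type, &mock_object_type, "int" };

/* Follow ob_type twice: from an int instance to its class, then to the metaclass. */
MockType *mock_metaclass_of(const MockInt *i)
{
    return i->ob_type->ob_type;
}
```

Following `ob_type` answers "what is this an instance of", while following `tp_base` answers "what does this class inherit from"; the two chains are independent.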

The rest is omitted.

Chapter 2: Integer objects in Python

This chapter is mainly about PyIntObject and PyInt_Type. They differ little from the generic PyObject and PyTypeObject; the points worth noting are:

  • Python 2 stores smaller numbers directly in a C long, and stores larger numbers (outside the range of long) in Python's long object. Python 3 makes no such distinction and stores every integer in a long object, which is implemented with a flexible array; look it up if you are interested
  • Small integers are kept in an array-based object pool, and large integers are served from a linked chain of memory blocks, which avoids frequent malloc calls; both are described below

2.1 Small integer objects

Small integers are served from a pool of objects allocated ahead of time.

// [intobject.c]
#define NSMALLPOSINTS           257
#define NSMALLNEGINTS           5
/* References to small integers are saved in this array so that they
   can be shared.
   The integers that are saved are those in the range
   -NSMALLNEGINTS (inclusive) to NSMALLPOSINTS (not inclusive).
*/
static PyIntObject *small_ints[NSMALLNEGINTS + NSMALLPOSINTS];
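
The index arithmetic behind this array is easy to get wrong, so here is a minimal sketch with a mock integer type. `MockInt`, `small_int_storage`, `mock_init_small_ints`, and `mock_small_int` are hypothetical names, and backing the pool with a static array is an assumption made to keep the example short.

```c
#include <assert.h>
#include <stddef.h>

#define NSMALLPOSINTS 257
#define NSMALLNEGINTS 5

typedef struct {
    int ob_refcnt;
    long ob_ival;
} MockInt;

/* One preallocated object per value in [-NSMALLNEGINTS, NSMALLPOSINTS). */
static MockInt small_int_storage[NSMALLNEGINTS + NSMALLPOSINTS];
static MockInt *small_ints[NSMALLNEGINTS + NSMALLPOSINTS];

/* Fill the pool once, before any lookup. */
void mock_init_small_ints(void)
{
    for (long v = -NSMALLNEGINTS; v < NSMALLPOSINTS; v++) {
        MockInt *op = &small_int_storage[v + NSMALLNEGINTS];
        op->ob_refcnt = 1;
        op->ob_ival = v;
        small_ints[v + NSMALLNEGINTS] = op;   /* value v lives at index v + 5 */
    }
}

/* Return the shared object for a small value, or NULL if it is out of range. */
MockInt *mock_small_int(long ival)
{
    if (-NSMALLNEGINTS <= ival && ival < NSMALLPOSINTS) {
        MockInt *op = small_ints[ival + NSMALLNEGINTS];
        op->ob_refcnt++;                      /* the caller gets a new reference */
        return op;
    }
    return NULL;
}
```

Two lookups of the same small value return the same pointer, which is the sharing the source comment refers to. With these constants the pooled range is -5 through 256.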

2.2 Large integer objects

For all other integers, the Python runtime sets aside blocks of memory that these objects take turns using, which avoids frequent malloc calls. Python implements this mechanism with the PyIntBlock structure.

// [intobject.c]

#define BLOCK_SIZE      1000    /* 1K less typical malloc overhead */
#define BHEAD_SIZE      8       /* Enough for a 64-bit pointer */
#define N_INTOBJECTS    ((BLOCK_SIZE - BHEAD_SIZE) / sizeof(PyIntObject))

struct _intblock {
    struct _intblock *next;
    PyIntObject objects[N_INTOBJECTS];
};

typedef struct _intblock PyIntBlock;

static PyIntBlock *block_list = NULL;
static PyIntObject *free_list = NULL;
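
The macro arithmetic above decides how many integer objects share one block. As a sketch, the same computation can be done with a mock object that has three fields like the simplified PyIntObject. `MockInt`, `N_MOCKINTS`, `struct mock_block`, and `mock_ints_per_block` are hypothetical names, and the exact count depends on the platform's type sizes.

```c
#include <assert.h>
#include <stddef.h>

#define BLOCK_SIZE  1000    /* target size of one block in bytes */
#define BHEAD_SIZE  8       /* room reserved for the next pointer */

/* Mock with a refcount, a type pointer, and a value. */
typedef struct {
    int ob_refcnt;
    void *ob_type;
    long ob_ival;
} MockInt;

#define N_MOCKINTS ((BLOCK_SIZE - BHEAD_SIZE) / sizeof(MockInt))

struct mock_block {
    struct mock_block *next;
    MockInt objects[N_MOCKINTS];
};

/* How many objects fit in one block on this platform. */
size_t mock_ints_per_block(void)
{
    return N_MOCKINTS;
}
```

On a typical 64-bit build the mock object takes 24 bytes, so a block holds 992 / 24 = 41 objects and the whole block stays under the 1000-byte target.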

The code that creates an int shows how these two lists are used:

// [intobject.c]
PyObject *
PyInt_FromLong(long ival)
{
    register PyIntObject *v;
#if NSMALLNEGINTS + NSMALLPOSINTS > 0 /* try the small integer pool first */
    if (-NSMALLNEGINTS <= ival && ival < NSMALLPOSINTS) {
        v = small_ints[ival + NSMALLNEGINTS];
        Py_INCREF(v);
        return (PyObject *) v;
    }
#endif
    if (free_list == NULL) {
        if ((free_list = fill_free_list()) == NULL)
            return NULL;
    }
    /* Inline PyObject_New */
    v = free_list;
    /* Awkward-looking but convenient: while an object sits on free_list, its
       ob_type holds the address of the next free object (like a linked
       list's next pointer) rather than a PyTypeObject. */
    free_list = (PyIntObject *)v->ob_type;
    (void)PyObject_INIT(v, &PyInt_Type);
    v->ob_ival = ival;
    return (PyObject *) v;
}

The following code requests memory for free_list and maintains free_list and block_list:

// [intobject.c]
static PyIntObject *
fill_free_list(void)
{
    PyIntObject *p, *q;
    /* Request sizeof(PyIntBlock) bytes and link the new block into the existing block list */
    p = (PyIntObject *) PyMem_MALLOC(sizeof(PyIntBlock));
    if (p == NULL)
        return (PyIntObject *) PyErr_NoMemory();
    ((PyIntBlock *)p)->next = block_list;
    block_list = (PyIntBlock *)p;
    /* Turn the PyIntObject array (objects) inside the PyIntBlock into a singly linked list */
    p = &((PyIntBlock *)p)->objects[0];
    q = p + N_INTOBJECTS;
    while (--q > p)
        q->ob_type = (struct _typeobject *)(q-1); /* the awkward trick mentioned in the previous listing */
    q->ob_type = NULL;
    return p + N_INTOBJECTS - 1;
}

Thus free_list always points to memory that is ready to be handed out. When a previously allocated PyIntObject is released, its memory has to be put back on free_list before it can be reused; this is done by PyIntObject's destructor:

// [intobject.c]
static void
int_dealloc(PyIntObject *v)
{
    if (PyInt_CheckExact(v)) {   // exactly an int, not a subclass: put it back on free_list, keeping the list intact
        v->ob_type = (struct _typeobject *)free_list;
        free_list = v;
    }
    else                         // a subclass instance goes through the normal deallocation path
        v->ob_type->tp_free((PyObject *)v);
}
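
Putting allocation, block filling, and deallocation together, the whole free-list cycle can be sketched with mock types. `MockInt`, `MockBlock`, `mock_fill_free_list`, `mock_int_new`, `mock_int_dealloc`, and `mock_block_count` are hypothetical names. The block size of 4 is an assumption chosen so that the example exhausts a block quickly, and the subclass branch of the destructor is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define N_PER_BLOCK 4   /* tiny on purpose; an assumption for the demo */

typedef struct MockInt {
    int ob_refcnt;
    void *ob_type;      /* doubles as the "next free" link while unused */
    long ob_ival;
} MockInt;

typedef struct MockBlock {
    struct MockBlock *next;
    MockInt objects[N_PER_BLOCK];
} MockBlock;

static MockBlock *block_list = NULL;
static MockInt *free_list = NULL;
static int mock_int_type;   /* placeholder whose address marks a live object */

/* Allocate one block and thread its objects into a list, back to front. */
static MockInt *mock_fill_free_list(void)
{
    MockBlock *b = malloc(sizeof *b);
    if (b == NULL)
        return NULL;
    b->next = block_list;
    block_list = b;
    MockInt *p = &b->objects[0];
    MockInt *q = p + N_PER_BLOCK;
    while (--q > p)
        q->ob_type = q - 1;
    q->ob_type = NULL;              /* objects[0] ends the list */
    return p + N_PER_BLOCK - 1;
}

/* Pop one object off the free list, refilling from a new block if needed. */
MockInt *mock_int_new(long ival)
{
    if (free_list == NULL && (free_list = mock_fill_free_list()) == NULL)
        return NULL;
    MockInt *v = free_list;
    free_list = v->ob_type;
    v->ob_refcnt = 1;
    v->ob_type = &mock_int_type;
    v->ob_ival = ival;
    return v;
}

/* Push a released object back onto the free list for reuse. */
void mock_int_dealloc(MockInt *v)
{
    v->ob_type = free_list;
    free_list = v;
}

/* Number of blocks allocated so far. */
size_t mock_block_count(void)
{
    size_t n = 0;
    for (MockBlock *b = block_list; b != NULL; b = b->next)
        n++;
    return n;
}
```

Releasing an object and then creating a new one hands back the same address, and a second block is requested only after all four slots of the first are in use.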

