Python Source Code Analysis (Part 1)

Original link: http://www.cnblogs.com/ybjourney/p/6139461.html

I have recently wanted to study the Python source code, so I plan to write a series of blog posts to record what I learn and to keep myself on track.

Python source directory

Download the source archive from python.org and unpack it. I downloaded Python 2.7.12.

The main directories are:

Include: all the header files that Python provides. If you write a custom Python extension module in C or C++, you need these headers.

Lib: the standard library that ships with Python, written entirely in Python.

Modules: the modules written in C.

Parser: the scanner and parser of the Python interpreter, which perform lexical and syntax analysis of Python code. It also contains a tool that automatically generates the lexical and syntax analyzers from the Python grammar file.

Objects: the implementations of all of Python's built-in objects.

Python: the compiler and the execution engine of the interpreter. This is the core of the Python runtime.

Objects in Python

  The object is the core concept of Python: in the Python world, everything is an object. Python itself is written in C, which is not an object-oriented language. How, then, does a C program implement Python's object mechanism?

  People can picture an object concretely, but to a computer an object is an abstraction; all the computer knows is bytes. An object is usually described as a set of data together with the operations on that data. Inside the computer, an object is a block of allocated memory that is treated as a single unit at a higher level, and that unit is the object.

  In Python, an object is a block of memory allocated on the heap for a C struct.

The cornerstone of the object mechanism: PyObject
In Python everything is an object, and all objects share some common content. This content is defined by PyObject in object.h:

typedef struct _object {
    PyObject_HEAD
} PyObject;
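
To make the shared header concrete, here is a minimal sketch of what PyObject_HEAD amounts to in an ordinary (non-debug) build: a reference count followed by a pointer to the type object. The names MODEL_OBJECT_HEAD, ModelObject, ModelIntObject and model_check_layout are hypothetical stand-ins, and ptrdiff_t is used in place of Py_ssize_t; this is a model for illustration, not the real header.

```c
#include <assert.h>
#include <stddef.h>

struct model_typeobject;                 /* stand-in for struct _typeobject */

/* Model of PyObject_HEAD: a reference count and a type pointer. */
#define MODEL_OBJECT_HEAD                 \
    ptrdiff_t ob_refcnt;                  \
    struct model_typeobject *ob_type;

typedef struct {
    MODEL_OBJECT_HEAD
} ModelObject;

/* A concrete object repeats the header and then adds its own data. */
typedef struct {
    MODEL_OBJECT_HEAD
    long ob_ival;
} ModelIntObject;

/* Checks that both structs begin with the same fields at the same offsets. */
static void model_check_layout(void)
{
    assert(offsetof(ModelIntObject, ob_refcnt) == offsetof(ModelObject, ob_refcnt));
    assert(offsetof(ModelIntObject, ob_type) == offsetof(ModelObject, ob_type));
}
```

Because every object starts with the same two fields, code that only needs the reference count or the type can treat any object as a PyObject.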

Fixed-length and variable-length objects

Besides PyObject, Python has another structure for representing objects, PyVarObject, which is an extension of PyObject.
From the point of view of the source, a variable-length object adds one field to PyObject: ob_size, which records the number of elements the object contains. The difference between the two kinds is this: different instances of a fixed-length type all occupy the same amount of memory, while different instances of a variable-length type may not. For example, the integer objects 1 and 100 both occupy sizeof(PyIntObject) bytes, whereas the string objects "me" and "you" occupy different amounts of memory.
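
The size difference can be sketched as follows. VarDemoInt and VarDemoStr are hypothetical simplified types (the real PyStringObject has more fields), and the string data is held in a C99 flexible array member, which is an assumption of this sketch rather than the layout Python 2.7 uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

typedef struct {                /* fixed-length: sizeof covers the whole object */
    ptrdiff_t ob_refcnt;
    void *ob_type;
    long ob_ival;
} VarDemoInt;

typedef struct {                /* variable-length: the header gains ob_size */
    ptrdiff_t ob_refcnt;
    void *ob_type;
    ptrdiff_t ob_size;          /* number of characters stored */
    char ob_sval[];             /* character data follows the header */
} VarDemoStr;

/* Bytes one string object needs: header + characters + trailing NUL. */
static size_t var_demo_str_bytes(const char *text)
{
    assert(text != NULL);
    return sizeof(VarDemoStr) + strlen(text) + 1;
}

/* Allocate a string object sized for `text`; the caller frees it. */
static VarDemoStr *var_demo_str_new(const char *text)
{
    VarDemoStr *s = malloc(var_demo_str_bytes(text));
    if (s == NULL)
        return NULL;
    s->ob_refcnt = 1;
    s->ob_type = NULL;
    s->ob_size = (ptrdiff_t)strlen(text);
    memcpy(s->ob_sval, text, strlen(text) + 1);
    return s;
}
```

Every VarDemoInt needs exactly sizeof(VarDemoInt) bytes, while var_demo_str_bytes("you") is one byte larger than var_demo_str_bytes("me").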

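Polymorphism of Python objects

  Polymorphism is an important feature of object-oriented programming. How does Python implement it?
  When Python creates an object, it allocates memory and initializes it, and from then on the interpreter holds and maintains the object through a PyObject * variable. This is true of every object. A PyIntObject (an integer object), for example, is not held through a PyIntObject * variable but through a PyObject *. Because of this, the functions inside Python pass one generic pointer type, PyObject *, to each other. The static type of the pointer says nothing about what kind of object it points to; that can only be determined at run time from the ob_type field of the object. It is through this field that Python implements polymorphism.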
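
A small sketch of this dispatch through ob_type is given below. All the Mini* names are hypothetical, and tp_repr here has a simplified signature that writes into a buffer instead of returning an object. To stay within standard C, the sketch embeds the header as the first struct member rather than repeating it through a macro.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

typedef struct mini_object MiniObject;

typedef struct {
    const char *tp_name;
    /* Writes a textual form of the object into buf. */
    void (*tp_repr)(MiniObject *self, char *buf, size_t n);
} MiniType;

struct mini_object {
    long ob_refcnt;
    MiniType *ob_type;
};

typedef struct { MiniObject base; long ival; } MiniInt;
typedef struct { MiniObject base; const char *sval; } MiniStr;

static void mini_int_repr(MiniObject *self, char *buf, size_t n)
{
    snprintf(buf, n, "%ld", ((MiniInt *)self)->ival);
}

static void mini_str_repr(MiniObject *self, char *buf, size_t n)
{
    snprintf(buf, n, "'%s'", ((MiniStr *)self)->sval);
}

static MiniType MiniInt_Type = { "int", mini_int_repr };
static MiniType MiniStr_Type = { "str", mini_str_repr };

/* Generic code sees only a MiniObject * and dispatches through ob_type. */
static void mini_repr(MiniObject *op, char *buf, size_t n)
{
    assert(op != NULL && op->ob_type != NULL);
    op->ob_type->tp_repr(op, buf, n);
}
```

mini_repr never asks whether it received an integer or a string; the type object reached through ob_type supplies the right function.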

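Reference counting

  Unlike C or C++, Python makes the language itself responsible for managing memory. This garbage collection mechanism relieves the programmer of tedious memory management work, and reference counting is one part of it.
  Python uses an object's reference count to decide whether the object stays in memory. Every object has an ob_refcnt field that holds its reference count, and this count determines the object's lifetime.
  Python uses the two macros Py_INCREF(op) and Py_DECREF(op) to increase and decrease an object's reference count. When an object is created, the macro _Py_NewReference(op) initializes its reference count to 1.
  When an object's reference count drops to 0, the destructor for that object is called. Calling the destructor does not necessarily mean calling free to release the memory. To avoid frequently requesting and releasing memory, Python maintains object pools of a certain size, and when the destructor runs, the space occupied by the object is returned to the pool.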
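
The counting rules above can be modeled in a few lines. RC_NEWREF, RC_INCREF and RC_DECREF are hypothetical simplifications of the real macros, and the dealloc pointer lives directly in the object here, whereas Python reaches the destructor through the type object.

```c
#include <assert.h>
#include <stddef.h>

typedef struct rc_object {
    long ob_refcnt;
    void (*dealloc)(struct rc_object *);   /* stand-in for the type's destructor */
} RcObject;

static int rc_released = 0;                /* how many objects were destroyed */

#define RC_NEWREF(op)  ((op)->ob_refcnt = 1)
#define RC_INCREF(op)  ((op)->ob_refcnt++)
#define RC_DECREF(op)                      \
    do {                                   \
        if (--(op)->ob_refcnt == 0)        \
            (op)->dealloc(op);             \
    } while (0)

/* Destructor: a real one would hand the memory back to an object pool. */
static void rc_release(RcObject *op)
{
    assert(op->ob_refcnt == 0);
    rc_released++;
}
```

The destructor runs exactly once, on the RC_DECREF that takes the count from 1 to 0.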

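Integer objects in Python
  Of all Python objects, integer objects are the simplest and the most frequently used, so we study them first. The source for integer objects is in Objects/intobject.c (the struct itself is declared in Include/intobject.h). An integer is represented by a PyIntObject, and once a PyIntObject has been created its value can never be changed. It is defined as: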

typedef struct {
    PyObject_HEAD
    long ob_ival;
} PyIntObject;

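  As you can see, a Python integer object is a thin wrapper around a C long, so the size of the data it holds is fixed when the object is defined: it is the size of a C long.
  Integers are used everywhere in Python, so they are created and released very frequently. How can the mechanism be made efficient enough that integer objects do not become a bottleneck? Python solves this with an integer object cache. With the cache in place, the integer objects alive at run time are not independent of one another but are linked together into one large integer object system.
Small integer objects
  In real programs, small integers such as 1 and 2 are used constantly. All Python objects live on the system heap, so without a special mechanism Python would call malloc and then free on the heap again and again for small integers, which would badly hurt performance.
  To solve this, Python uses an object pool for small integer objects.
  That raises another question: how does Python tell small integers from large ones? The boundary is set by the macros NSMALLNEGINTS and NSMALLPOSINTS, which also determine how many objects the small integer pool holds. The user can move the boundary, but only by editing the source code and recompiling.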
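
A rough model of such a pool is sketched below. SMALL_NEG and SMALL_POS mirror NSMALLNEGINTS and NSMALLPOSINTS; the values 5 and 257 are what I believe Python 2.7 uses by default, so treat them as an assumption. SmallInt, small_pool and small_pool_get are hypothetical names, and the pool is a plain array of structs, whereas the real small_ints is an array of pointers.

```c
#include <assert.h>
#include <stddef.h>

#define SMALL_NEG 5      /* assumed to mirror NSMALLNEGINTS */
#define SMALL_POS 257    /* assumed to mirror NSMALLPOSINTS */

typedef struct { long refcnt; long ival; } SmallInt;

static SmallInt small_pool[SMALL_NEG + SMALL_POS];
static int small_pool_ready = 0;

/* Create every small integer once, up front. */
static void small_pool_init(void)
{
    for (int i = 0; i < SMALL_NEG + SMALL_POS; i++) {
        small_pool[i].refcnt = 1;            /* the pool itself holds one reference */
        small_pool[i].ival = (long)i - SMALL_NEG;
    }
    small_pool_ready = 1;
}

/* Returns the shared object for a small value, or NULL if the value lies
   outside [-SMALL_NEG, SMALL_POS) and must be allocated another way. */
static SmallInt *small_pool_get(long ival)
{
    if (!small_pool_ready)
        small_pool_init();
    if (-SMALL_NEG <= ival && ival < SMALL_POS) {
        SmallInt *v = &small_pool[ival + SMALL_NEG];
        assert(v->ival == ival);
        v->refcnt++;
        return v;
    }
    return NULL;
}
```

Two requests for the same small value return the same object, so no allocation happens after start-up.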
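Large integer objects
  The small integer pool caches a PyIntObject for every small integer permanently. For all other integers, Python provides a block of memory that the large integers take turns using: whichever integer needs space uses it.
  Python has a PyIntBlock structure that maintains a block of memory holding a number of PyIntObject objects. The number of objects per block is derived from the macros below and can be adjusted by changing them. At any moment while Python is running, some of this memory is in use and the rest is free. The free memory has to be organized so that Python can get memory quickly when it needs a new integer, and Python manages all of it with a singly linked list (free_list).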

#define BLOCK_SIZE      1000    /* 1K less typical malloc overhead */
#define BHEAD_SIZE      8       /* Enough for a 64-bit pointer */
#define N_INTOBJECTS    ((BLOCK_SIZE - BHEAD_SIZE) / sizeof(PyIntObject))

struct _intblock {
    struct _intblock *next;
    PyIntObject objects[N_INTOBJECTS];
};

typedef struct _intblock PyIntBlock;

static PyIntBlock *block_list = NULL;
static PyIntObject *free_list = NULL;
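
How a fresh block becomes a free list can be sketched like this. The Fl* names are hypothetical, FL_N is deliberately tiny, and the explicit link field stands in for ob_type, which the real code reuses as the "next free" pointer while an object is unused. The loop follows my reading of fill_free_list in intobject.c, so treat it as a model rather than a copy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define FL_N 4                     /* objects per block; tiny on purpose */

typedef struct fl_int {
    long refcnt;
    struct fl_int *link;           /* stands in for ob_type, reused as "next free" */
    long ival;
} FlInt;

typedef struct fl_block {
    struct fl_block *next;
    FlInt objects[FL_N];
} FlBlock;

static FlBlock *fl_block_list = NULL;
static FlInt *fl_free_list = NULL;

/* Allocate one block, push it on fl_block_list, and thread its objects
   from rear to front. Returns the last object, the new free-list head. */
static FlInt *fl_fill_free_list(void)
{
    FlBlock *b = malloc(sizeof(FlBlock));
    if (b == NULL)
        return NULL;
    b->next = fl_block_list;
    fl_block_list = b;
    FlInt *p = &b->objects[0];
    FlInt *q = p + FL_N;
    while (--q > p)
        q->link = q - 1;           /* each object points at its predecessor */
    assert(q == p);
    q->link = NULL;                /* the first object ends the list */
    return p + FL_N - 1;
}
```

After one call the free list runs from the last object in the block back to the first, so every slot in the block can be handed out without another malloc.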

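Creation
  We now have a rough picture of how the integer object system is laid out in memory. Next we look at how a PyIntObject comes into being. There are two steps:
  If the small integer pool is enabled, try the small integer pool first; if the value cannot be served from it, use the general integer object pool.

Take PyInt_FromLong as an example: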

PyObject *
PyInt_FromLong(long ival)
{
    register PyIntObject *v;
#if NSMALLNEGINTS + NSMALLPOSINTS > 0
    if (-NSMALLNEGINTS <= ival && ival < NSMALLPOSINTS) {
        v = small_ints[ival + NSMALLNEGINTS];
        Py_INCREF(v);
#ifdef COUNT_ALLOCS
        if (ival >= 0)
            quick_int_allocs++;
        else
            quick_neg_int_allocs++;
#endif
        return (PyObject *) v;
    }
#endif
    if (free_list == NULL) {
        if ((free_list = fill_free_list()) == NULL)
            return NULL;
    }
    /* Inline PyObject_New */
    v = free_list;
    free_list = (PyIntObject *)Py_TYPE(v);
    PyObject_INIT(v, &PyInt_Type);
    v->ob_ival = ival;
    return (PyObject *) v;
}
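
The last lines of PyInt_FromLong pop the head of free_list: Py_TYPE(v) holds the next free object until PyObject_INIT overwrites it with the real type. As far as I can tell, int_dealloc in the same file does the reverse and pushes a dead integer back onto free_list. The sketch below models that cycle with hypothetical Cy* names, a static array in place of a PyIntBlock, and an explicit link field in place of ob_type.

```c
#include <assert.h>
#include <stddef.h>

#define CY_N 3                      /* tiny pool so the example stays readable */

typedef struct cy_int {
    long ob_refcnt;
    struct cy_int *link;            /* stand-in for ob_type, reused as the link */
    long ob_ival;
} CyInt;

static CyInt cy_storage[CY_N];      /* plays the role of one PyIntBlock */
static CyInt *cy_free_list = NULL;
static int cy_ready = 0;

/* Take the head of the free list and initialize it. */
static CyInt *cy_from_long(long ival)
{
    if (!cy_ready) {                /* thread the storage once, rear to front */
        for (int i = 0; i < CY_N; i++)
            cy_storage[i].link = (i == 0) ? NULL : &cy_storage[i - 1];
        cy_free_list = &cy_storage[CY_N - 1];
        cy_ready = 1;
    }
    if (cy_free_list == NULL)
        return NULL;                /* the real code allocates another block */
    CyInt *v = cy_free_list;
    cy_free_list = v->link;         /* pop */
    v->link = NULL;                 /* the real code stores the type pointer here */
    v->ob_refcnt = 1;
    v->ob_ival = ival;
    return v;
}

/* Push a dead object back on the free list instead of freeing it. */
static void cy_dealloc(CyInt *v)
{
    assert(v != NULL && v->ob_refcnt == 0);
    v->link = cy_free_list;
    cy_free_list = v;
}
```

An integer created right after another one dies reuses the same slot, which is why large integers can be described as taking turns in the same memory.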

 

 
