python c extension

Implementing Python extensions: mixed programming with Python and C

Foreword

Most Python extensions are written in C, and porting them to C++ is straightforward.
Broadly speaking, any code that can be integrated or imported into other Python scripts can be called an extension.
An extension can be written in pure Python, or in a compiled language such as C or C++.
 
Even two computers with the same architecture are best not sharing compiled binaries with each other.
Compile Python and its extensions on the machine where they will run, because even a slight difference in compiler or CPU can cause problems.
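To see what a compiled extension is tied to on a given machine, you can ask the interpreter itself. The sketch below is written for Python 3, which is an assumption (the rest of this article uses Python 2.6), and build_fingerprint is a hypothetical helper name. It only reports the values; it does not decide whether two machines are compatible.

```python
import platform
import sys
import sysconfig


def build_fingerprint():
    """Return the properties a compiled extension depends on for this interpreter."""
    return {
        "platform": sysconfig.get_platform(),                  # OS and CPU architecture tag
        "python_version": sysconfig.get_python_version(),      # "major.minor"
        "ext_suffix": sysconfig.get_config_var("EXT_SUFFIX"),  # file suffix of extension modules
        "compiler": platform.python_compiler(),                # compiler that built this Python
    }


if __name__ == "__main__":
    print("interpreter =", sys.executable)
    for key, value in build_fingerprint().items():
        print(key, "=", value)
```

If two machines print different values here, a .so built on one of them should not be expected to load on the other.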
 
Official documents
 
 

Reasons to extend the Python language:

1. Adding extra (non-Python) functionality. Extensions provide what the core of Python does not, such as creating new
data types or embedding Python in an existing application. Code of this kind must be compiled.
 
 
2. Removing performance bottlenecks. Interpreted languages are generally slower than compiled languages, but rewriting everything in a compiled
language is not cost-effective. A better practice is to profile first, find the bottleneck, and then implement only the bottleneck in an extension.
This is a relatively simple and effective approach.
 
 
3. Keeping proprietary source code private. A common drawback of scripting languages is that they ship as executable source code, so confidentiality is lost.
Moving part of the code from Python into a compiled language keeps the proprietary source private and makes it harder to reverse engineer.
This matters for special algorithms, encryption methods, and software security.
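For reason 2, the "profile first" step can be sketched with the standard cProfile module. This is a minimal illustration written for Python 3 (an assumption; the article targets Python 2.6). The functions cheap_setup, slow_sum_of_squares, workload, and profile are hypothetical stand-ins for your own code.

```python
import cProfile
import io
import pstats


def cheap_setup():
    """A fast step that is not worth moving to C."""
    return list(range(10))


def slow_sum_of_squares(n):
    """A deliberately slow pure-Python loop, standing in for the real bottleneck."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def workload():
    cheap_setup()
    return slow_sum_of_squares(200_000)


def profile(func):
    """Run func under cProfile; return (result, report sorted by cumulative time)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func)
    report = io.StringIO()
    pstats.Stats(profiler, stream=report).sort_stats("cumulative").print_stats()
    return result, report.getvalue()


if __name__ == "__main__":
    _, text = profile(workload)
    print(text)
```

In the report, the function with the largest own time is the candidate for a C rewrite; the rest can stay in Python.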
 
 
Another way to keep the code private is to release only the precompiled .pyc files. This is a compromise, since bytecode is easier to decompile than machine code.
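Producing a .pyc file can be done with the standard py_compile module. The sketch below is written for Python 3 (an assumption), and the file name secret_algo.py and the helper compile_to_pyc are made up for illustration. Keep in mind that a .pyc file is tied to the Python version that produced it.

```python
import os
import py_compile
import tempfile


def compile_to_pyc(source_path, output_path):
    """Byte-compile source_path into output_path and return the path written."""
    # doraise=True turns compile errors into exceptions instead of printed messages.
    return py_compile.compile(source_path, cfile=output_path, doraise=True)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        src = os.path.join(workdir, "secret_algo.py")
        with open(src, "w") as fh:
            fh.write("def answer():\n    return 42\n")
        pyc = compile_to_pyc(src, os.path.join(workdir, "secret_algo.pyc"))
        print(pyc, os.path.getsize(pyc), "bytes")
```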
 
 

Steps to create a Python extension

1. Create the application code

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define BUFSIZE 10

/* Recursive factorial. */
int fac(int n) {
    if (n < 2)
        return 1;
    return n * fac(n - 1);
}

/* Reverse a NUL-terminated string in place and return it. */
char *reverse(char *s) {
    register char t;
    char *p = s;
    char *q = (s + (strlen(s) - 1));
    while (p < q) {
        t = *p;
        *p++ = *q;
        *q-- = t;
    }
    return s;
}

/* Simple unit test for the two functions above. */
int main(void) {
    char s[BUFSIZE];
    printf("4! == %d\n", fac(4));
    printf("8! == %d\n", fac(8));
    printf("12! == %d\n", fac(12));
    strcpy(s, "abcdef");
    printf("reversing 'abcdef', we get '%s'\n", reverse(s));
    strcpy(s, "madam");
    printf("reversing 'madam', we get '%s'\n", reverse(s));
    return 0;
}
It is good practice to write a main() function for unit testing.
 
Compile with gcc:
> gcc Extest.c -o Extest
Run it:
> ./Extest
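One detail of the test program is worth checking: fac() returns an int, and on platforms where int is 32 bits, 12! is the largest factorial that fits. The Python 3 sketch below (Python 3 and the helper name largest_safe_factorial_arg are assumptions of this example) works that limit out with exact integer arithmetic, so the values printed by the C program can be compared against it.

```python
import math

INT_MAX_32 = 2**31 - 1  # largest value of a signed 32-bit int


def largest_safe_factorial_arg(limit=INT_MAX_32):
    """Return the largest n such that n! does not exceed limit."""
    n = 1
    while math.factorial(n + 1) <= limit:
        n += 1
    return n


if __name__ == "__main__":
    n = largest_safe_factorial_arg()
    print("largest safe argument:", n)
    print("its factorial:", math.factorial(n))
    print("next factorial:", math.factorial(n + 1), "exceeds", INT_MAX_32)
```

Calling the C fac() with a larger argument overflows the int, so a wrapper that exposes it to Python may want to reject such inputs.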
 

2. Wrap the code with boilerplate

The whole extension is built around the concept of "wrapping". Your design should make the boundary between your implementation language and Python as seamless as possible.
The interface code is known as "boilerplate" code. It is an essential part of the interaction between your code and the Python interpreter.
Our boilerplate is written in 4 steps:
a. Include the Python header file
You need to know where the Python header files are, usually in /usr/local/include/python2.x.
Add #include "Python.h" to the C code above.
 
 
b. Add a wrapper function of the form PyObject *Module_func() for every function of the module
The wrapper first converts the values passed from Python into C values, and then converts the result computed by the C function into a Python object and returns it to Python.
Every function that should be accessible from Python needs a static wrapper function whose return type is PyObject * and whose name has the form
Modulename_functionname:
static PyObject *Extest_fac(PyObject *self, PyObject *args) {
    int res;          /* parse status, then the calculation result */
    int num;          /* argument received from Python */
    PyObject *retval; /* return value */

    /* "i" means the argument must be an integer. On success it is stored in num;
       on failure PyArg_ParseTuple returns 0. */
    res = PyArg_ParseTuple(args, "i", &num);
    if (!res) {
        /* Returning NULL from a wrapper raises a TypeError in the Python caller. */
        return NULL;
    }
    res = fac(num);
    /* The C result must be converted into a Python object; "i" denotes an integer object. */
    retval = (PyObject *)Py_BuildValue("i", res);
    return retval;
}
It can also be written in a shorter, more readable form:
static PyObject *Extest_fac(PyObject *self, PyObject *args) {
    int num;
    if (!PyArg_ParseTuple(args, "i", &num)) {
        return NULL;
    }
    return (PyObject *)Py_BuildValue("i", fac(num));
}
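Below is the table of type-conversion format codes between Python and C:
Here is also a usage table for Py_BuildValue:
 
The wrapper for the reverse function is similar: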
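static PyObject *
Extest_reverse(PyObject *self, PyObject *args) {
    char *original;
    char *copy;
    PyObject *retval;
    if (!PyArg_ParseTuple(args, "s", &original)) {
        return NULL;
    }
    /* "s" points into the Python string's own buffer, which must not be
       modified, so reverse a private copy. */
    if ((copy = strdup(original)) == NULL) {
        return PyErr_NoMemory();
    }
    retval = (PyObject *)Py_BuildValue("s", reverse(copy));
    free(copy);
    return retval;
}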
It can also be reworked into a function that returns a tuple containing both the original string and the reversed string:
static PyObject *
Extest_doppel(PyObject *self, PyObject *args) {
    char *original;
    if (!PyArg_ParseTuple(args, "s", &original)) {
        return NULL;
    }
    /* "ss" returns two strings. Because reverse works in place on the string, copy it with strdup first. */
    return (PyObject *)Py_BuildValue("ss", original, reverse(strdup(original)));
}
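What is wrong with the code above?
A common problem in C code is the memory leak. In the example above, when Py_BuildValue() builds
the Python object to return, it makes its own copy of the data passed in. Both strings are copied. However,
we allocated memory to hold the second string and never released it before returning, so that memory leaks.
 
The correct approach is to build the Python object to return first, and then free the memory allocated inside the wrapper function.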
static PyObject *
Extest_doppel(PyObject *self, PyObject *args) {
    char *original;
    char *reversed;
    PyObject *retval;
    if (!PyArg_ParseTuple(args, "s", &original)) {
        return NULL;
    }
    /* Build the result first (Py_BuildValue copies both strings), then free our copy. */
    retval = (PyObject *)Py_BuildValue("ss", original, reversed = reverse(strdup(original)));
    free(reversed);
    return retval;
}
 
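c. Add an array of type PyMethodDef ModuleMethods[] for each module
We have created several wrapper functions; they need to be listed somewhere so that the Python interpreter can import and call them.
That is the job of the ModuleMethods[] array.
The format is as follows: each entry in the array describes one function, and the last entry holds two NULL values to mark the end of the list.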
static PyMethodDef
ExtestMethods[] = {
    {"fac", Extest_fac, METH_VARARGS},
    {"doppel", Extest_doppel, METH_VARARGS},
    {"reverse", Extest_reverse, METH_VARARGS},
    {NULL, NULL},
};
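METH_VARARGS means that the arguments are passed in as a tuple. If we want to use PyArg_ParseTupleAndKeywords()
to parse keyword arguments, the flag should be written as METH_VARARGS | METH_KEYWORDS, combining the two with a bitwise OR.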
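The method flags are bit masks, so two of them are combined with a bitwise OR; a bitwise AND of two different flags gives 0 and loses both. The arithmetic can be checked in a few lines of Python 3 (an assumption; the article uses Python 2.6). The numeric values are the ones defined in CPython's methodobject.h, and has_flag is a hypothetical helper for this example.

```python
# Values as defined in CPython's methodobject.h.
METH_VARARGS = 0x0001
METH_KEYWORDS = 0x0002


def has_flag(flags, flag):
    """Return True if every bit of flag is set in flags."""
    return (flags & flag) == flag


combined = METH_VARARGS | METH_KEYWORDS   # both bits set
mistaken = METH_VARARGS & METH_KEYWORDS   # no bits in common

if __name__ == "__main__":
    print("OR :", hex(combined), has_flag(combined, METH_VARARGS), has_flag(combined, METH_KEYWORDS))
    print("AND:", hex(mistaken), has_flag(mistaken, METH_VARARGS), has_flag(mistaken, METH_KEYWORDS))
```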
 
 
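d. Add the module initialization function void initModule()
The last step is module initialization. This code is called when Python imports the module.
Py_InitModule is the Python 2 API; Python 3 modules define PyInit_Extest and call PyModule_Create instead.
void initExtest(void) {
    Py_InitModule("Extest", ExtestMethods);
}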
 
The complete code is as follows:
#include "Python.h"   /* included first, as the Python documentation recommends */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define BUFSIZE 10

int fac(int n) {
    if (n < 2)
        return 1;
    return n * fac(n - 1);
}

char *reverse(char *s) {
    register char t;
    char *p = s;
    char *q = (s + (strlen(s) - 1));
    while (p < q) {
        t = *p;
        *p++ = *q;
        *q-- = t;
    }
    return s;
}

static PyObject *
Extest_fac(PyObject *self, PyObject *args) {
    int res;
    int num;
    PyObject *retval;

    res = PyArg_ParseTuple(args, "i", &num);
    if (!res) {
        return NULL;
    }
    res = fac(num);
    retval = (PyObject *)Py_BuildValue("i", res);
    return retval;
}

static PyObject *
Extest_reverse(PyObject *self, PyObject *args) {
    char *original;
    char *copy;
    PyObject *retval;
    if (!PyArg_ParseTuple(args, "s", &original)) {
        return NULL;
    }
    /* "s" points into the Python string's own buffer, which must not be
       modified, so reverse a private copy. */
    if ((copy = strdup(original)) == NULL) {
        return PyErr_NoMemory();
    }
    retval = (PyObject *)Py_BuildValue("s", reverse(copy));
    free(copy);
    return retval;
}

static PyObject *
Extest_doppel(PyObject *self, PyObject *args) {
    char *original;
    char *resv;
    PyObject *retval;
    if (!PyArg_ParseTuple(args, "s", &original)) {
        return NULL;
    }
    retval = (PyObject *)Py_BuildValue("ss", original, resv = reverse(strdup(original)));
    free(resv);
    return retval;
}

static PyMethodDef
ExtestMethods[] = {
    {"fac", Extest_fac, METH_VARARGS},
    {"doppel", Extest_doppel, METH_VARARGS},
    {"reverse", Extest_reverse, METH_VARARGS},
    {NULL, NULL},
};

void initExtest(void) {
    Py_InitModule("Extest", ExtestMethods);
}

int main(void) {
    char s[BUFSIZE];
    printf("4! == %d\n", fac(4));
    printf("8! == %d\n", fac(8));
    printf("12! == %d\n", fac(12));
    strcpy(s, "abcdef");
    printf("reversing 'abcdef', we get '%s'\n", reverse(s));
    strcpy(s, "madam");
    printf("reversing 'madam', we get '%s'\n", reverse(s));
    return 0;
}
 
 

3. Compile and test

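For your new Python extension to be built, it has to be compiled together with the Python library. Python's distutils package is
used to build, install, and distribute modules, extensions, and packages. The steps are as follows:
a. Create setup.py
When installing third-party Python packages, we often run the command python setup.py install.
Let us look at what a setup.py file contains.
 
Most of the build work is done by the setup function. You create one Extension instance per extension; here there is only one
extension, so a single instance is enough.
Extension('Extest', sources=['Extest.c']): the first argument is the name of the extension; if the module is part of a package, the name must include the package path, separated by ".".
The second argument is the list of source files.
setup(name='Extest', ext_modules=[...]): name says what is being built, and ext_modules lists the Extension objects to compile.
#!/usr/bin/env python
from distutils.core import setup, Extension

MOD = 'Extest'
setup(name=MOD, ext_modules=[Extension(MOD, sources=['Extest.c'])])
The setup function has many more options. See the official documentation for details.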
 
 
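b. Compile and link your code by running setup.py
Run this command in a shell:
> python setup.py build
If it fails with an error saying that Python.h cannot be found,
the python-dev package is not installed. Either install that package or, as I did, download the source from the official site and build Python yourself.
Python.h normally lives in a directory such as /usr/include/python2.X. On my machine it was not there,
so I had to build Python again.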
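Before rebuilding anything, it can help to ask the interpreter where it expects its headers. The sketch below uses the standard sysconfig module and is written for Python 3 (an assumption; on the Python 2.6 setup described here the equivalent information comes from distutils.sysconfig). The helper name python_header_path is made up for this example.

```python
import os
import sysconfig


def python_header_path():
    """Return the path where this interpreter expects Python.h to live."""
    include_dir = sysconfig.get_paths()["include"]
    return os.path.join(include_dir, "Python.h")


if __name__ == "__main__":
    header = python_header_path()
    if os.path.exists(header):
        print(header, "-> found")
    else:
        print(header, "-> missing (development headers are probably not installed)")
```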
 
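My Linux system runs Python 2.6.6, so I downloaded the source for the same version. A newer version would also work.
 
Unpack the source archive:
> tar xzf Python-2.6.6.tgz
> cd Python-2.6.6
Build and install Python:
> ./configure --prefix=/usr/local/python2.6
> make
> sudo make install
Create a link to the newly built Python:
> sudo ln -sf /usr/local/python2.6/bin/python2.6 /usr/bin/python2.6
A quick test shows that it works.
This approach lets you run several versions of Python side by side on Linux.
 
Python.h can now be found under /usr/local/python2.6/include/python2.6.
Re-run the build with the newly installed interpreter.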
 
After a successful build, your extension is created under the build/lib.* directory. You will see a .so file, which is the
shared library format on Linux:
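The exact directory name under build/ depends on the platform and Python version, so a small helper can locate the file for you. This is a sketch for Python 3 on Linux (both assumptions), and find_built_extensions is a hypothetical name.

```python
import glob
import os


def find_built_extensions(project_dir, module_name="Extest"):
    """Return the compiled files for module_name under project_dir/build/lib.*."""
    pattern = os.path.join(project_dir, "build", "lib.*", module_name + "*.so")
    return sorted(glob.glob(pattern))


if __name__ == "__main__":
    for path in find_built_extensions(os.getcwd()):
        print(path)
```

The trailing wildcard in the pattern also matches the longer file names that newer Python versions give to extension modules.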
 
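c. Test it
You can call the library directly from Python code to test it. Note that ctypes calls the C function fac itself, bypassing the wrapper functions:
#!/usr/bin/python
from ctypes import *
import os
# An absolute path is required
extest = cdll.LoadLibrary(os.getcwd() + '/Extest.so')
print extest.fac(4)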
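A slightly more defensive variant of the ctypes test declares the argument and return types of fac and reports a missing library instead of crashing. It is a sketch written for Python 3 (an assumption), load_fac is a hypothetical helper, and the path "Extest.so" has to be adjusted to wherever the build placed the file.

```python
import ctypes
import os


def load_fac(library_path):
    """Load the shared library and return its C fac() with declared types."""
    lib = ctypes.CDLL(os.path.abspath(library_path))
    lib.fac.argtypes = [ctypes.c_int]  # matches: int fac(int n)
    lib.fac.restype = ctypes.c_int
    return lib.fac


if __name__ == "__main__":
    try:
        fac = load_fac("Extest.so")
    except OSError as exc:
        print("could not load library:", exc)
    else:
        print(fac(4))
```

Declaring argtypes makes ctypes reject arguments of the wrong type, such as a string, before the C function is ever called.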
 
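You can also run the following command in the current directory to install the extension into your Python path:
> python setup.py install
If the installation succeeds, import the module directly to test it:
 
One last point: the original C file contains a main function. Because a program can have only one main
function, you can avoid a conflict by renaming main to test, wrapping it in an Extest_test() wrapper function,
and adding that wrapper to the ExtestMethods array, so that the test function can be called from Python.
static PyObject *
Extest_test(PyObject *self, PyObject *args) {
    test();
    /* To return None, use the following line */
    return (PyObject *)Py_BuildValue("");
}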

Simple performance comparison

Test code
import Extest
import time

start = time.time()
a = Extest.reverse("abcd")
timeC = time.time() - start
print 'C costs', timeC, 'the result is', a

start = time.time()
b = list("abcd")
b.reverse()
b = ''.join(b)
timePython = time.time() - start
print 'Python costs', timePython, 'the result is', b
Results
As you can see, Python is not necessarily slower than C. It depends on the situation.
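A single call timed with time.time() is dominated by timer resolution and call overhead, so the numbers are easier to trust when each candidate is repeated many times. The sketch below uses the standard timeit module under Python 3 (an assumption). The functions reverse_with_list, reverse_with_slice, and benchmark are hypothetical names, and Extest is only included if it happens to be importable.

```python
import timeit


def reverse_with_list(s):
    """Pure-Python version from the test above: list, reverse, join."""
    chars = list(s)
    chars.reverse()
    return "".join(chars)


def reverse_with_slice(s):
    """Idiomatic pure-Python reversal."""
    return s[::-1]


def benchmark(funcs, text="abcd", number=100_000, repeat=5):
    """Return {function name: best observed seconds per call}."""
    results = {}
    for func in funcs:
        timer = timeit.Timer(lambda: func(text))
        results[func.__name__] = min(timer.repeat(repeat=repeat, number=number)) / number
    return results


if __name__ == "__main__":
    candidates = [reverse_with_list, reverse_with_slice]
    try:
        import Extest  # the C extension, if it has been built and installed
        candidates.append(Extest.reverse)
    except ImportError:
        pass
    for name, seconds in benchmark(candidates).items():
        print(name, seconds)
```

Taking the minimum over several repeats reduces the influence of other processes on the machine.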


Origin blog.csdn.net/ruiyiin/article/details/28101753