Last November, at the Hangzhou stop of PyCon China 2018, I gave a talk on Python source code encryption, explaining how to encrypt and decrypt Python code by modifying the Python interpreter. Because of a bout of procrastination, however, I never got around to turning the talk into a written version. I have finally beaten it, and this article is the result.
This series first introduces the ideas, methods, advantages, and disadvantages of existing source code protection schemes, and then shows how to achieve better encryption and decryption of source code with a custom Python interpreter.
Because Python is dynamic and its interpreter is open source, Python code is hard to encrypt well. Some voices in the community argue that commercial protection should be achieved through legal means rather than by encrypting source code; others want a way to encrypt regardless. As a result, people have come up with all kinds of encryption and obfuscation schemes to protect their source code.
The common approaches to source protection are:
- Distributing .pyc files
- Code obfuscation
- Using py2exe
- Using Cython
Let's briefly go through each of them.
1 Distributing .pyc files
1.1 Idea
As we all know, when the Python interpreter runs code it first compiles it to bytecode, cached in .pyc files, and then executes the contents of those .pyc files. The interpreter can of course also execute a .pyc file directly. A .pyc file is binary, so the source cannot be read from it directly. If we ship .pyc files instead of .py files to the customer's environment, wouldn't that protect the Python code?
1.2 Method
Compiling .py files into .pyc files is very easy. There is no need to run all of the code once and then fish out the generated .pyc files.
In fact, the Python standard library provides a module named compileall that does the compilation for us.
Running the following command walks every .py file under the <src> directory and compiles it into a .pyc file:
python -m compileall <src>
Then delete all .py files under <src> and package the directory for release:
$ find <src> -name '*.py' -type f -print -exec rm {} \;
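The same two steps can also be driven from Python. The sketch below is only an illustration and assumes Python 3, where compileall writes into __pycache__ by default and a .pyc is importable without its source only when it sits in the legacy location next to where the .py was. The legacy=True argument (the -b flag on the command line) produces that layout. The helper name compile_and_strip and the demo module greet.py are hypothetical.

```python
import compileall
import importlib
import pathlib
import sys
import tempfile


def compile_and_strip(src_dir):
    """Compile every .py under src_dir to a sibling .pyc, then delete the .py files."""
    src = pathlib.Path(src_dir)
    # legacy=True writes foo.pyc next to foo.py instead of into __pycache__,
    # which is the layout Python 3 needs for source-less imports.
    ok = compileall.compile_dir(str(src), legacy=True, quiet=1)
    if not ok:
        raise RuntimeError("some files failed to compile; sources were kept")
    removed = sorted(src.rglob("*.py"))
    for path in removed:
        path.unlink()
    return removed


def demo():
    """Build a throwaway source directory, strip it, and import from the .pyc alone."""
    with tempfile.TemporaryDirectory() as tmp:
        (pathlib.Path(tmp) / "greet.py").write_text("def hi():\n    return 'hello'\n")
        compile_and_strip(tmp)
        remaining = sorted(p.name for p in pathlib.Path(tmp).iterdir())
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            greet = importlib.import_module("greet")
            return remaining, greet.hi()
        finally:
            # Leave the interpreter state as we found it.
            sys.path.remove(tmp)
            sys.modules.pop("greet", None)
```

Calling demo() returns the file names left in the directory together with the value produced by the module that was imported from bytecode only.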
1.3 Advantages
- Simple, and raises the barrier to reading the source a little
- Good platform compatibility: wherever .py files can run, .pyc files can run too
1.4 Disadvantages
- Poor interpreter compatibility: a .pyc file only runs on a specific version of the interpreter
- Ready-made decompilers exist, so the cost of cracking is low
python-uncompyle6 is one such decompiler, and an excellent one.
Run the following command to decompile a .pyc file back into a .py file:
$ uncompyle6 *compiled-python-file-pyc-or-pyo*
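To see why decompiling is cheap, and why a .pyc is tied to one interpreter version, it helps to look inside the file. The following sketch assumes CPython 3.7 or later, where a .pyc starts with a 16-byte header (a 4-byte magic number identifying the bytecode version, followed by flags and source metadata) and the rest is a marshalled code object. The function name load_pyc and the sample module secret.py are hypothetical.

```python
import dis
import importlib.util
import marshal
import pathlib
import py_compile
import tempfile

HEADER_SIZE = 16  # assumption: CPython 3.7+ layout (magic, flags, 8 bytes of source metadata)


def load_pyc(path):
    """Return (magic, code_object) read from a .pyc file."""
    data = pathlib.Path(path).read_bytes()
    magic = data[:4]
    if magic != importlib.util.MAGIC_NUMBER:
        # This is the version lock-in: another interpreter release uses a different magic.
        raise ValueError("pyc was built for a different interpreter version")
    return magic, marshal.loads(data[HEADER_SIZE:])


def demo():
    """Compile a small module, then recover names and opcodes from the .pyc only."""
    with tempfile.TemporaryDirectory() as tmp:
        src = pathlib.Path(tmp) / "secret.py"
        src.write_text("def price(x):\n    return x * 3 + 1\n")
        pyc = py_compile.compile(str(src), cfile=str(src.with_suffix(".pyc")))
        _, code = load_pyc(pyc)
    # The function body is a nested code object stored among the module's constants.
    func_code = next(c for c in code.co_consts if hasattr(c, "co_code"))
    ops = [ins.opname for ins in dis.get_instructions(func_code)]
    return code.co_names, func_code.co_name, func_code.co_varnames, ops
```

Function names, argument names, constants, and the full instruction stream all survive in the file, which is exactly the material a decompiler works from.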
2 Code obfuscation
If the code is obfuscated to the point that even its author struggles to read it, doesn't that also protect the source?
2.1 Idea
Since the goal is to confuse, we apply a series of transformations that gradually make the code harder to understand. We might start like this:
- Remove comments and docstrings. Without these explanations, some of the key logic is no longer so easy to follow.
- Change the indentation. Clean indentation is comfortable to read; indentation that is erratically long and short is certainly annoying.
- Add spaces between tokens. The effect is similar to changing the indentation.
- Rename functions, classes, and variables. Naming directly affects readability, and a mess of names is a major obstacle to comprehension.
- Insert useless code on blank lines. This is a smokescreen: code that does nothing but disrupt the reader's rhythm.
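Two of the transformations listed above, removing docstrings and renaming, can be sketched with the standard ast module. This is a deliberately naive illustration and not how any particular obfuscator works: it assumes Python 3.9+ (for ast.unparse), renames only plain names and function definitions found in a caller-supplied map, and ignores attributes, keyword arguments, and classes. The names Obfuscator and obfuscate are hypothetical. Comments disappear for free because the parser never keeps them.

```python
import ast


class Obfuscator(ast.NodeTransformer):
    """Strip docstrings and rename identifiers listed in rename_map."""

    def __init__(self, rename_map):
        self.rename_map = rename_map

    def _strip_docstring(self, node):
        self.generic_visit(node)
        first = node.body[0] if node.body else None
        if (isinstance(first, ast.Expr)
                and isinstance(first.value, ast.Constant)
                and isinstance(first.value.value, str)):
            # Keep the block syntactically valid if the docstring was its only statement.
            node.body = node.body[1:] or [ast.Pass()]
        return node

    visit_Module = visit_ClassDef = _strip_docstring

    def visit_FunctionDef(self, node):
        node.name = self.rename_map.get(node.name, node.name)
        return self._strip_docstring(node)

    def visit_Name(self, node):
        node.id = self.rename_map.get(node.id, node.id)
        return node


def obfuscate(source, rename_map):
    """Return source with docstrings and comments removed and names replaced."""
    tree = Obfuscator(rename_map).visit(ast.parse(source))
    return ast.unparse(tree)


SOURCE = '''
def always():
    """Always true."""  # explanatory comment
    return True

flag = always()
'''
```

For example, obfuscate(SOURCE, {"always": "O0O0O", "flag": "OO0OO"}) yields code that behaves the same but no longer carries the docstring, the comment, or the original names.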
2.2 Method
Method one: obfuscate with oxyry
http://pyob.oxyry.com/ is an online Python code obfuscator that makes obfuscation easy.
Suppose we have the following Python code, which involves a class, functions, and parameters:
# coding: utf-8


class A(object):
    """
    Description
    """

    def __init__(self, x, y, default=None):
        self.z = x + y
        self.default = default

    def name(self):
        return 'No Name'


def always():
    return True


num = 1
a = A(num, 999, 100)
a.name()
always()
After obfuscation with Oxyry, we get the following code:
class A (object ):#line:4
    ""#line:7
    def __init__ (O0O0O0OO00OO000O0 ,OO0O0OOOO0000O0OO ,OO0OO00O00OO00OOO ,OO000OOO0O000OOO0 =None ):#line:9
        O0O0O0OO00OO000O0 .z =OO0O0OOOO0000O0OO +OO0OO00O00OO00OOO #line:10
        O0O0O0OO00OO000O0 .default =OO000OOO0O000OOO0 #line:11
    def name (O000O0O0O00O0O0OO ):#line:13
        return 'No Name'#line:14
def always ():#line:17
    return True #line:18
num =1 #line:21
a =A (num ,999 ,100 )#line:22
a .name ()#line:23
always ()
The obfuscation mainly adjusts comments, parameter names, and spacing, which puts only a small obstacle in the reader's way.
Method two: obfuscate with the pyobfuscate library
pyobfuscate is a rather old Python obfuscation library, but it is still going strong.
The same Python code as above, obfuscated with pyobfuscate, looks like this:
# coding: utf-8
if 64 - 64: i11iIiiIii
if 65 - 65: O0 / iIii1I11I1II1% onestic - i1IIi
class o0OO00 ( object ) :
 if 78 - 78: i11i . oOooOoO0Oo0O
 if 10 - 10: IIiI1I11i11
 if 54 - 54: i11iIi1 - oOo0O0Ooo
 if 2 - 2: o0 * i1 * ii1IiI1i % OOooOOo / I11i / Ii1I
 def __init__ ( self , x , y , default = None ) :
  self . z = x + y
  self . default = default
  if 48 - 48: iII111i % IiII + I1Ii111 / ooOoO0o * Ii1I
 def name ( self ) :
  return 'No Name'
  if 46 - 46: * ooOoO0o I11i - onestic
  if 30 - 30: o0 - O0% o0 - onestic O0 * * onestic
def Oo0o ( ) :
 return True
 if 60 - 60: i1 + I1Ii111 - I11i / i1IIi
 if 40 - 40: oOooOoO0Oo0O / O0 + O0 ooOoO0o% * i1IIi
I1Ii11I1Ii1i = 1
Ooo = o0OO00 ( I1Ii11I1Ii1i , 999 , 100 )
Ooo . name ( )
Oo0o ()
# dd678faae9ac167bc83abf78e5cb2f3f0688d3a3
Compared with method one, the result of method two looks better. Besides renaming the class and function and adding spaces, the most obvious change is the insertion of several pieces of irrelevant code, which makes the result harder to read.
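The inserted statements of the form if 64 - 64: i11iIiiIii are harmless because the condition is always zero, so the undefined name in the body is never evaluated. A tiny check of that behaviour, using hypothetical names:

```python
def with_junk():
    # 64 - 64 is 0, so the branch body (an undefined name) never runs.
    if 64 - 64: this_name_is_never_defined
    return "still works"


def without_guard():
    # The same bare name outside a dead branch is evaluated and fails.
    this_name_is_never_defined
    return "unreachable"
```

Calling with_junk() returns normally, while without_guard() raises NameError, which is why the obfuscator always hides its noise behind a condition that can never be true.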
2.3 Advantages
- Simple, and raises the barrier to reading the source a little
- Good compatibility: as long as the source logic is compatible, the obfuscated code is too
2.4 Disadvantages
- Only single files can be obfuscated; multiple interdependent source files cannot be obfuscated together
- The structure of the code is unchanged and the bytecode can still be obtained, so it is not hard to crack
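The last point can be checked directly: renaming and respacing change what a human sees, not what the interpreter executes. The sketch below compiles an original function and a hand-obfuscated variant and compares their opcode sequences with the standard dis module. The helper opcodes and both sample sources are made up for this illustration.

```python
import dis


def opcodes(source, func_name):
    """Compile source, then list the opcode names of one function defined in it."""
    namespace = {}
    exec(compile(source, "<demo>", "exec"), namespace)
    code = namespace[func_name].__code__
    return [ins.opname for ins in dis.get_instructions(code)]


ORIGINAL = "def total(price, qty):\n    return price * qty + 1\n"
# Hand-obfuscated variant: renamed identifiers, odd spacing and indentation only.
OBFUSCATED = "def O0O ( OO0 , O0O0 ) :\n return OO0 * O0O0 + 1\n"
```

Comparing opcodes(ORIGINAL, "total") with opcodes(OBFUSCATED, "O0O") gives identical lists, so anyone working from the bytecode loses nothing but the original names.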
3 Using py2exe
3.1 Idea
py2exe is a tool that converts Python scripts into executables on the Windows platform. It works by compiling the source into .pyc files and packaging them, together with the necessary dependencies, into an executable.
If what we finally release is the binary that py2exe packs, doesn't that protect the source code?
3.2 Method
Packaging with py2exe is simple.
1) Write the entry file, named hello.py in this example:
print('Hello World')
2) Write setup.py:
from distutils.core import setup
import py2exe

setup(console=['hello.py'])
3) Generate the executable:
python setup.py py2exe
The resulting executable is located at dist\hello.exe.
3.3 Advantages
- Packages directly into an exe, which is easy to distribute and run
- Slightly higher barrier to cracking than .pyc
3.4 Disadvantages
- Poor compatibility: runs only on Windows
- The layout of the generated executable is clear and publicly known, so one can locate the .pyc files that correspond to the source and then decompile them
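To illustrate how little effort locating the bytecode takes, assume the packed output contains a zip archive of compiled modules (py2exe has traditionally shipped one named library.zip, but treat the exact name and location as an assumption that depends on the py2exe version and options). The standard zipfile module is enough to list the entries that could be handed to a decompiler. The function names are hypothetical, and the demo builds its own stand-in archive, not real py2exe output.

```python
import io
import zipfile


def list_compiled_modules(archive):
    """Return the names of .pyc/.pyo entries inside a zip archive (path or file object)."""
    with zipfile.ZipFile(archive) as bundle:
        return sorted(
            name for name in bundle.namelist()
            if name.endswith((".pyc", ".pyo"))
        )


def make_fake_bundle():
    """Build an in-memory stand-in for a packed library archive (not real py2exe output)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as bundle:
        bundle.writestr("hello.pyc", b"placeholder bytes")
        bundle.writestr("pkg/util.pyc", b"placeholder bytes")
        bundle.writestr("README.txt", "not bytecode")
    buffer.seek(0)
    return buffer
```

list_compiled_modules(make_fake_bundle()) returns only the two bytecode entries; against a real bundle the same call would take the path of the archive.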
4 Using Cython
4.1 Idea
Although the main purpose of Cython is to improve performance, the way it works (compile .py/.pyx files into .c files, then compile the .c files into .so (Unix) or .pyd (Windows) files) brings another benefit: the result is hard to crack.
4.2 Method
Developing with Cython is not complicated.
1) Write hello.pyx or hello.py:
def hello():
    print('hello')
2) Write setup.py:
from distutils.core import setup
from Cython.Build import cythonize

setup(name='Hello World app',
      ext_modules=cythonize('hello.pyx'))
3) Compile to .c, and then further to .so or .pyd:
python setup.py build_ext --inplace
Running python -c "from hello import hello;hello()" calls the hello() function directly from the generated binary.
4.3 Advantages
- The generated .so or .pyd binary is hard to crack
- Brings a performance improvement as well
4.4 Disadvantages
- Somewhat poorer compatibility: different operating system versions may require recompilation
- Most Python code is supported, but if some code turns out not to be, the cost of working around it is high
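The recompilation requirement shows up in the file names themselves. As a small illustration (separate from the Cython build steps), the standard library reports which extension-module suffixes the running interpreter will import. On CPython these usually embed the interpreter version and platform, which is why a binary built on one setup is often not importable on another. The function names here are hypothetical.

```python
import importlib.machinery
import sys


def extension_suffixes():
    """Suffixes the running interpreter accepts for compiled extension modules."""
    return list(importlib.machinery.EXTENSION_SUFFIXES)


def describe():
    """Summarise the interpreter and the binary suffixes it will load."""
    return {
        "python": "%d.%d" % sys.version_info[:2],
        "platform": sys.platform,
        "suffixes": extension_suffixes(),
    }
```

Running describe() on each target environment gives a quick idea of how many separate builds of the same module would have to be shipped.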