Suppose the module file is named data_used_to_test.py and is placed in the tests folder.
The folder structure is as follows:
project
|-tests
  |-data_used_to_test.py
The file contains a test_class class:
class test_class():
    def test_func(self, arg):
        return "hello {}".format(arg)
All of the code is based on Python 3.6.4.
Use imp
Find modules with imp.find_module
In [1]: import imp
   ...: file, pathname, description = imp.find_module('data_used_to_test', path=['tests/'])
In [2]: file
Out[2]: <_io.TextIOWrapper name='tests/data_used_to_test.py' mode='r' encoding='utf-8'>
In [3]: pathname
Out[3]: 'tests/data_used_to_test.py'
In [4]: description
Out[4]: ('.py', 'r', 1)
Load found modules into sys.modules with imp.load_module
In [5]: mod = imp.load_module('test_load', file, pathname, description)
In [6]: mod
Out[6]: <module 'test_load' from 'tests/data_used_to_test.py'>
At this point, sys.modules has an additional 'test_load' entry whose value is mod.
You can now access the objects in the module directly through mod:
In [7]: mod.test_class().test_func('x')
Out[7]: 'hello x'
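For reference, imp has been deprecated since Python 3.4 and was removed in Python 3.12, so on newer interpreters the same find-and-load steps can be sketched with importlib instead. This is only a minimal sketch, not the method used in the session above: load_from_path is a hypothetical helper name, and the module file is written to a temporary directory so that the example is self-contained.

```python
import importlib.util
import os
import sys
import tempfile

# Same content as tests/data_used_to_test.py in the text.
SOURCE = '''
class test_class():
    def test_func(self, arg):
        return "hello {}".format(arg)
'''

def load_from_path(module_name, path):
    """Hypothetical helper: load the file at `path` under the name `module_name`."""
    # Roughly the counterpart of imp.find_module: describe where the module lives.
    spec = importlib.util.spec_from_file_location(module_name, path)
    # Roughly the counterpart of imp.load_module: create, register, execute.
    mod = importlib.util.module_from_spec(spec)
    sys.modules[module_name] = mod
    spec.loader.exec_module(mod)
    return mod

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "data_used_to_test.py")
    with open(path, "w") as f:
        f.write(SOURCE)
    mod = load_from_path("test_load", path)

print(mod.test_class().test_func("x"))  # hello x
print(sys.modules["test_load"] is mod)  # True
```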
The advantages of loading with imp are:
- It is simple and easy to implement.
- No special handling is needed for .pyc files.
The downside is that it is not very customizable, which makes it unsuitable for use in a framework:
- The module's source code cannot be modified dynamically, so the exposed API must be very stable and rarely change.
- The objects that can be accessed have names fixed from the start; they cannot be registered and accessed dynamically.
Problem 2 above can be solved with getattr. Let's go a step further:
In [8]: tmp = getattr(mod, 'test_class')
In [9]: tmp
Out[9]: test_load.test_class
In [10]: tmp().test_func('l')
Out[10]: 'hello l'
In this way, the external module can call a registration function prepared in advance to register its classes or functions in a global singleton variable, which gives you dynamic module loading and dynamic object access.
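The registration idea can be sketched as follows. This is an illustration under assumptions, not code from any particular framework: REGISTRY, register, load_plugin and PLUGIN_SOURCE are all hypothetical names, and the external module is stood in for by a source string executed with exec purely to keep the example in one file. How the module gets loaded is not the point here; the point is that it announces its objects under a name chosen at load time.

```python
import types

# Hypothetical global singleton: maps a registered name to a class or function.
REGISTRY = {}

def register(name):
    """Return a decorator that stores the decorated object in REGISTRY under `name`."""
    def decorator(obj):
        REGISTRY[name] = obj
        return obj
    return decorator

# Stand-in for the source of an external module. It calls the registration
# function that the loading side prepared in advance.
PLUGIN_SOURCE = '''
@register("greeter")
class test_class():
    def test_func(self, arg):
        return "hello {}".format(arg)
'''

def load_plugin(module_name, source):
    """Hypothetical loader: run `source` in a fresh module that can see `register`."""
    mod = types.ModuleType(module_name)
    mod.register = register  # expose the registration hook to the plugin code
    exec(compile(source, module_name, "exec"), mod.__dict__)
    return mod

load_plugin("plugin_demo", PLUGIN_SOURCE)

# The caller no longer needs to know the class name, only the registered key.
greeter = REGISTRY["greeter"]()
print(greeter.test_func("x"))  # hello x
```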
But this still does not solve problem 1, so another way of loading modules is needed.
Explicitly define a load function
import imp
import sys

def load_module(module_name, module_content):
    # Reuse the module if it has already been loaded
    if module_name in sys.modules:
        return sys.modules[module_name]
    mod = sys.modules.setdefault(module_name, imp.new_module(module_name))
    mod.__file__ = module_name
    mod.__package__ = ''
    # if *.py
    code = compile(module_content, module_name, 'exec')
    # if *.pyc: this has a problem, I kept getting errors when running it
    # code = marshal.loads(module_content[8:])
    exec(code, mod.__dict__)
    # In Python 2 this is written: exec code in mod.__dict__
    # This simply runs the code in that namespace, so there may be a code injection vulnerability here
    return mod
This function takes the module's name and source content, compiles the source into a code object with compile(), and executes it in the __dict__ of a newly created module object.
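One likely reason the commented-out .pyc branch fails is the header size. As far as I know, a .pyc file starts with an 8-byte header only on Python 2 and early Python 3; from Python 3.3 to 3.6 the header is 12 bytes (magic number, source mtime, source size), and from 3.7 on it is 16 bytes, so slicing at [8:] hands marshal the wrong bytes. A hedged sketch of that branch follows; HEADER_SIZE and load_module_from_pyc are hypothetical names, and the .pyc must have been produced by the same interpreter version that loads it.

```python
import marshal
import os
import py_compile
import sys
import tempfile
import types

# Assumption: 16-byte header on Python 3.7+, 12 bytes on Python 3.3 to 3.6.
HEADER_SIZE = 16 if sys.version_info >= (3, 7) else 12

def load_module_from_pyc(module_name, pyc_bytes):
    """Hypothetical helper: build a module from the raw bytes of a .pyc file."""
    code = marshal.loads(pyc_bytes[HEADER_SIZE:])  # skip the header, not just 8 bytes
    mod = types.ModuleType(module_name)
    exec(code, mod.__dict__)
    return mod

with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "data_used_to_test.py")
    with open(src, "w") as f:
        f.write(
            "class test_class():\n"
            "    def test_func(self, arg):\n"
            "        return 'hello {}'.format(arg)\n"
        )
    # Compile the source to a .pyc file with the running interpreter.
    pyc_path = py_compile.compile(src, cfile=os.path.join(tmp, "data_used_to_test.pyc"))
    with open(pyc_path, "rb") as f:
        pyc_mod = load_module_from_pyc("test_import_pyc", f.read())

print(pyc_mod.test_class().test_func("x"))  # hello x
```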
Here's how this function is used:
In [1]: module_content = open('tests/data_used_to_test.py').read()
In [2]: mod = load_module('test_import', module_content)
The rest is the same as in 1.4.
This solves both problems raised in 1.3 at the same time.
Because you read the source code as a normal file first, you can make various modifications and then register it in sys.modules.
Pocsuite uses this approach, although I suspect they did not fully understand it at the time.
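To make the "modify first, then register" point concrete, here is a small sketch. It swaps imp.new_module for types.ModuleType so that it also runs on interpreters where imp is no longer available; load_module_from_source is a hypothetical name, and the string replacement is just an arbitrary example of a modification.

```python
import sys
import types

def load_module_from_source(module_name, source):
    """Hypothetical variant of load_module that uses types.ModuleType instead of imp."""
    if module_name in sys.modules:
        return sys.modules[module_name]
    mod = types.ModuleType(module_name)
    mod.__file__ = module_name
    code = compile(source, module_name, "exec")
    sys.modules[module_name] = mod
    try:
        exec(code, mod.__dict__)
    except BaseException:
        # Do not leave a half-initialized module registered.
        del sys.modules[module_name]
        raise
    return mod

# Same content as tests/data_used_to_test.py, inlined to keep the sketch self-contained.
source = '''
class test_class():
    def test_func(self, arg):
        return "hello {}".format(arg)
'''

# Since the source is just a string, it can be rewritten before it is loaded.
patched = source.replace('"hello {}"', '"HELLO {}"')
patched_mod = load_module_from_source("test_import_patched", patched)

print(patched_mod.test_class().test_func("x"))  # HELLO x
```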
Of course there are still shortcomings.
- Both methods above only support simple modules. They do not plug into the normal import statement, and higher-level constructs such as packages are not supported.
- They are not cool enough.
Custom importer
The Python Cookbook gives two ways to customize importing:
- Create a meta path importer
- Write a hook that plugs directly into the handling of the sys.path variable
I have not understood this yet; the PEP 302 document is too long. As far as I can tell, it is done by subclassing the classes in importlib.abc.
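For orientation, a meta path importer of the first kind might look roughly like the sketch below. It is not the Python Cookbook's code: InMemoryFinder and the module name in_memory_demo are hypothetical, and the importer simply serves module source from a dict. What it does show is the piece missing from the earlier approaches, namely that the module becomes available through an ordinary import statement.

```python
import importlib.abc
import importlib.util
import sys

class InMemoryFinder(importlib.abc.MetaPathFinder, importlib.abc.Loader):
    """Hypothetical importer that finds and loads modules from a dict of source strings."""

    def __init__(self, sources):
        self.sources = sources  # maps module name -> source code

    def find_spec(self, fullname, path=None, target=None):
        # Returning None tells the import system to try the next finder.
        if fullname not in self.sources:
            return None
        return importlib.util.spec_from_loader(fullname, self)

    def create_module(self, spec):
        return None  # let the import system create a default module object

    def exec_module(self, module):
        source = self.sources[module.__name__]
        exec(compile(source, module.__name__, "exec"), module.__dict__)

SOURCE = '''
class test_class():
    def test_func(self, arg):
        return "hello {}".format(arg)
'''

# Appended last, so the standard finders are still consulted first.
sys.meta_path.append(InMemoryFinder({"in_memory_demo": SOURCE}))

import in_memory_demo  # resolved by InMemoryFinder

print(in_memory_demo.test_class().test_func("x"))  # hello x
```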