How to dynamically import a class or function from a file in Python

Suppose the module file is named data_used_to_test.py and is placed in the tests folder.

The folder structure is as follows:

project
    |-tests
        |-data_used_to_test.py

The file contains a test_class class:

class test_class():
    def test_func(self, arg):
        return "hello {}".format(arg)
        

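The code is all based on Python 3.6.4. Note that the imp module has been deprecated since Python 3.4 and was removed in Python 3.12.

  1. Use imp

    1. Find the module with imp.find_module (run import imp first)

      In [1]: file, pathname, description = imp.find_module('data_used_to_test', path=['tests/'])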
      
      In [2]: file
      Out[2]: <_io.TextIOWrapper name='tests/data_used_to_test.py' mode='r' encoding='utf-8'>
      
      In [3]: pathname
      Out[3]: 'tests/data_used_to_test.py'
      
      In [4]: description
      Out[4]: ('.py', 'r', 1)
    2. Load the found module into sys.modules with imp.load_module

      In [5]: mod = imp.load_module('test_load', file, pathname, description)
      In [6]: mod
      Out[6]: <module 'test_load' from 'tests/data_used_to_test.py'>

      sys.modules now contains an additional 'test_load' entry, and its value is mod.

    3. You can now access the objects in the module directly through mod

      In [7]: mod.test_class().test_func('x')
      Out[7]: 'hello x'

      The advantages of this method are:

      1. It is simple and easy to implement.
      2. pyc files need no special handling.

      The downside is that it leaves too little room for customization, which makes it unsuitable for use in a framework:

      1. The module's source code cannot be modified dynamically, so the exposed API must be very stable and rarely change.
      2. The names of the accessible objects are fixed from the start. Objects cannot be registered and looked up dynamically.
    4. Problem 2 above can be solved with getattr. Let's go a step further.

      In [8]: tmp = getattr(mod, 'test_class')
      
      In [9]: tmp
      Out[9]: test_load.test_class
      In [10]: tmp().test_func('l')
      Out[10]: 'hello l'

      In this way, the external module can provide a registration function that is called in advance to register its classes or functions in a global singleton variable. This gives both dynamic module loading and dynamic object access.
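The registration idea can be sketched roughly as follows. This is only an illustration: the REGISTRY dictionary, the register() hook and the 'greeter' key are assumed names, not something the original file defines, and the module is built in memory here just to keep the sketch self-contained.

```python
import types

# Hypothetical global singleton registry: maps a public name to an object.
REGISTRY = {}

# Stand-in for the external module's source. The register() function is an
# assumed convention, not part of the original data_used_to_test.py.
plugin_source = (
    "class test_class():\n"
    "    def test_func(self, arg):\n"
    "        return 'hello {}'.format(arg)\n"
    "\n"
    "def register(registry):\n"
    "    registry['greeter'] = test_class\n"
)

# Built in memory only for this sketch; in the article this would be the
# mod object returned by imp.load_module.
mod = types.ModuleType("test_load")
exec(compile(plugin_source, "<test_load>", "exec"), mod.__dict__)

# Look up the registration hook by name and let the module register itself.
register = getattr(mod, "register", None)
if callable(register):
    register(REGISTRY)

# The caller only needs the registry key, not the class name in the module.
greeting = REGISTRY["greeter"]().test_func("l")
print(greeting)  # hello l
```

The caller never hard-codes test_class; it only relies on the agreed name of the hook and on whatever keys the module chooses to register.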

      But this still does not solve problem 1, so another way of loading modules is needed.
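For readers on a Python version without imp, steps 1 to 3 of the imp approach can be approximated with importlib.util. This is a swapped-in technique, not what the article ran on 3.6.4; the helper name load_from_path is made up, and the example file is recreated in a temporary folder only so the sketch runs on its own.

```python
import importlib.util
import os
import sys
import tempfile

# Recreate the example file in a temporary folder (illustration only).
tmp_dir = tempfile.mkdtemp()
file_path = os.path.join(tmp_dir, "data_used_to_test.py")
with open(file_path, "w") as f:
    f.write(
        "class test_class():\n"
        "    def test_func(self, arg):\n"
        "        return 'hello {}'.format(arg)\n"
    )


def load_from_path(module_name, path):
    """Load the file at `path` as a module named `module_name`."""
    # Roughly what imp.find_module did: describe where the module lives.
    spec = importlib.util.spec_from_file_location(module_name, path)
    # Roughly what imp.load_module did: create, register and execute it.
    module = importlib.util.module_from_spec(spec)
    sys.modules[module_name] = module
    spec.loader.exec_module(module)
    return module


mod = load_from_path("test_load", file_path)
result = mod.test_class().test_func("x")
print(result)  # hello x
```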

  2. Explicitly define a load function

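    import imp
    import sys

    def load_module(module_name, module_content):
        # Reuse the module if it has already been loaded under this name.
        if module_name in sys.modules:
            return sys.modules[module_name]

        # Create an empty module object and register it in sys.modules.
        mod = sys.modules.setdefault(module_name, imp.new_module(module_name))
        mod.__file__ = module_name
        mod.__package__ = ''

        # For *.py: compile the source text into a code object.
        code = compile(module_content, module_name, 'exec')
        # For *.pyc: this branch has problems; it kept raising errors when I ran it.
        # code = marshal.loads(module_content[8:])

        # In Python 2 this was written: exec code in mod.__dict__
        # It simply runs the code in the module's namespace, so loading
        # untrusted content here is a possible code injection hole.
        exec(code, mod.__dict__)
        return mod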

    This function takes the module's name and source content, compiles the source into a code object with compile(), and executes it in the __dict__ of the newly created module object.
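As for the commented-out *.pyc branch, one likely cause of the errors is the header size. Skipping 8 bytes matches Python 2, but the pyc header is 12 bytes on Python 3.3 to 3.6 and 16 bytes on 3.7 and later, so marshal.loads would start reading in the middle of the header. A minimal sketch under that assumption follows; load_module_from_pyc is a made-up name, and types.ModuleType stands in for imp.new_module.

```python
import importlib.util
import marshal
import os
import py_compile
import sys
import tempfile
import types

# Bytes to skip before the marshalled code object:
# 16 on Python 3.7+, 12 on Python 3.3-3.6 (Python 2 used 8).
HEADER_SIZE = 16 if sys.version_info >= (3, 7) else 12


def load_module_from_pyc(module_name, pyc_bytes):
    """Build a module from the raw bytes of a .pyc file."""
    # A .pyc only works with the interpreter version that produced it.
    if pyc_bytes[:4] != importlib.util.MAGIC_NUMBER:
        raise ImportError("pyc was compiled by a different Python version")
    code = marshal.loads(pyc_bytes[HEADER_SIZE:])
    module = types.ModuleType(module_name)
    exec(code, module.__dict__)
    return module


# Compile the example source to a .pyc in a temporary folder (illustration only).
tmp_dir = tempfile.mkdtemp()
src_path = os.path.join(tmp_dir, "data_used_to_test.py")
with open(src_path, "w") as f:
    f.write(
        "class test_class():\n"
        "    def test_func(self, arg):\n"
        "        return 'hello {}'.format(arg)\n"
    )
pyc_path = py_compile.compile(
    src_path, cfile=os.path.join(tmp_dir, "data_used_to_test.pyc")
)

with open(pyc_path, "rb") as f:
    pyc_mod = load_module_from_pyc("test_pyc", f.read())

pyc_result = pyc_mod.test_class().test_func("x")
print(pyc_result)  # hello x
```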

    Here's how load_module is used:

    In [1]: module_content = open('tests/data_used_to_test.py').read()
    
    In [2]: mod = load_module('test_import', module_content)

    The rest is the same as in 1.4.

    This solves both problems raised in 1.3 at once.

    Because the source code is first read as an ordinary file, you can modify it however you like before registering the module in sys.modules.
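As a small illustration of modifying the source before loading it, the sketch below patches the greeting text and then loads the patched string. The names load_module_from_source and test_import_patched are made up, and types.ModuleType is used in place of imp.new_module so the sketch also runs where imp is unavailable.

```python
import sys
import types


def load_module_from_source(module_name, source):
    """Compile `source` and execute it in a freshly created module."""
    if module_name in sys.modules:
        return sys.modules[module_name]
    module = types.ModuleType(module_name)
    module.__file__ = "<{}>".format(module_name)
    code = compile(source, module.__file__, "exec")
    sys.modules[module_name] = module
    try:
        exec(code, module.__dict__)
    except Exception:
        # Do not leave a half-initialized module registered.
        del sys.modules[module_name]
        raise
    return module


# Stand-in for open('tests/data_used_to_test.py').read().
source = (
    "class test_class():\n"
    "    def test_func(self, arg):\n"
    "        return 'hello {}'.format(arg)\n"
)

# Any text transformation can be applied before the code is executed.
patched_source = source.replace("hello", "goodbye")

patched_mod = load_module_from_source("test_import_patched", patched_source)
patched_result = patched_mod.test_class().test_func("x")
print(patched_result)  # goodbye x
```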

    Pocsuite uses this approach, although I suspect they did not fully understand it at the time.

    Of course, there are still shortcomings.

    1. Both methods above only support simple modules. They are not integrated with the normal import statement, and higher-level constructs such as packages are not supported.
    2. They are not cool enough.

  3. Custom importer

    The Python Cookbook gives two ways to customize importers:

    1. Create a meta path importer
    2. Write a hook that is embedded directly in the sys.path machinery

    I haven't fully understood this yet, and PEP 302 is too long. As far as I can tell, it is done by subclassing the abstract base classes in importlib.abc.
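To make the first option a little more concrete, here is a minimal sketch of a meta path importer built on importlib.abc. It is not the Python Cookbook's recipe. The SOURCES dictionary, the DictFinder and DictLoader classes and the module name virtual_mod are all hypothetical, and the modules are served from in-memory strings only to keep the example short.

```python
import importlib.abc
import importlib.util
import sys

# Hypothetical in-memory "repository": module name -> source code.
SOURCES = {
    "virtual_mod": "def greet(name):\n    return 'hello {}'.format(name)\n",
}


class DictLoader(importlib.abc.Loader):
    """Executes the source stored in SOURCES inside the new module."""

    def create_module(self, spec):
        return None  # Let the import machinery create a default module.

    def exec_module(self, module):
        source = SOURCES[module.__name__]
        code = compile(source, "<{}>".format(module.__name__), "exec")
        exec(code, module.__dict__)


class DictFinder(importlib.abc.MetaPathFinder):
    """Tells the import system which names this importer can provide."""

    def find_spec(self, fullname, path, target=None):
        if fullname not in SOURCES:
            return None  # Not ours; let the other finders try.
        return importlib.util.spec_from_loader(fullname, DictLoader())


# Appended last, so the normal finders are consulted first.
sys.meta_path.append(DictFinder())

# The ordinary import statement now works for the virtual module.
import virtual_mod

virtual_result = virtual_mod.greet("x")
print(virtual_result)  # hello x
```

Unlike the two earlier methods, this plugs into the regular import statement, which addresses the first shortcoming listed at the end of method 2.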
