【python】Study notes on Python closures (closure), decorators (Decorator), and the registration mechanism (Registry)


Foreword

I recently started working with the MMCV framework and found that it uses a registration mechanism (Registry) to make functional modules such as the backbone, optimizer, and learning strategy easy to swap, which helps manage the components of a deep learning framework. It also lets users assemble their own networks flexibly through the exposed interfaces. To understand how the registration mechanism works, I studied it, organized my notes, and am sharing them here.

This article is divided into four parts:
The first part briefly introduces Python closures, which are the basis of decorators.
The second part briefly introduces Python decorators; the registration mechanism is one application of decorators.
The third part briefly introduces the registration mechanism, with Python code examples.
The fourth part briefly introduces the registration mechanism code in the MMCV framework, with code comments.

My understanding is still shallow, so if anything here differs from your understanding, please point it out. Finally, if you find this helpful, feel free to give it a like.

The reference materials are as follows:
1. Python function decorators | rookie tutorial
2. Understanding the concept of closure, author: alpha_panda.
3. Python3 namespace and scope
4. mmsegment custom model (8), author: alex1801.
5. Registry registration mechanism, author: ~HardBoy~.
6. To understand Python decorators, it is enough to read this article, author: Liu Zhijun.

1. Premise concepts

Before starting, two concepts need to be understood.
(1) In Python, everything is an object, including functions. A function can therefore be assigned to a variable and called through that variable. (A function name is itself just a variable, so it can also be rebound to another function.)

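def base():
    print("This is the base function")
def derived():
    print("This is the derived function")

# Call through a variable
Var = base
Var()
# A function name can also be rebound to another function and then called
derived = base
derived()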

Since functions are objects, it follows that a function can also be passed as an argument to another function.

def base():
    print("This is the base function")
def derived(func):
    func()
    print("This is the derived function")

# Pass the function as an argument and call it inside derived
derived(base)

(2) An internal function can be defined in an external function, and the internal function can be returned.

def outsideFunc():
    print("This is the outer function")
    # Define the inner function inside the outer function
    def insideFunc():
        print("This is the inner function")
    return insideFunc

Var = outsideFunc()
print(Var)
Var()


2. Python closure (closure)

The introduction to closures here is brief. For details, see the blog post Understanding the concept of closure, which explains closures in depth and lists the points where closure functions are prone to errors. Since this article focuses on the registration mechanism, only a short introduction is given.

Concept: A closure function is an inner function that refers to variables defined outside its own scope, in an enclosing function, and can still use them after the enclosing function has returned. Such variables are called free variables. A free variable is different from a C++ static variable and from a global variable: it is bound to the closure function itself.

Note two points:
1. The free variables of different closure functions do not interfere with each other.
2. Changes to a closure function's free variables carry over to the next call of the same closure function.

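def outside():
    # obj is the free variable here
    obj = []
    # inside is the closure function
    def inside(name):
        obj.append(name)
        print(obj)
    return inside

Lis = outside()
Lis("1")
# Changes to the closure's free variable carry over to the next call of the same closure.
Lis("2")
Lis("3")
print("-"*20)
# The free variables of different closures do not affect each other.
Lis2 = outside()
Lis2("4")
Lis2("5")
Lis2("6")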


As can be seen from the above example, obj is a free variable for the inside closure function. Every time inside is called, a new value is appended to obj, and the updated obj is what the next call of the same closure sees.
Relative to inside, the free variable obj behaves somewhat like a "global variable", but its scope is limited to the surrounding environment that contains both the free variable and the closure function. In the above example, that is the scope of outside. Python calls this the enclosing scope of the closure function (Enclosing).
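
To check that obj really is bound to each closure rather than shared like a global, one can inspect the closure's attributes. The following is a minimal sketch that reuses the outside/inside example above (inside returns the list here instead of printing it); the names first and second are introduced only for illustration.

```python
def outside():
    obj = []  # free variable captured by inside

    def inside(name):
        obj.append(name)
        return obj

    return inside


first = outside()
second = outside()
first("1")
first("2")
second("4")

# Names of the free variables that inside refers to
print(first.__code__.co_freevars)            # ('obj',)
# Each closure keeps its own cell holding its own list
print(first.__closure__[0].cell_contents)    # ['1', '2']
print(second.__closure__[0].cell_contents)   # ['4']
```

The two closures hold separate cells, which is why calls on one never show up in the other.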

Supplementary knowledge:
Python has four scopes:
Local scope (Local): the innermost layer, containing local variables, for example inside a function or method.
Enclosing scope (Enclosing): the scope of an enclosing function, containing variables that are neither local nor global. For example, if function A contains a nested function B, then a name used in B but assigned in A belongs to A's scope and is nonlocal from B's point of view.
Global scope (Global): the outermost layer of the current script, for example the global variables of the current module.
Built-in scope (Built-in): contains built-in names such as built-in functions and exceptions, and is searched last.
Names are looked up in the order Local, then Enclosing, then Global, then Built-in (the LEGB rule). If Python cannot find a name in the local scope, it looks in the enclosing scopes (for example, those of closures); if it is still not found, it looks in the global scope, and finally in the built-in scope. For more detail, see the blog post Python3 namespace and scope.
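
The lookup order can be observed directly. The following is a small sketch, not taken from the referenced blog; all function names in it are made up for illustration.

```python
x = "global"


def outer():
    x = "enclosing"

    def inner_local():
        x = "local"
        return x       # found in the Local scope

    def inner_enclosing():
        return x       # not local, so found in the Enclosing scope

    return inner_local(), inner_enclosing()


def use_global():
    return x           # no local or enclosing binding, so Global is used


def use_builtin():
    return len("abc")  # len is not defined in this module, so Built-in is used


print(outer())         # ('local', 'enclosing')
print(use_global())    # global
print(use_builtin())   # 3
```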

3. Python decorator (Decorator)

With closures understood, let's look at what a decorator really is. A decorator is a higher-order function that wraps a function using the closure principle and returns the wrapped function. The purpose of wrapping is to add functionality while keeping the original function unchanged. See the following example.

def add(arr1, arr2):
    return arr1 + arr2
def decorator(func):
    def wrapper(arr1, arr2):
        print("New behavior, for example adding 1 to each argument")
        arr1 += 1
        arr2 += 1
        return func(arr1, arr2)
    return wrapper

print(add(2, 3))

add = decorator(add)
print(add(2, 3))


As the code shows, we want to add a new feature, incrementing each argument by one, while keeping the original addition behavior. Normally this would mean editing the original function, which changes its structure. A decorator lets us add the new feature while leaving the original function intact. Here decorator is the decorator function, and add = decorator(add) is how it is applied. Python provides the @ syntax to simplify this statement, as follows:

def decorator(func):
    def wrapper(arr1, arr2):
        print("New behavior, for example adding 1 to each argument")
        arr1 += 1
        arr2 += 1
        return func(arr1, arr2)
    return wrapper

@decorator
# @decorator is equivalent to add = decorator(add)
def add(arr1, arr2):
    return arr1 + arr2

print(add(2, 3))

As shown, @decorator is equivalent to add = decorator(add): the "@" syntax passes the decorated function as an argument to the decorator function and rebinds the name to the result.
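
The wrapper above only works for functions that take exactly two positional arguments. A common way to make a decorator reusable is to forward *args and **kwargs to the original function. The following is a minimal sketch of that idea; log_call, greet, and the calls list are hypothetical names that do not appear in the original notes.

```python
calls = []  # records the name of every decorated function that is called


def log_call(func):
    def wrapper(*args, **kwargs):
        # Extra behavior added before delegating to the original function
        calls.append(func.__name__)
        return func(*args, **kwargs)
    return wrapper


@log_call
def add(arr1, arr2):
    return arr1 + arr2


@log_call
def greet(name, punctuation="!"):
    return "hello " + name + punctuation


print(add(2, 3))                       # 5
print(greet("mmcv", punctuation="?"))  # hello mmcv?
print(calls)                           # ['add', 'greet']
```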

A small addition:
After decoration, the name add actually refers to the wrapper function, so its metadata, such as __name__, is now that of wrapper. To keep the original metadata, use the @wraps(func) decorator from the functools module, which copies the metadata of the original function onto the wrapper.

# Without @wraps
def decorator(func):
    def wrapper(arr1, arr2):
        print("New behavior, for example adding 1 to each argument")
        arr1 += 1
        arr2 += 1
        return func(arr1, arr2)
    return wrapper
def add(arr1, arr2):
    return arr1 + arr2
# The original add function
print(add.__name__)
# After applying the decorator
add = decorator(add)
print(add.__name__)


from functools import wraps
# With @wraps
def decorator(func):
    @wraps(func)
    # @wraps(func) is equivalent to wrapper = wraps(func)(wrapper)
    def wrapper(arr1, arr2):
        print("New behavior, for example adding 1 to each argument")
        arr1 += 1
        arr2 += 1
        return func(arr1, arr2)
    return wrapper

def add(arr1, arr2):
    return arr1 + arr2

# The original add function
print(add.__name__)
# After applying the decorator
add = decorator(add)
print(add.__name__)
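
Besides __name__, functools.wraps also copies the docstring and stores the undecorated function on the wrapper's __wrapped__ attribute. The following short sketch shows this; the docstring text is added here purely for illustration.

```python
from functools import wraps


def decorator(func):
    @wraps(func)
    def wrapper(arr1, arr2):
        return func(arr1 + 1, arr2 + 1)
    return wrapper


@decorator
def add(arr1, arr2):
    """Return the sum of two numbers."""
    return arr1 + arr2


print(add.__name__)           # add
print(add.__doc__)            # Return the sum of two numbers.
# wraps also keeps a reference to the undecorated function
print(add.__wrapped__(2, 3))  # 5
print(add(2, 3))              # 7
```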


4. Registration mechanism (Registry)

Concept: The registration mechanism maps a string supplied by the user to the corresponding function or class, which makes projects easier to manage and use. The mapping can be built with Python decorators, which is also how MMCV does it.

There are three main steps to implement the registration mechanism:
(1) Write the registration mechanism class.
(2) Instantiate an object of that class, that is, build the registry.
(3) Add content to the registry through decorators, that is, register the content.

4.1 Write the registration mechanism class

class Registry:
    def __init__(self, name=None):
        # Name of the registry; defaults to "Registry" if none is given.
        if name is None:
            name = "Registry"
        self._name = name
        # Create the registry as a dictionary.
        self._obj_list = {}

    def __registry(self, obj):
        """
        Internal registration function.
        :param obj: the function or class object to register.
        :return:
        """
        # Raise an error if the target function or class is already registered; otherwise register it.
        assert (obj.__name__ not in self._obj_list.keys()), "{} already exists in {}".format(obj.__name__, self._name)
        self._obj_list[obj.__name__] = obj

    def registry(self, obj=None):
        """
        External registration function. There are two ways to register:
        1. call it as a decorator
        2. call it as a plain function

        :param obj: the function or class itself
        :return:
        """
        # 1. Called as a decorator
        if obj is None:
            def _no_obj_registry(func__or__class, *args, **kwargs):
                self.__registry(func__or__class)
                # The decorated name is rebound to this return value.
                return func__or__class

            return _no_obj_registry
        # 2. Called as a plain function
        self.__registry(obj)

    def get(self, name):
        """
        Get the function or class registered under the string name.
        :param name: name of the function or class
        :return: the corresponding function or class
        """
        assert (name in self._obj_list.keys()), "{} is not registered".format(name)
        return self._obj_list[name]

This registration mechanism class has three member functions, __registry, registry, and get, and two member variables, self._name and self._obj_list.

Member variables:
1. self._name: the name of this registry; defaults to Registry if not given.
2. self._obj_list: the registry itself, a dictionary that maps name strings to the corresponding functions or classes.
Member functions:
1. registry function: registers the given function or class in one of two ways. One is through the closure function _no_obj_registry, which captures self as a free variable and updates self._obj_list. The other is to pass the object directly as the obj argument of registry. Both ways delegate to the __registry function.
2. __registry function: registers the object passed in; raises an error if its name is already registered, otherwise completes the registration.
3. get function: looks up the registry to map a string name to the corresponding function or class and returns it.

4.2 Create a registry

Based on the registration mechanism class, an object is instantiated, and this object is the registry we need.

# Create the registry
REGISTRY_LIST = Registry("REGISTRY_LIST")

4.3 Content Registration

Add content to the registry through decorators, that is, register the content. In the following example, the create_by_decorator function is registered through the statement @REGISTRY_LIST.registry().

@REGISTRY_LIST.registry() is equivalent to
create_by_decorator = REGISTRY_LIST.registry()(create_by_decorator),
that is, _no_obj_registry(create_by_decorator)

# Called as a decorator
@REGISTRY_LIST.registry()
# @REGISTRY_LIST.registry() is equivalent to create_by_decorator = REGISTRY_LIST.registry()(create_by_decorator), that is, _no_obj_registry(create_by_decorator)
def create_by_decorator():
    print("Function registered through the decorator")


def create_by_function():
    print("Registered directly through the registry function")
# Registration can also be done by passing the function directly to registry.
REGISTRY_LIST.registry(create_by_function)

# Get the function for a given name string
test1 = REGISTRY_LIST.get("create_by_decorator")
test1()
test2 = REGISTRY_LIST.get("create_by_function")
test2()
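
One reason a string-to-class mapping is useful is that an object can then be built from a plain configuration dictionary. The following is a rough sketch of that idea under stated assumptions: MiniRegistry is a simplified stand-in for the Registry class above, and build_from_config and TinyNet are hypothetical names made up here. MMCV has its own builder, which may differ from this.

```python
class MiniRegistry:
    """Simplified stand-in for the Registry class above."""

    def __init__(self, name):
        self._name = name
        self._obj_list = {}

    def registry(self):
        def _register(obj):
            self._obj_list[obj.__name__] = obj
            return obj
        return _register

    def get(self, name):
        return self._obj_list[name]


BACKBONES = MiniRegistry("BACKBONES")


@BACKBONES.registry()
class TinyNet:
    def __init__(self, depth=2):
        self.depth = depth


def build_from_config(cfg, registry):
    # Hypothetical helper: "type" selects the class, the rest become kwargs
    args = dict(cfg)
    cls = registry.get(args.pop("type"))
    return cls(**args)


model = build_from_config({"type": "TinyNet", "depth": 4}, BACKBONES)
print(type(model).__name__, model.depth)  # TinyNet 4
```

With this pattern, swapping a module only requires changing the "type" string in the configuration.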


5. Registration mechanism of MMCV

I run the mmcv framework under Windows, so my registry.py file is located at F:\SegFormer-master\mmcv-1.2.7\mmcv\utils\registry.py. Readers can locate the registry.py file of mmcv in their own projects; on Linux it should be under the installed mmcv package. The overall code is as follows. We'll take it apart later for a closer look.

import inspect
import warnings
from functools import partial

from .misc import is_seq_of

class Registry:
    """A registry to map strings to classes.

    Args:
        name (str): Registry name.
    """

    def __init__(self, name):
        self._name = name
        self._module_dict = dict()

    def __len__(self):
        return len(self._module_dict)

    def __contains__(self, key):
        return self.get(key) is not None

    def __repr__(self):
        format_str = self.__class__.__name__ + \
                     f'(name={self._name}, ' \
                     f'items={self._module_dict})'
        return format_str

    @property
    def name(self):
        return self._name

    @property
    def module_dict(self):
        return self._module_dict

    def get(self, key):
        """Get the registry record.

        Args:
            key (str): The class name in string format.

        Returns:
            class: The corresponding class.
        """
        return self._module_dict.get(key, None)

    def _register_module(self, module_class, module_name=None, force=False):
        if not inspect.isclass(module_class):
            raise TypeError('module must be a class, '
                            f'but got {type(module_class)}')

        if module_name is None:
            module_name = module_class.__name__
        if isinstance(module_name, str):
            module_name = [module_name]
        else:
            assert is_seq_of(
                module_name,
                str), ('module_name should be either of None, an '
                       f'instance of str or list, but got {type(module_name)}')
        for name in module_name:
            if not force and name in self._module_dict:
                raise KeyError(f'{name} is already registered '
                               f'in {self.name}')
            self._module_dict[name] = module_class

    def deprecated_register_module(self, cls=None, force=False):
        warnings.warn(
            'The old API of register_module(module, force=False) '
            'is deprecated and will be removed, please use the new API '
            'register_module(name=None, force=False, module=None) instead.')
        if cls is None:
            return partial(self.deprecated_register_module, force=force)
        self._register_module(cls, force=force)
        return cls

    def register_module(self, name=None, force=False, module=None):
        """Register a module.

        A record will be added to `self._module_dict`, whose key is the class
        name or the specified name, and value is the class itself.
        It can be used as a decorator or a normal function.

        Example:
            >>> backbones = Registry('backbone')
            >>> @backbones.register_module()
            >>> class ResNet:
            >>>     pass

            >>> backbones = Registry('backbone')
            >>> @backbones.register_module(name='mnet')
            >>> class MobileNet:
            >>>     pass

            >>> backbones = Registry('backbone')
            >>> class ResNet:
            >>>     pass
            >>> backbones.register_module(ResNet)

        Args:
            name (str | None): The module name to be registered. If not
                specified, the class name will be used.
            force (bool, optional): Whether to override an existing class with
                the same name. Default: False.
            module (type): Module class to be registered.
        """
        if not isinstance(force, bool):
            raise TypeError(f'force must be a boolean, but got {type(force)}')
        # NOTE: This is a walkaround to be compatible with the old api,
        # while it may introduce unexpected bugs.
        if isinstance(name, type):
            return self.deprecated_register_module(name, force=force)

        # use it as a normal method: x.register_module(module=SomeClass)
        if module is not None:
            self._register_module(
                module_class=module, module_name=name, force=force)
            return module

        # raise the error ahead of time
        if not (name is None or isinstance(name, str)):
            raise TypeError(f'name must be a str, but got {type(name)}')

        # use it as a decorator: @x.register_module()
        def _register(cls):
            self._register_module(
                module_class=cls, module_name=name, force=force)
            return cls

        return _register

As can be seen, MMCV's Registry class has two member variables, self._name and self._module_dict, and six main member functions: name, module_dict, get, _register_module, deprecated_register_module, and register_module. Each is briefly introduced below.

Member variables:
1. self._name: the name of this registry.
2. self._module_dict: the registry itself, a dictionary that maps name strings to the corresponding classes.
Member functions:
1. The name and module_dict functions are both decorated with @property. Python's built-in @property decorator turns a method into an attribute-style access.
2. register_module function: completes the registration of the target class. The code is as follows, with comments explaining each step.

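    def register_module(self, name=None, force=False, module=None):
        """Register a module.

        A record will be added to self._module_dict, whose key is the class
        name or the specified name, and whose value is the class itself.
        It can be used as a decorator or called as a normal function.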

        Example:
            >>> backbones = Registry('backbone')
            >>> @backbones.register_module()
            >>> class ResNet:
            >>>     pass

            >>> backbones = Registry('backbone')
            >>> @backbones.register_module(name='mnet')
            >>> class MobileNet:
            >>>     pass

            >>> backbones = Registry('backbone')
            >>> class ResNet:
            >>>     pass
            >>> backbones.register_module(ResNet)

        Args:
            name (str | None): The module name to be registered. If not specified, the class name will be used.
            force (bool, optional): Whether to override an existing class with the same name. Default: False.
            module (type): Module class to be registered.
        """
        #---------------------------------------------------------------------
        # Check that the force argument is valid.
        #---------------------------------------------------------------------
        if not isinstance(force, bool):
            raise TypeError(f'force must be a boolean, but got {type(force)}')
        #---------------------------------------------------------------------
        # Note: this is a workaround for compatibility with the old API,
        # and it may introduce unexpected bugs.
        #---------------------------------------------------------------------
        if isinstance(name, type):
            return self.deprecated_register_module(name, force=force)
        #---------------------------------------------------------------------
        # If module is given, register it directly and return module.
        #---------------------------------------------------------------------
        # use it as a normal method: x.register_module(module=SomeClass)
        if module is not None:
            self._register_module(
                module_class=module, module_name=name, force=force)
            return module
        #---------------------------------------------------------------------
        # Check that the name argument is valid.
        #---------------------------------------------------------------------
        # raise the error ahead of time
        if not (name is None or isinstance(name, str)):
            raise TypeError(f'name must be a str, but got {type(name)}')
        #---------------------------------------------------------------------
        # If module is not given, register through the decorator form.
        #---------------------------------------------------------------------
        # use it as a decorator: @x.register_module()
        def _register(cls):
            self._register_module(
                module_class=cls, module_name=name, force=force)
            return cls

        return _register

3. _register_module function: the concrete implementation of registering a class. The code is as follows, with comments added.

    def _register_module(self, module_class, module_name=None, force=False):
        """
        Concrete implementation of registration.
        :param module_class: the class to register
        :param module_name: the name (or names) to register it under. Default: None
        :param force: whether to override an existing entry. Default: False
        """
        #---------------------------------------------------------------------
        # Check that module_class is a class; raise an error if it is not.
        #---------------------------------------------------------------------
        if not inspect.isclass(module_class):
            raise TypeError('module must be a class, '
                            f'but got {type(module_class)}')
        #---------------------------------------------------------------------
        # If module_name is not given, default to the name of the class itself.
        #---------------------------------------------------------------------
        if module_name is None:
            module_name = module_class.__name__
        #---------------------------------------------------------------------
        # Check whether module_name is a string or a sequence of strings.
        #---------------------------------------------------------------------
        if isinstance(module_name, str):
            module_name = [module_name]
        else:
            assert is_seq_of(
                module_name,
                str), ('module_name should be either of None, an '
                       f'instance of str or list, but got {type(module_name)}')
        #---------------------------------------------------------------------
        # Register the class under every name in the list.
        #---------------------------------------------------------------------
        for name in module_name:
            if not force and name in self._module_dict:
                raise KeyError(f'{name} is already registered '
                               f'in {self.name}')
            # Complete the registration
            self._module_dict[name] = module_class

4. get function: looks up the registry to map a string name to the corresponding class and returns it, or None if the name is not registered.

    def get(self, key):
        """Get the registry record.

        Args:
            key (str): The key, which must be a string.

        Returns:
            class: The class corresponding to the key.
        """
        return self._module_dict.get(key, None)

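5. deprecated_register_module function: the deprecated registration path. I do not fully understand it, but judging from the code it keeps the old API register_module(module, force=False) working: it emits a deprecation warning and then registers the class, and when it is called without a class it returns a partial of itself so that it can still be used as a decorator.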

    def deprecated_register_module(self, cls=None, force=False):
        
        warnings.warn(
            'The old API of register_module(module, force=False) '
            'is deprecated and will be removed, please use the new API '
            'register_module(name=None, force=False, module=None) instead.')
        if cls is None:
            return partial(self.deprecated_register_module, force=force)
        self._register_module(cls, force=force)
        return cls
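
The `partial(...)` return in the `cls is None` branch is what lets one function serve both as a bare decorator and as a decorator called with arguments. The following is a simplified stand-alone sketch of that pattern, not MMCV's code; old_register and the registered dictionary are hypothetical names used in place of the method and self._module_dict.

```python
from functools import partial

registered = {}  # hypothetical stand-in for self._module_dict


def old_register(cls=None, force=False):
    if cls is None:
        # Called as old_register(force=True): no class yet, so return a
        # callable that remembers force and waits for the class.
        return partial(old_register, force=force)
    if not force and cls.__name__ in registered:
        raise KeyError(cls.__name__ + " is already registered")
    registered[cls.__name__] = cls
    return cls


@old_register              # bare decorator: the class is passed directly
class ResNet:
    pass


first = ResNet


@old_register(force=True)  # called first, then the partial receives the class
class ResNet:
    pass


print(registered["ResNet"] is ResNet)  # True
print(registered["ResNet"] is first)   # False
```

The second registration replaces the first only because force=True was remembered by the partial; without it the KeyError branch would be taken.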

The following is a simple registration using MMCV's Registry. Of course, this is not how MMCV itself adds registered modules. For details on how to add modules, see the CSDN blog post mmsegment custom model (8).

if __name__ == "__main__":

    from torch.optim.adam import Adam
    registry_list = Registry("OPTIM")
    registry_list.register_module(name="registry_adam", module=Adam)
    optim = registry_list.get("registry_adam")
    print(optim)
    print(registry_list.module_dict)


Summary

This article summarizes the principles of Python closures, decorators, and the registration mechanism, with code for each. It only covers the basics, however. While reading the reference blogs, I found that there is much more to closures and decorators than what is described here. If you are interested, the reference blogs are a good place to continue. Finally, thanks for reading.


Origin blog.csdn.net/weixin_43610114/article/details/126182474