[Python garbage collection] --2019-08-07 10:12:51

Original: http://106.13.73.98/__/26/

Python's GC module relies mainly on reference counting to track and reclaim garbage. On top of reference counting, it applies mark and sweep to container objects to handle the circular references that can arise, and uses generational collection to make garbage collection more efficient by trading space for time.


Reference counting

In Python, the life cycle of most objects is managed through reference counting. Broadly speaking, reference counting is itself a garbage collection mechanism, and it is also one of the simplest and most intuitive garbage collection techniques.

How reference counting works:

  1. When a reference to an object is created or copied, the object's reference count is incremented by 1;
  2. When a reference to an object is destroyed, the object's reference count is decremented by 1;
  3. When an object's reference count reaches 0, the object is no longer in use and its memory can be freed.
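
The three rules above can be observed with `sys.getrefcount`. This is a minimal sketch that assumes CPython; the class name `Payload` is a hypothetical placeholder, and the absolute numbers vary between interpreter versions, so only the differences are meaningful.

```python
import sys

class Payload:
    """Hypothetical placeholder class used only for this illustration."""

obj = Payload()
base = sys.getrefcount(obj)         # may include a temporary reference held by the call

alias = obj                         # rule 1: a new reference increments the count
after_alias = sys.getrefcount(obj)

del alias                           # rule 2: destroying a reference decrements it
after_del = sys.getrefcount(obj)

print(after_alias - base, after_del - base)
```

The count goes up by one while `alias` exists and returns to its starting value once `alias` is deleted.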

Given a class A, consider the following operations:

    class A:
        def __init__(self):
            print('Initialized')
    
        def __del__(self):
            print('Mission complete, goodbye')
    
    
    a1 = A()
    a2 = a1
    a3 = a1
    
    
    del a1
    print('Deleted a1')
    
    del a2
    print('Deleted a2')
    
    del a3
    print('Deleted a3')
    
    
    print('End of script')
    
    
    """
    Output order:
        Initialized
        Deleted a1
        Deleted a2
        Mission complete, goodbye
        Deleted a3
        End of script
    """

First we create an instance of class A, a1, and then assign it to a2 and a3. All three names now point to the same memory address; put more formally, the reference count of the A instance is 3. Each del statement then removes one reference, and only when the count reaches 0 is the object destroyed. This is Python's reference-counting garbage collection at work.

The biggest advantage of reference counting is that it works in real time:
Although reference counting adds bookkeeping every time memory is allocated or freed, it has one major advantage over other mainstream garbage collection techniques: it is "real time". As soon as nothing references a piece of memory, that memory is reclaimed immediately. Other garbage collection mechanisms have to wait for some special condition (such as a memory allocation failure) before they reclaim unused memory.
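
The "real time" behaviour can be made visible with `weakref.finalize`, which registers a callback without keeping the object alive. This is a sketch that assumes CPython; other Python implementations may reclaim the object later. The class `Node` and the list `events` are hypothetical names for the illustration.

```python
import weakref

class Node:
    """Hypothetical class; plain instances support weak references."""

events = []
node = Node()
# The callback runs when node is reclaimed; finalize holds no strong reference to it.
weakref.finalize(node, events.append, "reclaimed")

events.append("before del")
del node                      # last reference gone: CPython frees the object here
events.append("after del")

print(events)
```

On CPython the "reclaimed" entry lands between the other two, which shows that the object is freed at the `del` statement and not at some later collection.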

Execution efficiency of reference counting:
The extra work of maintaining reference counts is proportional to the number of memory allocations, releases, and reference assignments performed while Python runs. Compared with other mainstream garbage collection techniques such as "mark and sweep" and "stop and copy", this is a weakness, because the extra work in those techniques is essentially tied only to the amount of memory to be reclaimed.

The Achilles' heel of reference counting is circular references:
If execution efficiency were the only weakness of reference counting, that would be tolerable. Unfortunately, reference counting also has a fatal flaw, and because of it reference counting has never been regarded as garbage collection in the narrow sense. That fatal flaw is circular references (also called cross references).

What is a circular reference:

A circular reference leaves a group of objects with nonzero reference counts even though no external object references any of them; they only reference each other. Nobody will ever use this group of objects again, so the memory they occupy should be reclaimed. Yet because of the mutual references, every count stays above zero and the memory is never released.

>>> a = []
>>> b = [a]
>>> a.append(b)
>>> print(a)
[[[...]]]

This is fatal: the resulting memory leak is no different from one produced by manual memory management. To solve the problem, Python introduces two other garbage collection mechanisms to make up for the shortcomings of reference counting: "mark and sweep" and "generational collection".
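
The following sketch shows a cycle surviving `del` and then being reclaimed by the cycle detector. It assumes CPython and uses the standard `gc` and `weakref` modules; the class `Node` and its attribute `other` are hypothetical names chosen for the illustration.

```python
import gc
import weakref

class Node:
    """Hypothetical container-like class for the illustration."""
    def __init__(self):
        self.other = None

gc.disable()                  # keep the automatic collector out of the way
gc.collect()                  # start from a clean state

a, b = Node(), Node()
a.other, b.other = b, a       # a and b now reference each other
probe = weakref.ref(a)        # lets us observe the object without keeping it alive

del a, b                      # no external references remain
alive_after_del = probe() is not None   # the cycle keeps both counts above zero

found = gc.collect()          # the cycle detector reclaims the pair
alive_after_gc = probe() is not None

gc.enable()
print(alive_after_del, found, alive_after_gc)
```

Reference counting alone leaves the pair alive after `del`; only the explicit `gc.collect()` call frees it.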

Mark and sweep

"Mark and sweep" exists to solve the circular reference problem. Only container objects, that is, objects that can hold references to other objects (for example list, set, dict, class, instance), can produce a circular reference.

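Which objects the cycle detector considers can be checked with `gc.is_tracked`. This is a small sketch assuming CPython; the variable names and the class `Holder` are hypothetical, and the exact tracking rules for some built-in containers differ between interpreter versions.

```python
import gc

class Holder:
    """Hypothetical user-defined class; its instances can reference other objects."""

int_tracked = gc.is_tracked(42)              # an int cannot hold references
str_tracked = gc.is_tracked("text")          # neither can a str
list_tracked = gc.is_tracked([])             # a list can hold references
dict_tracked = gc.is_tracked({"k": []})      # a dict that holds a container
instance_tracked = gc.is_tracked(Holder())   # a user-defined instance

print(int_tracked, str_tracked, list_tracked, dict_tracked, instance_tracked)
```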
We have to acknowledge one fact: if two objects each have a reference count of 1 only because of a circular reference between them, both should be reclaimed. Their reference counts look nonzero, but their effective reference counts are 0. So we first have to break the reference cycle to expose the effective counts. Take two objects A and B. Starting from A, which holds a reference to B, we decrement B's count by 1; then we follow the reference to B, which holds a reference to A, and likewise decrement A's count by 1. That removes the cycle between the two objects.

But this creates a problem. Suppose object A holds a reference to object C, and C does not reference A. If C's count is decremented by 1 and A ends up not being reclaimed, then we have wrongly decremented C's count, and at some later moment a dangling reference to C will appear. We would have to restore C's reference count whenever A is not deleted, and with such a scheme the cost of maintaining reference counts would double.

How mark and sweep works:
"Mark and sweep" takes a better approach: instead of changing the real reference counts, it makes a copy of the reference count of every object in the set and modifies only the copies. Changes to a copy have no effect on how the object's life cycle is maintained.
The only purpose of the copied counts is to find the root object set (the objects that cannot be reclaimed). Once the root set has been found, memory is split into two linked lists: one holds the root set and is called the root list, the other holds the remaining objects and is called the unreachable list. The split is made for the following reason: the unreachable list may still contain objects that are referenced, directly or indirectly, by objects in the root list. Such objects cannot be reclaimed, so whenever the marking phase finds one, it moves it from the unreachable list to the root list. When marking is complete, everything left in the unreachable list is genuine garbage, and the subsequent collection only needs to work on the unreachable list.
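
The scheme described above can be modelled in a few lines of plain Python. This is a toy sketch, not CPython's actual implementation: the function `find_unreachable` and its inputs `refcounts` and `edges` are hypothetical, and objects are represented by simple names.

```python
def find_unreachable(refcounts, edges):
    """Toy model of the copied-count scheme.

    refcounts: name -> real reference count (never modified)
    edges:     name -> names of the objects it references
    """
    # Step 1: work on a copy so the real counts stay untouched.
    copied = dict(refcounts)

    # Step 2: subtract references that come from inside the tracked set.
    for src in edges:
        for dst in edges[src]:
            copied[dst] -= 1

    # Step 3: a copied count above zero means an outside reference exists;
    # those objects form the root list, the rest the unreachable list.
    root = [name for name in copied if copied[name] > 0]
    unreachable = set(copied) - set(root)

    # Step 4: anything reachable from a root object is moved back.
    stack = list(root)
    while stack:
        for dst in edges[stack.pop()]:
            if dst in unreachable:
                unreachable.discard(dst)
                stack.append(dst)

    return unreachable

# A and B form an isolated cycle; R is held by a variable and references C.
refcounts = {"A": 1, "B": 1, "R": 1, "C": 1}
edges = {"A": ["B"], "B": ["A"], "R": ["C"], "C": []}
garbage = find_unreachable(refcounts, edges)
print(sorted(garbage))
```

C's copied count drops to zero just like A's and B's, but C is moved back because R reaches it, so only the isolated cycle is reported as garbage and the real counts in `refcounts` are left unchanged.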

Generational collection

Background: generational garbage collection is a technique developed in the early 1980s. A series of studies showed that, regardless of the language, the kind of program, or its scale, programs have one thing in common: a certain proportion of memory blocks have a short life cycle, usually about the time of a few million machine instructions, while the remaining blocks live for a long time, sometimes from the start of the program to its end.

As the earlier discussion of "mark and sweep" shows, the extra work such a garbage collection mechanism brings is tied to the total number of memory blocks in the system. When many blocks need to be reclaimed, garbage detection accounts for more of that extra work and reclamation for less; conversely, when only a few blocks need to be reclaimed, garbage detection costs more than the reclamation itself. To improve the efficiency of garbage collection, Python adopts a "space for time" strategy.

How generational collection works:
All memory blocks in the system are divided into sets according to how long they have survived, and each set is called a "generation". The frequency of garbage collection decreases as a generation's survival time increases. In other words, the longer an object has lived, the less likely it is to be garbage, and the less often it should be examined. How is survival time measured? Usually by the number of garbage collections an object has gone through: the more collections it has survived, the longer it has lived.
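
The generations can be inspected through the standard `gc` module. This is a sketch assuming CPython; the threshold values and counters differ between versions and runs, so no particular output is claimed, and the variable names are hypothetical.

```python
import gc

thresholds = gc.get_threshold()   # one collection threshold per generation
counts = gc.get_count()           # current counters, one per generation
print(thresholds, counts)

gc.collect(0)                     # examine only the youngest generation
gc.collect(2)                     # full collection across all generations

stats = gc.get_stats()            # one statistics dict per generation
print([s["collections"] for s in stats])
```

Collecting generation 0 is the cheap, frequent case; a full collection with `gc.collect(2)` also examines the long-lived objects that have survived earlier passes.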


Origin www.cnblogs.com/gqy02/p/11313566.html