Python garbage collection (GC)

GC algorithm in Python

  • It consists of three parts:
    • Reference counting
    • Mark - Clear (mark-and-sweep)
    • Generational recovery (generational collection)
  • Brief:
    • Python's GC mainly uses reference counting to track and reclaim garbage. On top of reference counting, "mark - clear" solves the circular references that container objects may produce, and generational recovery trades space for time to further improve garbage collection efficiency.
    • Mark - Clear:
      • Mark - clear breaks circular references; it only concerns itself with objects that might produce a circular reference.
      • Cons: the extra work this mechanism causes is proportional to the number of memory blocks that need to be reclaimed.
    • Generational recovery:
      • All memory blocks in the system are divided into different sets according to their survival time, and each set is called a "generation". The frequency of garbage collection decreases as a generation's survival time increases: the longer an object has lived, the less likely it is to be garbage, so it should be collected less often.
      • How is survival time measured? Usually by the number of garbage collections: the more collections an object has survived, the longer its survival time.
  • Each of the three recovery mechanisms is explained in detail below:

    • Reference count (primary)

      • Everything in Python is an object. At the core of every Python object is a PyObject structure, which contains a reference counter (ob_refcnt).

      • As the name suggests, this means counting references. When an object is created, whatever created it holds a reference to it, so its reference count is 1. If it is referenced from somewhere else (e.g. b = a, or it is passed to a function or placed in a list), its reference count increases by 1. If a reference to it is deleted (e.g. del b after the assignment above), its reference count decreases by 1. Once the reference count reaches 0, the garbage collection mechanism reclaims the object.
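      • A minimal sketch of the counting described above, assuming CPython and using sys.getrefcount. The reported number includes a temporary reference held by the call itself, so only the differences between readings are meaningful. The names holder, base and after_* are made up for this illustration.

```python
import sys

a = [1, 2]                        # a new list object, referenced by the name a
base = sys.getrefcount(a)         # includes the temporary reference made by the call

b = a                             # a second name for the same object
after_bind = sys.getrefcount(a)   # one more than base

holder = [a]                      # storing the object in a container adds a reference
after_store = sys.getrefcount(a)  # two more than base

del b                             # deleting a reference lowers the count again
after_del = sys.getrefcount(a)    # one more than base (holder still refers to it)

print(after_bind - base, after_store - base, after_del - base)
```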

      • Advantages:

        • Simple, real-time
      • Disadvantages:

        • High maintenance cost (it is simple and real-time, but keeping the counts up to date takes extra resources; the logic is simple but tedious)

        • Does not solve one problem: circular references

          a=[1,2]
          b=[2,3]
          a.append(b)
          b.append(a)
          del a
          del b
        • To be honest, this feels a bit like a deadlock. The problem can arise in container structures such as list, dict and object. In the code above, a and b each hold one reference to the other, so after the names a and b are deleted each count only drops by 1 and stays at 1; it never reaches 0. Reference counting cannot resolve this situation, which brings us to the next topic: mark - clear
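        • The stuck cycle can be observed directly. This is a sketch under a few assumptions: it runs on CPython, it uses a hypothetical Node class instead of lists (lists cannot be weakly referenced), and it switches the automatic collector off so that only the explicit gc.collect() call can free the pair.

```python
import gc
import weakref

class Node:
    """Hypothetical container class, used only for this illustration."""
    def __init__(self):
        self.other = None

gc.disable()   # keep the automatic cycle collector out of the way
gc.collect()   # start from a clean state

a = Node()
b = Node()
a.other = b              # a references b
b.other = a              # b references a: a reference cycle
probe = weakref.ref(a)   # a weak reference does not raise the count

del a
del b
alive_after_del = probe() is not None   # True: each count is stuck at 1

gc.collect()                            # run the cycle collector by hand
alive_after_gc = probe() is not None    # False: the pair has been reclaimed

gc.enable()
print(alive_after_del, alive_after_gc)
```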

      • Mark - Clear

        • Mark - clear is used to solve the problem of circular references. Only container objects can form reference cycles, such as lists, classes, dictionaries and tuples. To track container objects, each container object maintains two additional pointers. These link the container objects into a doubly linked list, with one pointer to the previous container and one to the next, which makes insertion and deletion easy.
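        • Which objects the collector tracks can be checked with gc.is_tracked. The sketch below assumes CPython; the sample values and the names samples and tracked are chosen only for illustration.

```python
import gc

samples = {
    "int": 42,
    "str": "text",
    "list": [1, 2],
    "dict holding a list": {"k": []},
}

# Atomic objects cannot reference other objects, so they are not tracked;
# containers that may take part in a cycle are.
tracked = {label: gc.is_tracked(obj) for label, obj in samples.items()}

for label, flag in tracked.items():
    print(label, flag)
```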

        • For example, consider the following two cases

          A:
              a = [1,3]
              b = [2,4]
              a.append(b)
              b.append(a)
              del a
              del b
          
          
          B:
              a = [1,3]
              b = [2,4]
              a.append(b)
              b.append(a)
              del a
        • OK, with that settled: the mark - clear algorithm works with two lists, the root list and the unreachable list.
          • For scenario A: before the del statements run, a and b each have a reference count of 2 (init + append = 2). After the del statements, each count drops by 1, and a and b are left in a reference cycle. This is where the mark - clear algorithm steps in. It finds one end of the cycle, a, and starts to take the a, b ring apart (the collector works on a copy of the counts, not on ob_refcnt itself). Starting from a: a holds a reference to b, so b's count is reduced by 1. Following that reference to b: b holds a reference to a, so a's count is likewise reduced by 1. This removes the references inside the ring. With the ring removed, the counts of both a and b are 0, so a and b are moved to the unreachable list and disposed of.
          • For scenario B: after del a, a's reference count is 1 and b's is 2. With the ring removed, b's count is still 1 but a's is 0, so a first enters the unreachable list and looks condemned. However, b is in the root list, and when the references held by objects in the root list are followed, a is found to be referenced by b. If a were reclaimed, b would be broken too, so a is pulled back into the root list.
        • Why two lists?
          • The split into two lists is based on this consideration: the unreachable list may at first contain objects that are referenced, directly or indirectly, by objects in the root list, and such objects must not be reclaimed. Whenever the marking pass finds one, it moves it from the unreachable list to the root list. Once marking is complete, whatever is left in the unreachable list is genuine garbage, and the reclaiming step only needs to work through the unreachable list.
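        • The two-list procedure can be sketched on a toy model. This is not CPython's implementation: the function find_garbage, its dictionary inputs and the names "a" and "b" are hypothetical, and objects are represented only by a count and a list of the containers they reference.

```python
def find_garbage(refcounts, edges):
    """Toy model of mark - clear.

    refcounts: {name: total reference count}
    edges:     {name: [names of tracked containers it references]}
    Returns (root, unreachable) as sorted lists of names.
    """
    # Work on a copy so the real counts stay untouched.
    work = dict(refcounts)

    # Remove every reference that comes from another tracked container.
    for src, targets in edges.items():
        for dst in targets:
            work[dst] -= 1

    # Objects still above zero are referenced from outside: the root list.
    root = [name for name in work if work[name] > 0]
    unreachable = {name for name in work if work[name] <= 0}

    # Anything reachable from the root list is pulled back out of unreachable.
    pending = list(root)
    while pending:
        src = pending.pop()
        for dst in edges.get(src, []):
            if dst in unreachable:
                unreachable.remove(dst)
                root.append(dst)
                pending.append(dst)

    return sorted(root), sorted(unreachable)

ring = {"a": ["b"], "b": ["a"]}

# Scenario A: both names deleted, each object is referenced only by the other.
scenario_a = find_garbage({"a": 1, "b": 1}, ring)
# Scenario B: only `del a`; the name b still references the second list.
scenario_b = find_garbage({"a": 1, "b": 2}, ring)

print(scenario_a)  # ([], ['a', 'b'])
print(scenario_b)  # (['a', 'b'], [])
```

        • In scenario B the object a starts out in the unreachable set with a working count of 0 and is rescued only during the traversal from b, which matches the description above.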
      • Generational recovery

        • To understand generational recovery, first look at the GC threshold. A threshold is a critical value. While the program runs, the Python interpreter keeps track of newly created objects and of objects released because their reference count dropped to zero. In theory, the number created equals the number released. With reference cycles, however, the number created will exceed the number released, and when the difference between the two reaches the configured threshold, the generational recovery mechanism comes into play.
        • Generational recovery divides objects into three generations (generation 0, 1, 2): generation 0 holds newborn objects, generation 1 holds young objects, and generation 2 holds old objects. Following the weak generational hypothesis (the younger an object is, the more likely it is to die; old objects usually live longer), new objects are placed in generation 0. An object that survives a generation 0 collection is moved to generation 1, and an object that survives a generation 1 collection is moved to generation 2.
        • The thresholds are set with gc.set_threshold(threshold0[, threshold1[, threshold2]]), one per generation:
          • Generation 0: if, since the last generation 0 collection, the number of allocations minus the number of deallocations exceeds threshold0, the objects in generation 0 are checked.
          • Generation 1: if, since the last generation 1 collection, the number of generation 0 collections exceeds threshold1, the objects in generation 1 are checked.
          • Generation 2: if, since the last generation 2 collection, the number of generation 1 collections exceeds threshold2, the objects in generation 2 are checked.
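        • The thresholds and counters can be inspected from the gc module. A small sketch, assuming CPython; the values 1000, 15, 15 are arbitrary examples, and the exact defaults and the handling of the third value differ between Python versions, so no particular output is claimed here.

```python
import gc

original = gc.get_threshold()    # (threshold0, threshold1, threshold2)
print(original)
print(gc.get_count())            # current counters for generations 0, 1, 2

gc.set_threshold(1000, 15, 15)   # arbitrary example values
changed = gc.get_threshold()
print(changed)

collected = gc.collect(0)        # force a collection of generation 0 only
print(collected)                 # number of unreachable objects found

gc.set_threshold(*original)      # restore the previous settings
```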

Origin www.cnblogs.com/Yongzyw/p/11520483.html