Operation and Maintenance Series (10) -- Detailed Explanation of Python Garbage Collection Mechanism (GC)

A related article covers the garbage collection mechanism of Java: The garbage collection mechanism of Java technology

1. Garbage collection mechanism

Garbage collection in Python is based mainly on reference counting, supplemented by generational collection. The weakness of reference counting is circular references.
In Python, when an object's reference count drops to 0, the Python virtual machine reclaims the object's memory.

# encoding=utf-8
import gc
import time

class ClassA():
    def __init__(self):
        print 'object born,id:%s' % str(hex(id(self)))

    def __del__(self):
        print 'object del,id:%s' % str(hex(id(self)))


def f1():
    while True:
        c1 = ClassA()
        del c1

f1()

Executing f1() prints the following two lines over and over, and the memory used by the process stays essentially constant:

object born,id:0x237cf58
object del,id:0x237cf58

c1=ClassA() creates an object and places it at memory address 0x237cf58. The variable c1 points to this memory, so at this point its reference count is 1.
After del c1, the variable c1 no longer points to 0x237cf58, so the reference count drops by one to 0. The object is destroyed and its memory is released.
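
The same behaviour can be checked without watching process memory. The following is a minimal sketch, assuming CPython 3 (where print is a function); the class name Tracked and the events list are illustrative names, not part of the original code.

```python
events = []  # records lifecycle events so they can be inspected afterwards


class Tracked(object):
    def __init__(self):
        events.append('born')

    def __del__(self):
        # In CPython this runs as soon as the reference count reaches 0
        events.append('del')


obj = Tracked()  # the name obj holds the only reference
del obj          # the count drops to 0, so __del__ runs right away
print(events)
```

The immediate call to __del__ is a property of CPython's reference counting; other Python implementations may finalize the object later.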

Situations that increase the reference count by 1

  • An object is created, e.g. a=23
  • An object is referenced, e.g. b=a
  • An object is passed as an argument to a function, e.g. func(a)
  • An object is stored as an element in a container, e.g. list1=[a,a]

Situations that decrease the reference count by 1

  • An alias of the object is explicitly destroyed, e.g. del a
  • An alias of the object is rebound to a new object, e.g. a=24
  • An object leaves its scope, e.g. when the func function finishes executing, its local variables are released (global variables are not)
  • The container holding the object is destroyed, or the object is removed from the container

import sys
def func(c, d):
    print 'in func function', sys.getrefcount(c) - 1


print 'init', sys.getrefcount(11) - 1
a = 11
print 'after a=11', sys.getrefcount(11) - 1
b = a
print 'after b=a', sys.getrefcount(11) - 1
func(11, 11)
print 'after func(11, 11)', sys.getrefcount(11) - 1
list1 = [a, 12, 14]
print 'after list1=[a,12,14]', sys.getrefcount(11) - 1
a = 12
print 'after a=12', sys.getrefcount(11) - 1
del a
print 'after del a', sys.getrefcount(11) - 1
del b
print 'after del b', sys.getrefcount(11) - 1
# list1.pop(0)
# print 'after pop list1',sys.getrefcount(11)-1
del list1
print 'after del list1', sys.getrefcount(11) - 1

View the reference count of an object
sys.getrefcount(a) returns the reference count of the object a. The result is usually 1 higher than the expected count, because passing a as an argument to the function temporarily adds one reference.
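
Small integers such as 11 are cached and shared by the interpreter, so their counts are hard to interpret. The following is a minimal sketch that measures a freshly created object instead, assuming CPython 3; the names Item, alias and holder are illustrative. It compares differences between counts rather than absolute values, which sidesteps the extra temporary reference.

```python
import sys


class Item(object):
    pass


obj = Item()
base = sys.getrefcount(obj)         # includes the temporary argument reference

alias = obj                         # a second name for the object: +1
after_alias = sys.getrefcount(obj)

holder = [obj, obj]                 # stored twice in a list: +2
after_list = sys.getrefcount(obj)

del alias                           # alias destroyed: -1
del holder                          # container destroyed: -2
after_del = sys.getrefcount(obj)

print(after_alias - base, after_list - after_alias, after_del - base)
```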

2. Circular references cause memory leaks

def f2():
    while True:
        c1=ClassA()
        c2=ClassA()
        c1.t=c2
        c2.t=c1
        del c1
        del c2
        print '111111111111111'
        print gc.garbage
        print '222222222222222'
        print gc.collect()  # explicitly run garbage collection
        print '333333333333333'
        print gc.garbage

f2()

When f2() is executed, the memory occupied by the process keeps increasing.
After c1 and c2 are created, the objects at 0x237cf30 (the memory for c1, called memory 1) and 0x237cf58 (the memory for c2, called memory 2) each have a reference count of 1. After c1.t=c2 and c2.t=c1 are executed, both reference counts become 2.
After del c1, the reference count of the object in memory 1 drops to 1. Since it is not 0, the object in memory 1 is not destroyed, so the reference count of the object in memory 2 is still 2. Likewise, after del c2, the objects in memory 1 and memory 2 both have a reference count of 1.
Both objects are now unreachable and could be destroyed, but because of the circular reference their counts never reach 0, so reference counting cannot reclaim them. In Python 2, which these examples use, the cycle collector also leaves them in place because ClassA defines __del__, so memory leaks.
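
For comparison, here is a minimal sketch of how a cycle of objects without __del__ behaves, assuming CPython 3. Node and make_cycle are illustrative names. Reference counting alone leaves the pair alive, and an explicit gc.collect() reclaims it.

```python
import gc
import weakref


class Node(object):
    pass


def make_cycle():
    a = Node()
    b = Node()
    a.t = b
    b.t = a
    # A weak reference lets us observe a without keeping it alive
    return weakref.ref(a)


gc.disable()                         # keep automatic collection out of the way
probe = make_cycle()                 # a and b are now reachable only from each other
alive_before = probe() is not None   # reference counting did not free them
found = gc.collect()                 # the cycle collector finds the pair
alive_after = probe() is not None
gc.enable()

print(alive_before, alive_after, found)
```

The weak reference is only an observation tool here; it does not change when the objects are freed.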

3. Garbage collection

def f3():
    # print gc.collect()
    c1=ClassA()
    c2=ClassA()
    c1.t=c2
    c2.t=c1
    del c1
    del c2
    print '11'*50
    print gc.garbage
    print '22'*50
    print gc.collect()  # explicitly run garbage collection
    print '33'*50
    print gc.garbage
    # time.sleep(10)

if __name__ == '__main__':
    gc.set_debug(gc.DEBUG_LEAK)  # enable debug logging for the gc module
    f3()

output:

object born,id:0x6b1af08L
gc: uncollectable <ClassA instance at 0000000006B1AF08>
object born,id:0x6c4e048L
1111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111
gc: uncollectable <ClassA instance at 0000000006C4E048>
[]
gc: uncollectable <dict 0000000006C48488>
2222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222
gc: uncollectable <dict 0000000006C48268>
4
3333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333
[<__main__.ClassA instance at 0x0000000006B1AF08>, <__main__.ClassA instance at 0x0000000006C4E048>, {'t': <__main__.ClassA instance at 0x0000000006C4E048>}, {'t': <__main__.ClassA instance at 0x0000000006B1AF08>}]

  • Because gc.DEBUG_LEAK includes gc.DEBUG_SAVEALL, the unreachable objects found by the collector are saved in the gc.garbage list instead of being freed
  • gc.collect() returns the number of unreachable objects found; here 4 is the two instances plus their two attribute dicts
  • There are three situations that trigger garbage collection:

    1. gc.collect() is called
    2. The gc module's counter reaches its threshold
    3. The program exits
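
The effect of saving unreachable objects can be reproduced in isolation. The following is a minimal sketch, assuming CPython 3 and using only the gc.DEBUG_SAVEALL flag (no log output); Node and make_cycle are illustrative names.

```python
import gc


class Node(object):
    pass


def make_cycle():
    a = Node()
    b = Node()
    a.t = b
    b.t = a


gc.disable()
gc.collect()                      # start from a clean state
gc.set_debug(gc.DEBUG_SAVEALL)    # save unreachable objects instead of freeing them
make_cycle()
found = gc.collect()
saved_count = len([o for o in gc.garbage if isinstance(o, Node)])
print(found, saved_count)

gc.set_debug(0)                   # restore normal behaviour
del gc.garbage[:]                 # drop the saved references
gc.enable()
```

Clearing gc.garbage afterwards matters: as long as the list holds the saved objects, they stay alive.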
    

4. Analysis of common functions of the gc module

Automatic garbage collection in the gc module
The gc module must be imported to control the collector. Automatic garbage collection runs when gc.isenabled() returns True.
The main job of this mechanism is to find and dispose of unreachable garbage objects.
Garbage collection = garbage checking + garbage reclamation
Python uses generational collection and divides objects into three generations. A newly created object is placed in the first generation. If it survives a first-generation garbage check, it is moved to the second generation. Likewise, an object that survives a second-generation garbage check is moved to the third generation.
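
A minimal sketch of the switches involved, assuming CPython 3. Note that the gc API numbers the generations 0, 1 and 2, so the first generation in the text corresponds to generation 0.

```python
import gc

was_enabled = gc.isenabled()          # automatic collection is on by default

gc.disable()
state_after_disable = gc.isenabled()  # False: only explicit collections run now
gc.enable()
state_after_enable = gc.isenabled()   # True again

# gc.collect(n) checks generation n together with all younger generations
results = [gc.collect(generation) for generation in (0, 1, 2)]
print(state_after_disable, state_after_enable, results)

if not was_enabled:
    gc.disable()                      # leave the collector as we found it
```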

The gc module keeps a counter of length 3, which can be read with gc.get_count().
For example, in (488, 3, 0), 488 is the number of object allocations minus the number of deallocations since the last first-generation garbage check. Note that this counts memory allocations, not reference count increases. For example:

print gc.get_count()  # (590, 8, 0)
a = ClassA()
print gc.get_count()  # (591, 8, 0)
del a
print gc.get_count()  # (590, 8, 0)

3 is the number of first-generation garbage checks since the last second-generation garbage check. Similarly, 0 is the number of second-generation garbage checks since the last third-generation garbage check.
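
The first counter can be watched moving in a short experiment. This is a minimal sketch, assuming CPython 3; the exact numbers vary between versions and runs, so only the direction of the change is meaningful.

```python
import gc

gc.disable()                      # keep a collection from resetting the counter
before = gc.get_count()

# Each new list is a container object tracked by the collector
containers = [[] for _ in range(1000)]

after = gc.get_count()
gc.enable()

print(before, after)
```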

The gc module has automatic garbage collection thresholds, a tuple of length 3 returned by gc.get_threshold(), for example (700, 10, 10).
Each time the counter increases, the gc module checks whether the count has reached the threshold. If so, it runs a garbage check on the corresponding generation and resets the counter.
For example, assuming the thresholds are (700, 10, 10):

  • When the counter goes from (699, 3, 0) to (700, 3, 0), the gc module runs gc.collect(0), that is, it checks the first-generation objects, and resets the counter to (0, 4, 0)
  • When the counter goes from (699, 9, 0) to (700, 9, 0), the gc module runs gc.collect(1), that is, it checks the first- and second-generation objects, and resets the counter to (0, 0, 1)
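
The two cases above can be replayed with a toy model. This is a simplified sketch of the bookkeeping as this article describes it, not CPython's actual implementation, whose details differ and change between versions. GenerationCounter and allocate are hypothetical names used only for illustration.

```python
class GenerationCounter(object):
    """Toy model of the counter described above; not CPython's real code."""

    def __init__(self, counts, thresholds=(700, 10, 10)):
        self.counts = list(counts)
        self.thresholds = thresholds

    def allocate(self):
        """Record one allocation; return the generation checked, or None."""
        self.counts[0] += 1
        if self.counts[0] < self.thresholds[0]:
            return None
        # A check bumps the next counter; if that counter would reach its
        # own threshold, escalate to the older generation instead.
        generation = 0
        while (generation < 2 and
               self.counts[generation + 1] + 1 >= self.thresholds[generation + 1]):
            generation += 1
        for i in range(generation + 1):
            self.counts[i] = 0        # reset the checked generations
        if generation < 2:
            self.counts[generation + 1] += 1
        return generation


first = GenerationCounter([699, 3, 0])
first_result = first.allocate()
print(first_result, first.counts)     # 0 [0, 4, 0]

second = GenerationCounter([699, 9, 0])
second_result = second.allocate()
print(second_result, second.counts)   # 1 [0, 0, 1]
```

In a real interpreter the thresholds can be read with gc.get_threshold() and changed with gc.set_threshold().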

For a clearer understanding, there is also an article comparing the Python and Ruby garbage collection mechanisms: Python garbage collection mechanism - perfect explanation!
