PHP's garbage collection mechanism (a detailed explanation)

Concept:

PHP's garbage collection is automatic and built into the engine. When a value is no longer referenced by anything, it becomes garbage, and the memory it occupies is freed so that it can be reused. Most garbage is freed immediately through reference counting; a separate cycle collector (added in PHP 5.3.0) handles values that only reference each other.

PHP tracks references with a reference counting algorithm. Each value has a reference counter that records how many symbols currently refer to it. Assigning the value to another variable increments the counter by 1; when a variable stops referring to the value, the counter is decremented by 1. When the counter drops to 0, the value is garbage and its memory is released right away.

Programmers therefore do not need to manage memory manually. Circular references are the exception: reference counting alone can never free them, because each member of the cycle keeps the counters of the others above 0. Such cycles stay in memory until the cycle collector runs (or, before PHP 5.3.0, until the script ends). To release them sooner, break the cycle by hand, for example by assigning null to one of the references that forms it, so that the counters can drop to 0.
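The difference between ordinary garbage and a cycle can be observed with a destructor. The following is a minimal sketch, not a description of the engine internals: the Node class, its $peer property and the static counter are hypothetical names introduced only for this illustration, and it assumes PHP 5.3.0 or later so that gc_collect_cycles() is available.

```php
<?php
// Hypothetical class used only to observe when an object is freed.
class Node
{
    public $peer = null;
    public static $destroyed = 0;

    public function __destruct()
    {
        self::$destroyed++;
    }
}

gc_enable();

// 1. No cycle: the object is freed as soon as its last reference goes away.
$single = new Node();
$single = null;
$afterSingle = Node::$destroyed;

// 2. Cycle: each object keeps the other one's counter above 0.
$x = new Node();
$y = new Node();
$x->peer = $y;
$y->peer = $x;
unset($x, $y);
$afterCycleUnset = Node::$destroyed;   // unchanged, the cycle is still in memory

gc_collect_cycles();                   // force the cycle collector to run
$afterCollect = Node::$destroyed;

// 3. Breaking the cycle by hand lets plain reference counting free both objects.
$p = new Node();
$q = new Node();
$p->peer = $q;
$q->peer = $p;
$p->peer = null;                       // the cycle is broken here
unset($p, $q);
$afterManualBreak = Node::$destroyed;

echo "$afterSingle $afterCycleUnset $afterCollect $afterManualBreak\n";
```

In case 3 no call to gc_collect_cycles() is needed: once the cycle is broken, the counters reach 0 on their own.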

1. Basic knowledge of reference counting

  • Each PHP variable is stored in a variable container called a zval.
  • Besides the type and value of the variable, a zval container holds two additional pieces of information.
  • The first is is_ref, a boolean that indicates whether the variable is part of a reference set. With it, the PHP engine can tell ordinary variables from reference variables. Because PHP lets users create their own references with the & operator, the zval container also has an internal reference counting mechanism to optimize memory usage.
  • The second is refcount, the number of variable names (symbols) that point to this zval container.
  • All symbols live in a symbol table, and each symbol table belongs to a scope. The main script (for example, the script requested by the browser) has a scope, and so does every function or method.

2. Generate zval container

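  • When a variable is assigned a constant value, a zval variable container is created.
  • If Xdebug is installed, you can inspect the two values (refcount and is_ref) with xdebug_debug_zval(). The outputs shown in this article follow the classic zval model described above; newer PHP versions handle strings and integers differently and may report other values.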
<?php
$a = "new string";
xdebug_debug_zval('a');

// Output:
// a: (refcount=1, is_ref=0)='new string'

3. Increase the reference count of zval

Assigning a variable to another variable will increase the reference count


<?php
$a = "new string";
$b = $a;
xdebug_debug_zval( 'a' );

// Output:
// a: (refcount=2, is_ref=0)='new string'
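Sharing one container between $a and $b is what saves memory: nothing is copied until one of the variables is written to. The sketch below tries to make that visible without Xdebug by measuring memory with memory_get_usage(). The function name demoSharedContainer and the array size are arbitrary choices for this illustration, and the exact byte counts depend on the PHP version, so only the rough magnitudes matter.

```php
<?php
// Returns the extra memory used after an assignment and after the first write.
function demoSharedContainer(): array
{
    $a = range(1, 100000);                 // one large array
    $base = memory_get_usage();

    $b = $a;                               // $b points to the same container
    $afterAssign = memory_get_usage() - $base;

    $b[] = 0;                              // first write: the array is copied now
    $afterWrite = memory_get_usage() - $base;

    return array($afterAssign, $afterWrite);
}

list($afterAssign, $afterWrite) = demoSharedContainer();

echo "extra bytes after assignment: $afterAssign\n";
echo "extra bytes after write:      $afterWrite\n";
```

The assignment should cost close to nothing, while the write should cost roughly the size of the whole array.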

4. Reduce the reference count of zval

  • Use unset() to reduce the reference count.
  • When the reference count reaches 0, the variable container holding the type and value is removed from memory.
<?php
$a = "new string";
$c = $b = $a;
xdebug_debug_zval( 'a' );
unset( $b, $c );
xdebug_debug_zval( 'a' );

// Output:
// a: (refcount=3, is_ref=0)='new string'
// a: (refcount=1, is_ref=0)='new string'
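The same behaviour can be checked without Xdebug by using an object instead of a string, because a destructor runs at the moment the last reference disappears. This is only a sketch; the Tracker class and its static $freed flag are hypothetical names introduced for the illustration.

```php
<?php
// Hypothetical helper class: the destructor records when the object is freed.
class Tracker
{
    public static $freed = false;

    public function __destruct()
    {
        self::$freed = true;
    }
}

$a = new Tracker();
$c = $b = $a;                 // three symbols now refer to the same object

unset($b, $c);                // two references removed, $a still holds one
$freedAfterPartialUnset = Tracker::$freed;

unset($a);                    // last reference removed, the object is freed
$freedAfterFinalUnset = Tracker::$freed;

var_dump($freedAfterPartialUnset, $freedAfterFinalUnset);
```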

5. Composite type zval container

  • Arrays and objects behave differently from scalar values.
  • Variables of type array and object store their members or properties in a symbol table of their own.
  • This means that the following example creates three zval variable containers.
  • The three zval variable containers are: a, meaning and number.


<?php
$a = array( 'meaning' => 'life', 'number' => 42 );
xdebug_debug_zval( 'a' );

// Output:
// a: (refcount=1, is_ref=0)=array (
//    'meaning' => (refcount=1, is_ref=0)='life',
//    'number' => (refcount=1, is_ref=0)=42
// )

6. Increase the reference count of composite types

Adding an already existing element to the array increases the reference count of the container that element points to.


<?php
$a = array( 'meaning' => 'life', 'number' => 42 );
$a['life'] = $a['meaning'];
xdebug_debug_zval( 'a' );

// Output:
// a: (refcount=1, is_ref=0)=array (
//    'meaning' => (refcount=2, is_ref=0)='life',
//    'number' => (refcount=1, is_ref=0)=42,
//    'life' => (refcount=2, is_ref=0)='life'
// )

7. Reduce the reference count of composite types

  • Removing an element from the array reduces the reference count.
  • It is similar to removing a variable from a scope.
  • After the removal, the "refcount" of the container that the array element pointed to is decremented.

<?php
$a = array( 'meaning' => 'life', 'number' => 42 );
$a['life'] = $a['meaning'];
unset( $a['meaning'], $a['number'] );
xdebug_debug_zval( 'a' );

// Output:
// a: (refcount=1, is_ref=0)=array (
//    'life' => (refcount=1, is_ref=0)='life'
// )
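To watch array elements hold and release a container without Xdebug, one option is a WeakReference, which can see an object without counting as a reference to it. The sketch below assumes PHP 7.4 or later (where WeakReference exists) and stores an object rather than the string 'life', since only an object can be observed this way; the variable names are illustrative.

```php
<?php
$value = new stdClass();
$probe = WeakReference::create($value);   // observes the object, holds no reference

$a = array('meaning' => $value, 'number' => 42);
$a['life'] = $a['meaning'];               // a second element pointing to the same object
unset($value);                            // only the two array elements hold it now

unset($a['meaning'], $a['number']);
$aliveWhileLifeRemains = $probe->get() !== null;   // 'life' still holds the object

unset($a['life']);
$aliveAfterLastElement = $probe->get() !== null;   // no holder left, object freed

var_dump($aliveWhileLifeRemains, $aliveAfterLastElement);
```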

8. Special circumstances

Things get interesting when we add an array to itself as one of its own elements.

As before, calling unset() on a variable removes the symbol and decrements the reference count of the variable container it points to by 1. If unset($a) is called after the code below, the refcount of the container drops from 2 to 1.


<?php
$a = array( 'one' );
$a[] = &$a;
xdebug_debug_zval( 'a' );

// Output:
// a: (refcount=2, is_ref=1)=array (
//    0 => (refcount=1, is_ref=0)='one',
//    1 => (refcount=2, is_ref=1)=...
// )

9. The problem of cleaning up variable containers

After unset($a), no symbol in any scope points to this structure (the variable container) any more, but array element "1" still points to the array itself, so the container cannot be cleaned up. Because no other symbol points to it, the user has no way to clean up the structure, and the result is a memory leak. Fortunately, PHP cleans up this data structure at the end of the script, but until then it keeps occupying memory. That is fine if it happens only once or twice, but thousands or even hundreds of thousands of such leaks are obviously a big problem.
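The leak can be made visible by creating many self-referencing arrays inside a function and comparing memory before and after a forced collection. This is a sketch under a few assumptions: the function name makeSelfReferencingArray, the 1 KB padding string and the loop count are arbitrary choices for the illustration, and PHP 5.3.0 or later is needed for gc_collect_cycles().

```php
<?php
// Builds an array that contains a reference to itself, then returns.
// When the function ends, the symbol $a disappears but the cycle remains.
function makeSelfReferencingArray(): void
{
    $a = array('one', str_repeat('x', 1024));   // padding makes the leak measurable
    $a[] = &$a;
}

gc_enable();
gc_collect_cycles();                 // start from a clean state
$before = memory_get_usage();

for ($i = 0; $i < 1000; $i++) {
    makeSelfReferencingArray();
}
$leaked = memory_get_usage() - $before;

$collected = gc_collect_cycles();    // let the cycle collector reclaim the arrays
$afterCollect = memory_get_usage() - $before;

echo "leaked bytes before collection: $leaked\n";
echo "cycles collected:               $collected\n";
echo "bytes still held afterwards:    $afterCollect\n";
```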

10. Collecting cycles

The plain reference counting mechanism used by earlier versions of PHP cannot deal with memory leaks caused by circular references.

Since PHP 5.3.0, a synchronous cycle collection algorithm is used to deal with this kind of leak. It rests on a few basic rules.

If a reference count is incremented, the container is still in use and therefore is not garbage.

If a reference count is decremented to zero, the variable container is freed.

That is, a garbage cycle can only come into existence when a reference count is decremented to a non-zero value.

Within a garbage cycle, the garbage can be found by checking whether each reference count can be decremented by 1, and then checking which variable containers end up with a reference count of zero.
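On PHP 7.3 or later, gc_status() reports how often the cycle collector has run, which makes it possible to watch it start on its own. The sketch below creates many small cycles without ever calling gc_collect_cycles(); the function name makeGarbageCycle, the $peer property and the loop count are illustrative assumptions, with the count chosen to be large enough to fill the root buffer several times under default settings.

```php
<?php
// Creates two objects that reference each other and then drops both symbols.
// Each symbol removal decrements a counter to a non-zero value, which is
// exactly the situation in which a garbage cycle can arise.
function makeGarbageCycle(): void
{
    $x = new stdClass();
    $y = new stdClass();
    $x->peer = $y;
    $y->peer = $x;
}

gc_enable();
$runsBefore = gc_status()['runs'];

for ($i = 0; $i < 20000; $i++) {
    makeGarbageCycle();
}

$status = gc_status();
$automaticRuns = $status['runs'] - $runsBefore;

echo "automatic collector runs: $automaticRuns\n";
echo "values collected so far:  {$status['collected']}\n";
```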

11. Analysis of Recycling Algorithm

  • Running the garbage cycle check every time a reference count is decremented would be too expensive. Instead, the algorithm puts every possible root (a possible root is a zval variable container) into the root buffer (marked purple, the "suspected garbage"), making sure that each possible root appears in the buffer only once. Only when the root buffer is full does garbage collection run over all the variable containers in it. This is step A in the picture above.
  • In step B, the algorithm simulates deleting each purple variable. The simulated deletion subtracts 1 from the reference count of the variables that the purple variable points to, including ordinary variables that are not purple, and then continues from those variables in the same way. Each variable is processed only once and is marked gray afterwards.
  • In step C, the algorithm simulates restoring each purple variable. Restoration is conditional: only a variable whose reference count is still greater than 0 is restored, which adds back the counts that step B subtracted from the variables it points to. Each variable is restored only once and is marked black afterwards; this is essentially the inverse of step B. The nodes that cannot be restored (marked blue) are the ones that should be deleted. In step D the algorithm traverses them and actually frees them.
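Steps B to D can be traced on a toy graph. The following is a simplified model written for this article, not the engine's real code: the node names, the array layout ('rc', 'color', 'children') and the function names are all assumptions, and the colour 'white' stands for the nodes the text above calls blue. Nodes a and b form a cycle that nothing else references; c and d form a cycle that one live symbol still points to.

```php
<?php
// Step B: trial deletion. Visit each node once, subtract 1 per outgoing edge.
function markGray(array &$g, string $id): void
{
    if ($g[$id]['color'] === 'gray') {
        return;
    }
    $g[$id]['color'] = 'gray';
    foreach ($g[$id]['children'] as $child) {
        $g[$child]['rc']--;
        markGray($g, $child);
    }
}

// Step C (restore branch): undo the trial deletion from a node that is still in use.
function scanBlack(array &$g, string $id): void
{
    $g[$id]['color'] = 'black';
    foreach ($g[$id]['children'] as $child) {
        $g[$child]['rc']++;
        if ($g[$child]['color'] !== 'black') {
            scanBlack($g, $child);
        }
    }
}

// Step C: a count above 0 means "restore", a count of 0 means "garbage".
function scan(array &$g, string $id): void
{
    if ($g[$id]['color'] !== 'gray') {
        return;
    }
    if ($g[$id]['rc'] > 0) {
        scanBlack($g, $id);
        return;
    }
    $g[$id]['color'] = 'white';
    foreach ($g[$id]['children'] as $child) {
        scan($g, $child);
    }
}

// Step D: record every node that was left white as freed.
function collectWhite(array &$g, string $id, array &$freed): void
{
    if ($g[$id]['color'] !== 'white') {
        return;
    }
    $g[$id]['color'] = 'black';   // prevents visiting the node twice
    foreach ($g[$id]['children'] as $child) {
        collectWhite($g, $child, $freed);
    }
    $freed[] = $id;
}

$graph = array(
    'a' => array('rc' => 1, 'color' => 'purple', 'children' => array('b')),
    'b' => array('rc' => 1, 'color' => 'black',  'children' => array('a')),
    'c' => array('rc' => 2, 'color' => 'purple', 'children' => array('d')), // d + one live symbol
    'd' => array('rc' => 1, 'color' => 'black',  'children' => array('c')),
);
$roots = array('a', 'c');         // step A: the contents of the root buffer

foreach ($roots as $root) { markGray($graph, $root); }
foreach ($roots as $root) { scan($graph, $root); }
$freed = array();
foreach ($roots as $root) { collectWhite($graph, $root, $freed); }
sort($freed);

echo 'freed: ' . implode(', ', $freed) . "\n";
```

Only a and b end up freed; c and d get their original counts back because the live symbol keeps the count of c above 0 during the trial deletion.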

12. Performance Considerations

Garbage collection affects performance in two main areas.

The first is the saving in memory footprint.

The second is the extra run time that the garbage collection mechanism needs to free the leaked memory.
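Both effects can be measured roughly by running the same cycle-producing loop with the collector disabled and then enabled. This is only a sketch: makeCycle and measure are hypothetical helper names, the iteration count is arbitrary, and the numbers will vary with PHP version and machine, so no particular timing should be expected.

```php
<?php
// Creates one garbage cycle of two objects per call.
function makeCycle(): void
{
    $x = new stdClass();
    $y = new stdClass();
    $x->peer = $y;
    $y->peer = $x;
}

// Runs the loop and reports elapsed time and memory still held at the end.
function measure(int $iterations): array
{
    gc_collect_cycles();                   // clear garbage left by earlier runs
    $memBefore = memory_get_usage();
    $start = microtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        makeCycle();
    }
    return array(
        'seconds' => microtime(true) - $start,
        'bytes'   => memory_get_usage() - $memBefore,
    );
}

gc_disable();
$withoutGc = measure(20000);               // cycles pile up, nothing is collected

gc_enable();
$withGc = measure(20000);                  // collector runs whenever the root buffer fills

printf("GC off: %.4f s, %d bytes held\n", $withoutGc['seconds'], $withoutGc['bytes']);
printf("GC on:  %.4f s, %d bytes held\n", $withGc['seconds'], $withGc['bytes']);
```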

13. Conclusion of Garbage Collection Mechanism

The garbage collection mechanism in PHP costs extra time only when the cycle collection algorithm actually runs. In normal (smaller) scripts there should be no performance impact at all.

When the cycle collector does run in ordinary scripts, however, the memory it saves lets more of those scripts run on your server at the same time, because less memory is used in total.

This benefit is especially noticeable in long-running scripts, such as long-running test suites or daemon scripts.

Origin blog.csdn.net/MrWangisgoodboy/article/details/130148349