Why pre-increment does not return an lvalue in PHP: a look inside the Zend engine

First, a note on terminology. An "lvalue" is commonly understood as an expression that can appear on the left side of an assignment operator, that is, something that can be assigned to. This is not very precise, and the definition of lvalue differs between languages. Here we are discussing the pre-increment (and pre-decrement) operators, so saying that pre-increment should return an lvalue means, more concretely, that it should return the variable itself, or a reference to it.

1. Analysis of the problem

I first ran into this problem in PHP when writing code like the following:

<?php
function func01(&$a) {
    echo $a . PHP_EOL;
    $a += 10;
}
$n = 0;
func01(++$n);
echo $n . PHP_EOL;

Based on my C++ experience, I expected the code above to print 1 and 11, but PHP unexpectedly prints 1 and 1. To find out why, I used the zendump_opcodes() function from the zendump extension to print the opcodes of the code above:

[root@c962bf018141 php-7.2.2]# sapi/cli/php -f ~/php/func_arg_pre_inc_by_ref.php 
1
1
op_array("") refcount(1) addr(0x7f7445c812a0) vars(1) T(5) filename(/root/php/func_arg_pre_inc_by_ref.php) line(1,12)
OPCODE                             OP1                                OP2                                RESULT                             EXTENDED                           
ZEND_NOP
ZEND_ASSIGN                        $n                                 0                                                                                                        
ZEND_INIT_FCALL                    128                                "func01"                                                              1                                  
ZEND_PRE_INC                       $n                                                                    #var1                                                                 
ZEND_SEND_VAR_NO_REF               #var1                              1
ZEND_DO_UCALL                                                                                                                                                                  
ZEND_CONCAT                        $n                                 "\n"                               #tmp3                                                                 
ZEND_ECHO                          #tmp3                                                                                                                                       
ZEND_INIT_FCALL                    80                                 "zendump_opcodes"                                                     0                                  
ZEND_DO_ICALL                                                                                                                                                                  
ZEND_RETURN                        1                                                                                                                                     

Judging from the opcodes, the problem lies in the ZEND_PRE_INC instruction: its result operand is #var1 rather than $n. Because the layout of opcodes and of variables on the virtual machine stack is fixed at compile time, this means the Zend engine already decides at compile time not to use $n itself as the result. The implementations of ZEND_PRE_INC and ZEND_PRE_DEC in zend_vm_def.h show that the #var1 produced at runtime is not a reference to $n either, but a copy of its value made with the ZVAL_COPY_VALUE and ZVAL_COPY macros.
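The difference between a value copy and a reference can be reproduced in userland PHP. The snippet below is only an analogy, not the engine's actual code: $copy plays the role of #var1 as the stock engine produces it, and $ref shows what a reference to $n would do instead. Both variable names are made up for illustration.

```php
<?php
$n = 0;
++$n;

// A value copy, analogous to what ZEND_PRE_INC stores in #var1.
$copy = $n;
$copy += 10;
echo $n . PHP_EOL;   // prints 1: $n did not see the change

// A reference, which is what a C++ programmer expects ++$n to yield.
$ref = &$n;
$ref += 10;
echo $n . PHP_EOL;   // prints 11: the change went through to $n
```

In the original func01(++$n) call, the function parameter ends up bound to something like $copy, which is why the += 10 inside the function never reaches $n.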

There is a bug report about this from 2012, https://bugs.php.net/bug.php?id=62778 , which also appears to have been filed by a developer with a C++ background. PHP was at version 5.4 then, and the report is still in the Open state. It seems the maintainers do not intend to change this behavior, and perhaps it is not a bug at all. Because PHP has no standardization committee and no formal language specification, it is hard to say whether it is a bug or not. For the same reason, it is often difficult to find authoritative information when something unexpected happens, and the only option is to study the implementation.

2. Trying a modification

Since I was not used to this behavior, I tried changing it myself. I modified the ZEND_PRE_INC and ZEND_PRE_DEC instructions in the PHP 7.2.2 source as shown below. The main idea is that if the operand (OP1) is not already a reference, it is converted into one (the ZVAL_MAKE_REF macro performs this check itself), and the instruction's result operand is then made to refer to the operand:

[root@c962bf018141 Zend]# diff zend_vm_def.h zend_vm_def.h.bak
1211c1211
<       zval *var_ptr, *varptr;
---
>       zval *var_ptr;
1213c1213
<       varptr = var_ptr = GET_OP1_ZVAL_PTR_PTR_UNDEF(BP_VAR_RW);
---
>       var_ptr = GET_OP1_ZVAL_PTR_PTR_UNDEF(BP_VAR_RW);
1218,1219c1218
<                       ZVAL_MAKE_REF(varptr);
<                       ZVAL_COPY(EX_VAR(opline->result.var), varptr);
---
>                       ZVAL_COPY_VALUE(EX_VAR(opline->result.var), var_ptr);
1241,1242c1240
<               ZVAL_MAKE_REF(varptr);
<               ZVAL_COPY(EX_VAR(opline->result.var), varptr);
---
>               ZVAL_COPY(EX_VAR(opline->result.var), var_ptr);
1253c1251
<       zval *var_ptr, *varptr;
---
>       zval *var_ptr;
1255c1253
<       varptr = var_ptr = GET_OP1_ZVAL_PTR_PTR_UNDEF(BP_VAR_RW);
---
>       var_ptr = GET_OP1_ZVAL_PTR_PTR_UNDEF(BP_VAR_RW);
1260,1261c1258
<                       ZVAL_MAKE_REF(varptr);
<                       ZVAL_COPY(EX_VAR(opline->result.var), varptr);
---
>                       ZVAL_COPY_VALUE(EX_VAR(opline->result.var), var_ptr);
1283,1284c1280
<               ZVAL_MAKE_REF(varptr);
<               ZVAL_COPY(EX_VAR(opline->result.var), varptr);
---
>               ZVAL_COPY(EX_VAR(opline->result.var), var_ptr);

This modification was mainly inspired by the way the Zend engine implements global and static variables. It is also very similar to overloading the ++ operator for a class in C++, where the operator finally returns a reference to the object itself. I also tried using an INDIRECT-type pointer, but that caused a core dump. It seems the INDIRECT type is only used in a few specific scenarios in the Zend engine and is not as widely supported as the reference type.
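The same idea can be imitated without patching the engine, because PHP lets a function return by reference and accepts such a return value for a by-reference parameter. The sketch below is an analogy under assumptions: pre_inc() is a hypothetical helper that is not part of PHP or of the patch, and func01() is repeated from the first example so that the snippet runs on its own.

```php
<?php
// Hypothetical helper: increments its argument and returns a reference to it,
// much like an overloaded operator++ returning *this in C++.
function &pre_inc(&$x) {
    ++$x;
    return $x;
}

function func01(&$a) {
    echo $a . PHP_EOL;
    $a += 10;
}

$n = 0;
func01(pre_inc($n));   // func01() receives a reference to $n, not a copy
echo $n . PHP_EOL;
```

On an unpatched PHP this should print 1 and then 11, which is the result the engine patch is trying to achieve for the built-in operator.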

After making the changes, I regenerated the VM code with zend_vm_gen.php, ran make successfully, and executed the code above again. It now prints 1 and 11 as expected:

[root@c962bf018141 php-7.2.2]# sapi/cli/php -f ~/php/func_arg_pre_inc_by_ref.php 
1
11
vars(1): {
  $n ->
  zval(0x7fd182c1d080) -> reference(1) addr(0x7fd182c5f078) zval(0x7fd182c5f080) : long(11)
}

Printing the local variables with the zendump_vars() function from the zendump extension shows that $n has indeed been converted to a reference.

3. Verifying the modification

The remaining worry is whether this modification introduces bugs, and in particular whether any PHP feature depends on pre-increment not returning an lvalue. I ran make test on both the modified and the unmodified PHP source trees and compared the results. Two tests do fail with the modification:


Check key execution order with new. [tests/lang/engine_assignExecutionOrder_007.phpt]
Execution ordering with comparison operators. [tests/lang/engine_assignExecutionOrder_009.phpt]

Further analysis of the failed test code shows that both tests use multiple pre-increment operators within the same statement, as follows:

<?php
$a[2][3] = 'stdClass';
$a[$i=0][++$i] = new $a[++$i][++$i];
print_r($a);

$o = new stdClass;
$o->a = new $a[$i=2][++$i];
$o->a->b = new $a[$i=2][++$i];
print_r($o);

Use the zendump_opcodes() function again to print out the OPCODES:

[root@c962bf018141 php-7.2.2]# sapi/cli/php -f php/testcase007.php 
op_array("") refcount(1) addr(0x7fba8347f2a0) vars(3) T(36) filename(/root/php/testcase007.php) line(1,13)
OPCODE                             OP1                                OP2                                RESULT                             EXTENDED                           
ZEND_INIT_FCALL                    80                                 "zendump_opcodes"                                                     0                                  
ZEND_DO_ICALL                                                                                                                                                                  
ZEND_FETCH_DIM_W                   $a                                 2                                  #var1                                                                 
ZEND_ASSIGN_DIM                    #var1                              3                                                                                                        
ZEND_OP_DATA                       "stdClass"                                                                                                                                  
ZEND_ASSIGN                        $i                                 0                                  #var3                                                                 
ZEND_PRE_INC                       $i                                                                    #var5                                                                 
ZEND_PRE_INC                       $i                                                                    #var7                                                                 
ZEND_PRE_INC                       $i                                                                    #var9                                                                 
ZEND_FETCH_DIM_R                   $a                                 #var7                              #var8                                                                 
ZEND_FETCH_DIM_R                   #var8                              #var9                              #var10                                                                
ZEND_FETCH_CLASS                                                      #var10                             #var11                             
ZEND_NEW                           #var11                                                                #var12                             0                                  
ZEND_DO_FCALL                                                                                                                                                                  
ZEND_FETCH_DIM_W                   $a                                 #var3                              #var4                                                                 
ZEND_ASSIGN_DIM                    #var4                              #var5                                                                                                    
ZEND_OP_DATA                       #var12
ZEND_INIT_FCALL                    96                                 "print_r"                                                             1                                  
ZEND_SEND_VAR                      $a                                 1                                                                                                        
ZEND_DO_ICALL                                                                                                                                                                  
ZEND_NEW                           "stdClass"                                                            #var15                             0                                  
ZEND_DO_FCALL                                                                                                                                                                  
ZEND_ASSIGN                        $o                                 #var15
ZEND_ASSIGN                        $i                                 2                                  #var19                                                                
ZEND_PRE_INC                       $i                                                                    #var21                                                                
ZEND_FETCH_DIM_R                   $a                                 #var19                             #var20                                                                
ZEND_FETCH_DIM_R                   #var20                             #var21                             #var22                                                                
ZEND_FETCH_CLASS                                                      #var22                             #var23                             
ZEND_NEW                           #var23                                                                #var24                             0                                  
ZEND_DO_FCALL                                                                                                                                                                  
ZEND_ASSIGN_OBJ                    $o                                 "a"                                                                                                      
ZEND_OP_DATA                       #var24
ZEND_ASSIGN                        $i                                 2                                  #var28                                                                
ZEND_PRE_INC                       $i                                                                    #var30                                                                
ZEND_FETCH_DIM_R                   $a                                 #var28                             #var29                                                                
ZEND_FETCH_DIM_R                   #var29                             #var30                             #var31                                                                
ZEND_FETCH_CLASS                                                      #var31                             #var32                             
ZEND_NEW                           #var32                                                                #var33                             0                                  
ZEND_DO_FCALL                                                                                                                                                                  
ZEND_FETCH_OBJ_W                   $o                                 "a"                                #var26                                                                
ZEND_ASSIGN_OBJ                    #var26                             "b"                                                                                                      
ZEND_OP_DATA                       #var33
ZEND_INIT_FCALL                    96                                 "print_r"                                                             1                                  
ZEND_SEND_VAR                      $o                                 1                                                                                                        
ZEND_DO_ICALL                                                                                                                                                                  
ZEND_RETURN                        1                                                                                                                                           

The ZEND_PRE_INC that immediately follows a ZEND_ASSIGN, and the three consecutive ZEND_PRE_INC instructions near the top of the listing, are enough to illustrate the problem. When compiling such a statement, the Zend engine first emits the code that evaluates the array subscripts inside the square brackets, from left to right, and only afterwards the code for the enclosing expressions. If the pre-increment operator returned a reference to the variable, then in sequences like the ones above (a pre-increment right after an assignment, or three pre-increments in a row) all the result operands would refer to the same variable, and they would all see the value after the last increment. The subsequent logic would naturally go wrong. I do not yet know why the Zend engine is implemented this way; my guess is that it makes the parser easier to implement.
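The effect can be imitated in plain PHP. In this sketch the array names are invented for illustration: $byValue mimics the stock engine, where each pre-increment result is a snapshot, and $byRef mimics the patched engine, where every result is a reference to $i.

```php
<?php
// Stock behaviour: each result is a copy taken right after its increment.
$i = 0;
$byValue = [];
$byValue[] = ++$i;
$byValue[] = ++$i;
$byValue[] = ++$i;
echo implode(',', $byValue) . PHP_EOL;   // 1,2,3

// Patched behaviour, imitated with explicit references: all three results
// alias $i, so they all show the value after the last increment.
$i = 0;
$byRef = [];
++$i; $byRef[] = &$i;
++$i; $byRef[] = &$i;
++$i; $byRef[] = &$i;
echo implode(',', $byRef) . PHP_EOL;     // 3,3,3
```

With the second form, subscripts that are computed first and consumed later all read the same final value, which matches the way the two tests break.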

Summary
For the pre-increment and pre-decrement operators to return variable references while the features above keep working, the Zend engine's compiler would have to be modified to emit instructions in a suitable order for such statements. But changing the compiler touches too much, and it is even harder to predict how many problems that would cause. So my exploration of this issue ends here for now.

Even if the pre-increment and pre-decrement operators could return variable references, the applicable scenarios would be very limited. For example, statements like the following do not compile in PHP at all. Unless the compiler is also modified, the convenience of returning a reference cannot really be exposed at the syntax level. One could also argue that there is no need to introduce so much complexity for a syntax that is not very common.

$b = &++$a;
++$a += 10;
++(++$b);
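Each of these can be written out in two steps that the current PHP compiler accepts, and the same goes for the func01(++$n) call from the beginning of this article. The following is just a sketch of equivalent code; the starting values are arbitrary.

```php
<?php
function func01(&$a) {
    echo $a . PHP_EOL;
    $a += 10;
}

$a = 0;
++$a;            // instead of: $b = &++$a;
$b = &$a;        // $b now aliases $a, both are 1

++$a;            // instead of: ++$a += 10;
$a += 10;        // $a (and $b) are now 12

++$b;            // instead of: ++(++$b);
++$b;            // $a and $b are now 14

$n = 0;
++$n;            // instead of: func01(++$n);
func01($n);      // prints 1, then adds 10 to $n
echo $n . PHP_EOL;   // prints 11
```

The two-step form costs one extra line per statement, which supports the view that the missing lvalue semantics are an inconvenience rather than a real limitation.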

