Some underlying PHP knowledge

This part is relatively dry, so please bring your own mineral water

The article explains six topics

  1. PHP operational mechanisms and principles
  2. PHP underlying variable data structure
  3. Copy-on-write (COW) in PHP value assignment
  4. PHP garbage collection mechanism
  5. Underlying analysis of PHP arrays
  6. PHP array function classification

PHP operational mechanisms and principles

Scan -> Parse -> Compile -> Execute -> Output

Steps

  • Scan

Lexical analysis is performed on the code, cutting its contents into fragments (tokens)

  • Parse

Whitespace, comments, and similar tokens are filtered out, and the remaining tokens are turned into meaningful expressions

  • Compile

The expressions are compiled into intermediate code (opcodes)

  • Execute

The intermediate code is executed one instruction at a time

  • Output

The execution result is written to the output buffer

Tokenizing code

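// Note: <<<EOF is a heredoc, so the undefined $data is interpolated as an
// empty string before tokenizing, and the stray "l" after 'hello world' is
// tokenized as a token of its own; the result below reflects both.
$code = <<<EOF
<?php
echo 'hello world'l;
$data = 1+1;
echo $data;
EOF;

print_r(token_get_all($code));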

Result

Array
(
    [0] => Array
        (
            [0] => 376
            [1] => <?php

            [2] => 1
        )

    [1] => Array
        (
            [0] => 319
            [1] => echo
            [2] => 2
        )

    [2] => Array
        (
            [0] => 379
            [1] =>
            [2] => 2
        )

    [3] => Array
        (
            [0] => 318
            [1] => 'hello world'
            [2] => 2
        )

    [4] => Array
        (
            [0] => 310
            [1] => l
            [2] => 2
        )

    [5] => ;
    [6] => Array
        (
            [0] => 379
            [1] =>

            [2] => 2
        )

    [7] => =
    [8] => Array
        (
            [0] => 379
            [1] =>
            [2] => 3
        )

    [9] => Array
        (
            [0] => 308
            [1] => 1
            [2] => 3
        )

    [10] => +
    [11] => Array
        (
            [0] => 308
            [1] => 1
            [2] => 3
        )

    [12] => ;
    [13] => Array
        (
            [0] => 379
            [1] =>

            [2] => 3
        )

    [14] => Array
        (
            [0] => 319
            [1] => echo
            [2] => 4
        )

    [15] => Array
        (
            [0] => 379
            [1] =>
            [2] => 4
        )

    [16] => ;
)

Three pieces of information can be observed in each token array above

  1. The token ID (whitespace and line breaks, for example, are 379)
  2. The token string
  3. The line number

The token ID is the Zend engine's internal code for the token, defined in zend_language_parser.h
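Numeric token IDs differ between PHP versions, so it is usually easier to read token names. The following is a minimal sketch, assuming the tokenizer extension is enabled; it uses a nowdoc (<<<'EOF') so that $data reaches the tokenizer literally, and token_name() to translate each ID.

```php
<?php
// Nowdoc (<<<'EOF') keeps $data literal; a heredoc would interpolate it.
$code = <<<'EOF'
<?php
echo 'hello world';
$data = 1 + 1;
echo $data;
EOF;

$names = [];
foreach (token_get_all($code) as $token) {
    if (is_array($token)) {
        // Array tokens are [id, text, line number].
        list($id, $text, $line) = $token;
        $names[] = token_name($id);
        printf("line %d  %-28s %s\n", $line, token_name($id), json_encode($text));
    } else {
        // Single-character tokens such as ";" or "=" are plain strings.
        $names[] = $token;
        printf("        %s\n", $token);
    }
}
```

The $names array is only collected here to make the token sequence easy to inspect; it is not part of the tokenizer API.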

Improving PHP efficiency

  1. Minify the code, removing unneeded comments and whitespace (as jquery.min.js does)
  2. Prefer PHP built-in functions or extension functions
  3. Use a PHP opcode cache such as APC, XCache, or OPcache
  4. Cache the results of complex, time-consuming operations
  5. Do not process synchronously what can be processed asynchronously, such as sending e-mail
  • Why HHVM is fast

Its virtual machine (similar to Java's) converts PHP directly into binary bytecode and runs that, so the code does not have to be parsed again on every execution
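Item 4 of the list above (caching the result of a time-consuming operation) can be sketched as follows. Both function names are hypothetical, and slowSquare() merely stands in for a slow query or computation; a real application would more likely use a shared cache than a static array.

```php
<?php
// Hypothetical expensive function: stands in for a slow query or computation.
function slowSquare(int $n): int
{
    usleep(1000); // simulate 1 ms of work
    return $n * $n;
}

// Cache results in a static array keyed by the argument.
function cachedSquare(int $n): int
{
    static $cache = [];
    if (!array_key_exists($n, $cache)) {
        $cache[$n] = slowSquare($n);
    }
    return $cache[$n];
}

echo cachedSquare(12), "\n"; // computed
echo cachedSquare(12), "\n"; // served from the cache
```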

PHP underlying variable data structure

Variables are stored in the zval structure, which is defined in Zend/zend.h as follows

typedef union _zvalue_value 
{
    /* The members below describe PHP's 8 data types */
    long lval;               // integer, boolean
    double dval;             // float
    struct {                 // string
        char *val;
        int len;             // strlen() returns this value
    } str;                   // NULL is represented by the union itself being empty
    HashTable *ht;           // arrays are implemented as hash tables
    zend_object_value obj;   // object
} zvalue_value;

struct  _zval_struct
{
    zvalue_value value;     /* value of the variable */
    zend_uint refcount__gc;
    zend_uchar type;        /* type of the variable */
    zend_uchar is_ref__gc;
};

typedef struct _zval_struct zval;

The member types are defined in Zend/zend_types.h as follows

typedef unsigned int zend_uint;
typedef unsigned char zend_uchar;

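PHP's eight data types are all stored through the single zvalue_value union

union itself empty    describes null
long                  describes int, bool, and resource (the resource handle ID)
double                describes float
str                   describes string
HashTable             describes indexed and associative arrays
zend_object_value     describes objects

The type of a PHP variable is recorded in the zend_uchar type member, using these constants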

#define IS_NULL         0
#define IS_LONG         1
#define IS_DOUBLE       2
#define IS_BOOL         3
#define IS_ARRAY        4
#define IS_OBJECT       5
#define IS_STRING       6
#define IS_RESOURCE     7
#define IS_CONSTANT     8
#define IS_CONSTANT_ARRAY 9

For example, $a = 3 has the following structure (pseudocode)

struct {
    value.lval = 3;
    refcount__gc = 1;
    type = IS_LONG;
    is_ref__gc = 0;
}

$a acts like a pointer to the structure above
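From userland, the effect of storing the type tag next to the value can be observed with gettype(): the same variable reports a different type each time a new value is assigned. This is only an illustrative sketch; it shows the visible behavior, not the C structure itself.

```php
<?php
$a = 3;
$seen = [];
foreach ([3, 3.0, '3', true, null, [3]] as $value) {
    $a = $value;              // same variable, new value and new type tag
    $seen[] = gettype($a);
    echo gettype($a), "\n";
}
```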

Copy-on-write (COW) in PHP value assignment

The _zval_struct data structure also has the following two members

  • zend_uint refcount__gc: how many times the value is referenced; each new reference adds 1
  • zend_uchar is_ref__gc: whether this is an ordinary variable or a reference variable

The following code demonstrates how the reference mechanism works

I am using PHP 5.4 here; viewing a variable's reference information requires installing xdebug

Note: when testing with PHP 7.2, the reference count of integers like these is always shown as 0, because PHP 7 stores scalar values directly in the zval and does not reference-count them

Install xdebug (download the source package first)

Compile xdebug.so

yum -y install php-devel
tar xf xdebug-2.8.0alpha1.tgz
cd xdebug-2.8.0alpha1
phpize
find /usr/ -name "php-config"
./configure  --with-php-config=/usr/bin/php-config
make && make install
ls /usr/lib64/php/modules/

Configure xdebug

php --ini
echo 'zend_extension=/usr/lib64/php/modules/xdebug.so' >> /etc/php.ini
systemctl restart php72-php-fpm.service
php -m | grep xdebug

Write test code

$a = 3;
xdebug_debug_zval('a');

Output

a: (refcount=1, is_ref=0)=3

  • refcount: the reference count is 1
  • is_ref: 0 indicates an ordinary variable
  • =3: the value is 3
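If installing xdebug is not an option, PHP's built-in debug_zval_dump() gives a rough view of the same information. The sketch below is an assumption-laden illustration: the exact count printed depends on the PHP version (the function argument itself can add a reference), and literal strings may be shown as interned instead of counted, which is why the string is built at run time.

```php
<?php
// str_repeat() builds the string at run time, so it is usually reference
// counted (literal strings may be interned and shown without a count).
$a = str_repeat('x', 3);
$b = $a; // second variable sharing the same value

ob_start();
debug_zval_dump($a);
$dump = ob_get_clean();
echo $dump;
```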

Assign to another variable

$a = 3;
$b = $a;

xdebug_debug_zval('a');
xdebug_debug_zval('b');

Output

a: (refcount=2, is_ref=0)=3
b: (refcount=2, is_ref=0)=3


Assign a new value

$a = 3;
$b = $a;
$b = 5;

xdebug_debug_zval('a');
xdebug_debug_zval('b');

Output

a: (refcount=1, is_ref=0)=3
b: (refcount=1, is_ref=0)=5


Assign by reference

$a = 3;
$b = &$a;
xdebug_debug_zval('a');
xdebug_debug_zval('b');

Output

a: (refcount=2, is_ref=1)=3
b: (refcount=2, is_ref=1)=3

is_ref changes from 0 to 1: the ordinary variable has become a reference variable


Assign a new value

$a = 3;
$b = &$a;
$c = $a;

$b = 5;
xdebug_debug_zval('a');
xdebug_debug_zval('b');
xdebug_debug_zval('c');

Output

a: (refcount=2, is_ref=1)=5
b: (refcount=2, is_ref=1)=5
c: (refcount=1, is_ref=0)=3


Summary

  • Assignment between variables passes the value by reference internally, so no new memory has to be allocated, which saves resources

  • When the value of one of the variables is changed, the value is copied to hold the new value and the shared reference is broken; this is called copy on write (COW)

  • Reference variables do not trigger COW
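The summary above can be observed without xdebug by watching memory usage. The following is a rough sketch; cowDemo() is a hypothetical helper, and the exact byte counts depend on the PHP version, so only the orders of magnitude matter.

```php
<?php
function cowDemo(): array
{
    $a = range(1, 100000);           // one large array
    $start = memory_get_usage();

    $b = $a;                         // assignment: the value is shared
    $afterAssign = memory_get_usage() - $start;

    $b[] = 0;                        // first write to $b: the array is copied
    $afterWrite = memory_get_usage() - $start;

    return [$afterAssign, $afterWrite, count($b)];
}

list($afterAssign, $afterWrite, $count) = cowDemo();
printf("after assignment: %d bytes\nafter write: %d bytes\n", $afterAssign, $afterWrite);
```

The assignment should cost next to nothing, while the first write should cost roughly the size of the whole array.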

PHP garbage collection mechanism

What is garbage?

As people in Shanghai ask: what kind of garbage are you?

If no variable references a zval, that zval is garbage

?: (refcount=0, is_ref=0)=5

Why clean up garbage?

Some people say that PHP destroys all variables and closes all resource handles when the thread ends, so everything is automatic; why bother cleaning up?

  • If PHP handles several large files in a short time (such as 1 GB movies) and moves on to the next one without reclaiming memory, memory will overflow
  • If the PHP script is a daemon or runs for a long time and garbage is never reclaimed, it slowly accumulates until memory overflows

How to clean up garbage

  1. Find the garbage
  2. Remove it
  • Finding garbage

get_defined_vars() shows all defined variables

Internally, zend_globals.h defines two hash tables that store all variables

struct _zend_executor_globals {
    ...
    HashTable *active_symbol_table; // symbol table for local variables
    HashTable symbol_table;         // symbol table for global variables
    ...
}

After finding all defined variables, look for the ones whose reference count is 0

struct _zval_struct{
    ...
    zend_uint refcount__gc;
    zend_uchar is_ref__gc;
    ...
}

  • Removing garbage

As described above, zvals whose refcount__gc is 0 are cleared; this was the approach in PHP 5.2 and earlier

From PHP 5.3 on, the "Concurrent Cycle Collection in Reference Counted Systems" algorithm is used to remove garbage

In fact, the new algorithm still reclaims memory based on refcount__gc, so why is a new algorithm needed?

We know that a zval whose refcount__gc is 0 must be garbage

But not all garbage has a refcount__gc of 0

Garbage whose refcount__gc is not 0 also exists; the following experiment produces such garbage


An example

$a = ['a'];
$a[] = &$a; // references itself
xdebug_debug_zval('a');

Output

a: (refcount=2, is_ref=1)=array (
0 => (refcount=1, is_ref=0)='a',
1 => (refcount=2, is_ref=1)=...
)

The second element: ... denotes recursion; its reference count is 2 and it is a reference variable pointing back at the array

[Image: diagram from the official documentation]


Now delete $a

$a = ['a'];
$a[] = &$a;

unset($a);
xdebug_debug_zval('a'); 

Output
a: no such symbol

Because $a has been deleted, xdebug cannot print it; in theory the structure now looks like this

(refcount=1, is_ref=1)=array (
0 => (refcount=1, is_ref=0)='a',
1 => (refcount=1, is_ref=1)=...
)

At this point no symbol references this zval, yet its refcount is 1 because it references itself, so it is an oddball piece of garbage

In this case the memory is only reclaimed automatically when the PHP script ends; until then it keeps occupying space

So the garbage collection approach of version 5.2 and earlier cannot cover this case


Concurrent Cycle Collection in Reference Counted Systems

Continue with the code above as the example

Description of the new algorithm:

Treat $a as a suspected garbage variable and simulate deleting it (refcount--), then simulate restoring it; the condition for restoring is that some other variable still references the value, in which case the count is restored (refcount++)

Anything that cannot be restored this way is garbage and can be deleted

The oddball garbage from the example above:

(refcount=1, is_ref=1)=array (
0 => (refcount=1, is_ref=0)='a',
1 => (refcount=1, is_ref=1)=...
)

After the simulated deletion it becomes:

(refcount=0, is_ref=1)=array (
0 => (refcount=0, is_ref=0)='a',
1 => (refcount=0, is_ref=1)=...
)

Then simulate restoring:

Because no symbol such as $a points to this zval any more, it cannot be restored

When garbage is cleared

Possible garbage found by the algorithm above is stored in an area (the root buffer), and the garbage is only cleared immediately once that area is full. Note that this requires garbage collection to be enabled

There are two ways to enable garbage collection

  1. In php.ini, set zend.enable_gc = On (enabled by default)

  2. Call gc_enable() and gc_disable() to turn garbage collection on or off

gc_collect_cycles() can be called to force a collection cycle

Finally, after all this, you only need to understand the principles; the whole process does not require the PHP developer's participation: call gc_enable() or gc_collect_cycles() and garbage is reclaimed automatically
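The self-referencing array from the example can be used to watch the collector at work. This is a minimal sketch, assuming PHP 7.1 or later for the void return type; makeCycle() is a hypothetical helper, and the number returned by gc_collect_cycles() varies between versions, so the sketch only checks that it is greater than zero.

```php
<?php
// Builds the self-referencing array from the example, then lets it go out of scope.
function makeCycle(): void
{
    $a = ['a'];
    $a[] = &$a;
}

gc_enable();
gc_collect_cycles();            // start from a clean root buffer

for ($i = 0; $i < 1000; $i++) {
    makeCycle();                // each call leaves one unreachable cycle behind
}

$collected = gc_collect_cycles(); // force a collection run
echo "collected: ", ($collected > 0 ? "yes" : "no"), "\n";
```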

Underlying analysis of PHP arrays

First, a review of array properties

Key behavior of PHP arrays

$arr = [
    1 => 'a',
    '1' => 'b',
    1.5 => 'c',
    true => 'd',
];

print_r($arr);

Array
(
[1] => d
)

A key can be either an integer or a string

A value can be of any type

Keys have the following behavior

  • A numeric string is converted to an integer: '1' => 1
  • Floats and booleans are converted to integers: 1.3 => 1
  • null is treated as an empty string: null => ''
  • Objects and arrays cannot be used as keys
  • When the same key appears more than once, the later value overwrites the earlier one
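The key rules above can be checked with a short sketch. The keys are assigned at run time so that error_reporting() can hide the deprecation notices that newer PHP versions emit for some of these conversions; the $pairs list is just test data chosen for illustration.

```php
<?php
// Newer PHP versions report deprecations for some of these conversions
// (for example a float key with a fractional part); hide them for the demo.
error_reporting(E_ALL & ~E_DEPRECATED);

$arr = [];
$pairs = [['1', 'a'], [1.7, 'b'], [true, 'c'], [null, 'd'], ['01', 'e']];
foreach ($pairs as $pair) {
    list($key, $value) = $pair;
    $arr[$key] = $value;   // the key is normalized on assignment
}

var_dump(array_keys($arr));
```

Note that '01' stays a string key: only strings that look exactly like a decimal integer are converted.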

Access array elements

  1. $arr[key]
  2. $arr{key} (curly-brace syntax; deprecated in PHP 7.4 and removed in PHP 8.0)

Since version 5.4, the array returned by a function can be indexed directly

function getArr(){ return [1,2,3,4]; }
echo getArr()[2];

Deleting array elements

$a = [1,2,3,4];
foreach ($a as $k => $v) {
    unset($a[$k]);
}

$a[] = 5;

print_r($a);

Array
(
[4] => 5
)

  • Deleting elements does not reset the index
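When consecutive keys are needed again after deleting, array_values() returns a copy with the keys renumbered from 0. A small sketch:

```php
<?php
$a = [1, 2, 3, 4];
unset($a[0], $a[1]);     // keys 2 and 3 remain
$a[] = 5;                // appended at key 4, not key 0

$reindexed = array_values($a); // copy with keys renumbered from 0

print_r($a);
print_r($reindexed);
```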

Array traversal

  1. for
  2. foreach
  3. array_walk
  4. array_map
  5. current and next
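The five traversal styles listed above can be compared side by side. In this sketch each one simply copies the values into a new array; the variable names are illustrative.

```php
<?php
$arr = [10, 20, 30];

// 1. for: only suitable for consecutive integer keys starting at 0
$viaFor = [];
for ($i = 0, $n = count($arr); $i < $n; $i++) {
    $viaFor[] = $arr[$i];
}

// 2. foreach: works with any keys
$viaForeach = [];
foreach ($arr as $value) {
    $viaForeach[] = $value;
}

// 3. array_walk: calls the callback with each value and key
$viaWalk = [];
array_walk($arr, function ($value, $key) use (&$viaWalk) {
    $viaWalk[] = $value;
});

// 4. array_map: builds a new array from the callback's return values
$viaMap = array_map(function ($value) {
    return $value;
}, $arr);

// 5. current and next: move the array's internal pointer by hand
$viaPointer = [];
reset($arr);
while (key($arr) !== null) {
    $viaPointer[] = current($arr);
    next($arr);
}
```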

Internal implementation of arrays

The implementation uses two structures, HashTable and Bucket

  • What is a HashTable

A hash table is a data structure that accesses a storage location in memory directly by key

The key is passed through a hash function to obtain its position in the table, so that lookup, insertion, modification, and deletion all complete in O(1) on average
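In everyday code this is the reason a key lookup is preferable to a value search on large arrays. The sketch below, using made-up sample data, contrasts in_array(), which compares values one by one, with isset() on a flipped array, which is a single hash lookup.

```php
<?php
$names = ['alice', 'bob', 'carol'];

// in_array() compares the needle with the values one by one.
$foundByScan = in_array('carol', $names, true);

// array_flip() turns the values into keys, so isset() becomes a hash lookup.
$index = array_flip($names);          // ['alice' => 0, 'bob' => 1, 'carol' => 2]
$foundByHash = isset($index['carol']);
```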

The following code is from Zend/zend_types.h (the newer structure)

typedef struct _zend_array HashTable;

struct _zend_array {
    zend_refcounted_h gc;
    union {
        struct {
            ZEND_ENDIAN_LOHI_4(
                zend_uchar    flags,
                zend_uchar    nApplyCount,
                zend_uchar    nIteratorsCount,
                zend_uchar    consistency)
        } v;
        uint32_t flags;
    } u;
    uint32_t          nTableMask; 
    Bucket           *arData;
    uint32_t          nNumUsed;
    uint32_t          nNumOfElements;
    uint32_t          nTableSize;           
    uint32_t          nInternalPointer;
    zend_long         nNextFreeElement;
    dtor_func_t       pDestructor;
};

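The older structure

typedef struct _hashtable {
    uint nTableSize;
    uint nTableMask;
    uint nNumOfElements;
    ulong nNextFreeElement;
    Bucket *pInternalPointer;
    Bucket *pListHead;
    Bucket *pListTail;
    Bucket **arBuckets;
    unsigned char nApplyCount;
} HashTable;

Member              Explanation
nTableSize          Size of the bucket table; minimum 8, grows by doubling
nTableMask          Mask used to compute the slot index; equals nTableSize - 1
nNumOfElements      Number of elements; count() returns this value directly
nNextFreeElement    Next free integer index, used when appending with $arr[] = ...
pInternalPointer    Pointer to the current element, used by reset(), current(), and similar functions; given as the reason foreach is faster than for
pListHead           Pointer to the first element of the array
pListTail           Pointer to the last element of the array
arBuckets           The actual storage container
arData              Bucket data (in the newer structure)
nApplyCount         Number of times the table is being visited recursively, to prevent infinite recursion

typedef struct bucket
{
    ulong h;
    uint nKeyLength;
    void *pData;
    void *pDataPtr;
    struct bucket *pListNext;
    struct bucket *pListLast;
    struct bucket *pNext;
    struct bucket *pLast;
    const char *arKey;
} Bucket;

Member        Description
h             Hash of the char *key, or the numeric index specified by the user
nKeyLength    Length of the hash key; 0 for a numeric index
pData         Points to the value, usually a copy of the user data; for pointer data it points to pDataPtr
pDataPtr      For pointer data, points to the real value; pData above points here
pListNext     Next element in the whole hash table
pListLast     Previous element in the whole hash table
pNext         Next element with the same hash
pLast         Previous element with the same hash
arKey         The string of the current key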

foreach traversal starts at the HashTable's pListHead and follows pListNext

pNext and pLast link the different buckets that share the same slot when hashes collide
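The roles of the mask, the collision chain, and the insertion-order list can be modeled in a few lines of PHP. This is a toy sketch only: the class, its members, and the use of crc32() as the hash function are assumptions made for illustration and do not mirror Zend's actual code (there is no resizing, for instance).

```php
<?php
// Toy model only: the class and its members are illustrative, not Zend's code.
final class ToyHashTable
{
    private $tableSize = 8;   // like nTableSize: a power of two
    private $slots = [];      // slot index => chain of [key, value] pairs
    private $order = [];      // keys in insertion order, like the pListNext chain

    private function slot($key)
    {
        // like nTableMask: (tableSize - 1) maps any hash into the table
        return crc32($key) & ($this->tableSize - 1);
    }

    public function set($key, $value)
    {
        $i = $this->slot($key);
        if (!isset($this->slots[$i])) {
            $this->slots[$i] = [];
        }
        foreach ($this->slots[$i] as $n => $pair) {
            if ($pair[0] === $key) {          // existing key: overwrite
                $this->slots[$i][$n][1] = $value;
                return;
            }
        }
        $this->slots[$i][] = [$key, $value];  // new key or collision: extend the chain
        $this->order[] = $key;
    }

    public function get($key)
    {
        $i = $this->slot($key);
        $chain = isset($this->slots[$i]) ? $this->slots[$i] : [];
        foreach ($chain as $pair) {
            if ($pair[0] === $key) {
                return $pair[1];
            }
        }
        return null;
    }

    public function keys()
    {
        return $this->order;
    }
}

$table = new ToyHashTable();
foreach (range('a', 'p') as $letter) {       // 16 keys in 8 slots forces collisions
    $table->set($letter, strtoupper($letter));
}
$table->set('a', 'first');                   // overwrite keeps the original position
echo $table->get('k'), "\n";
```

Keeping a separate insertion-order list is what lets traversal return elements in the order they were added, regardless of which slot each key hashes to.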

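PHP array function classification

It is worth trying out the functions below. There is no need to memorize them; just form an impression, so that when you need one you will recall it instead of implementing it yourself

Traversal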

  • prev
  • next
  • current
  • end
  • reset
  • each (deprecated in PHP 7.2, removed in PHP 8.0)

Sorting

  • sort
  • rsort
  • asort
  • ksort
  • krsort
  • uasort
  • uksort

Searching

  • in_array
  • array_search
  • array_key_exists

Splitting and joining

  • array_slice
  • array_splice
  • implode
  • explode
  • array_combine
  • array_chunk
  • array_keys
  • array_values
  • array_column

Sets

  • array_merge
  • array_diff
  • array_diff_*
  • array_intersect
  • array_intersect_*

Queue/stack

  • array_push
  • array_pop
  • array_shift

Other

  • array_fill
  • array_flip
  • array_sum
  • array_reverse
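A few of the functions listed above, tried on small made-up inputs, give a feel for what each category does. The sample data is illustrative only.

```php
<?php
$rows = [
    ['id' => 1, 'name' => 'x'],
    ['id' => 2, 'name' => 'y'],
];

$names    = array_column($rows, 'name', 'id');        // [1 => 'x', 2 => 'y']
$combined = array_combine(['a', 'b'], [1, 2]);        // ['a' => 1, 'b' => 2]
$chunks   = array_chunk([1, 2, 3, 4, 5], 2);          // [[1, 2], [3, 4], [5]]
$slice    = array_slice([1, 2, 3, 4], 1, 2);          // [2, 3]
$diff     = array_diff([1, 2, 3, 4], [2, 4]);         // [0 => 1, 2 => 3], keys preserved
$common   = array_intersect([1, 2, 3, 4], [2, 4, 6]); // [1 => 2, 3 => 4], keys preserved
$joined   = implode(',', [1, 2, 3]);                  // "1,2,3"
```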

If you reproduce this article, please credit the source: https://www.cnblogs.com/demonxian3/p/11327522.html
