This part is fairly dry, so bring your own mineral water.
This article covers six topics:
- PHP operational mechanisms and principles
- PHP underlying variable data structure
- Copy-on-write (COW) in PHP assignment by value
- PHP garbage collection mechanism
- Analysis of the underlying PHP array implementation
- PHP array function classification
PHP operational mechanisms and principles
Scan -> parse -> compile -> execute -> output
Steps
- Scan
Lexical analysis of the code: the source is cut into fragments (tokens)
- Parse
Whitespace and comments are filtered out, and the remaining tokens are turned into meaningful expressions
- Compile
The expressions are compiled into intermediate code (opcodes)
- Execute
The intermediate code is executed one instruction at a time
- Output
The execution result is written to the output buffer
Tokenizing the code
$code = <<<EOF
<?php
echo 'hello world'l;
$data = 1+1;
echo $data;
EOF;
print_r(token_get_all($code));
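Result (the heredoc interpolates the undefined $data to an empty string, so no variable token appears, and the stray l after 'hello world' is reported as a token of its own)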
Array
(
[0] => Array
(
[0] => 376
[1] => <?php
[2] => 1
)
[1] => Array
(
[0] => 319
[1] => echo
[2] => 2
)
[2] => Array
(
[0] => 379
[1] =>
[2] => 2
)
[3] => Array
(
[0] => 318
[1] => 'hello world'
[2] => 2
)
[4] => Array
(
[0] => 310
[1] => l
[2] => 2
)
[5] => ;
[6] => Array
(
[0] => 379
[1] =>
[2] => 2
)
[7] => =
[8] => Array
(
[0] => 379
[1] =>
[2] => 3
)
[9] => Array
(
[0] => 308
[1] => 1
[2] => 3
)
[10] => +
[11] => Array
(
[0] => 308
[1] => 1
[2] => 3
)
[12] => ;
[13] => Array
(
[0] => 379
[1] =>
[2] => 3
)
[14] => Array
(
[0] => 319
[1] => echo
[2] => 4
)
[15] => Array
(
[0] => 379
[1] =>
[2] => 4
)
[16] => ;
)
Three pieces of information can be seen for each token above
- Token id; for example, whitespace and line breaks have id 379
- Token string
- Line number
The token id is the Zend engine's internal code for the token, defined in zend_language_parser.h
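Because the numeric ids differ between PHP versions, printing the token names can be easier to read. The following is a minimal sketch, assuming the tokenizer extension is loaded; the sample source and the $names variable are illustrative only, and the nowdoc (<<<'EOF') is used so that $data is not interpolated:

```php
<?php
// Nowdoc (quoted label) keeps "$data" literal instead of interpolating it.
$code = <<<'EOF'
<?php
echo 'hello world';
$data = 1 + 1;
echo $data;
EOF;

$names = [];
foreach (token_get_all($code) as $token) {
    if (is_array($token)) {
        // [0] = token id, [1] = token text, [2] = line number
        $names[] = token_name($token[0]);
        printf("line %d  %-28s %s\n", $token[2], token_name($token[0]), trim($token[1]));
    } else {
        // Single-character tokens such as ";" or "=" are plain strings.
        $names[] = $token;
        printf("        %-28s %s\n", '(char)', $token);
    }
}
```

token_name() turns an id such as 379 into a symbolic name such as T_WHITESPACE, which stays stable even when the numbers change.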
Improving the efficiency of PHP
- Compress the code, removing unneeded comments and whitespace (as jquery.min.js does)
- Prefer PHP built-in functions or extension functions
- Use a PHP opcode cache such as apc / xcache / opcache
- Cache the results of complex, time-consuming operations
- Do not handle synchronously what can be handled asynchronously, such as sending e-mail
- Why HHVM is fast
Its virtual machine (similar to Java's) converts PHP directly into binary bytecode to run, so the code does not have to be parsed again on every execution.
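The "cache the results of time-consuming operations" advice above can be sketched in a few lines. This is only an illustration under stated assumptions: memoize() is a hypothetical helper (not a PHP built-in), the cache lives only for the current request, and the arguments are assumed to be serializable:

```php
<?php
// Hypothetical helper: wraps a function and remembers results per argument list.
function memoize(callable $fn)
{
    $cache = [];
    return function (...$args) use (&$cache, $fn) {
        $key = serialize($args);          // assumes the arguments are serializable
        if (!array_key_exists($key, $cache)) {
            $cache[$key] = $fn(...$args); // computed once per distinct argument list
        }
        return $cache[$key];
    };
}

$calls = 0;
$slowSquare = function ($n) use (&$calls) {
    $calls++;                             // count real computations
    usleep(1000);                         // stand-in for expensive work
    return $n * $n;
};

$fastSquare = memoize($slowSquare);
$first  = $fastSquare(12);
$second = $fastSquare(12);                // served from the cache
```

For results that must survive across requests, the same idea would need an external store; that choice is outside the scope of this sketch.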
PHP underlying variable data structure
Variables are stored in a zval structure, defined in Zend/zend.h as follows
typedef union _zvalue_value
{
    /* The members below cover PHP's 8 major data types */
    long lval;                 // integer, boolean (also holds a resource id)
    double dval;               // float
    struct {                   // string
        char *val;
        int len;               // strlen() returns this value
    } str;                     // the NULL type needs no member: the union is simply unused
    HashTable *ht;             // arrays are implemented as hash tables
    zend_object_value obj;     // object
} zvalue_value;
struct _zval_struct
{
    zvalue_value value;        /* value of the variable */
    zend_uint refcount__gc;
    zend_uchar type;           /* type of the variable */
    zend_uchar is_ref__gc;
};
typedef struct _zval_struct zval;
The integer types used above are defined in Zend/zend_types.h as follows
typedef unsigned int zend_uint;
typedef unsigned char zend_uchar;
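PHP's eight major data types are all stored through the zvalue_value union
The union left unused describes null
long describes int and bool (a resource is also stored here, as its integer id)
double describes float
str describes string
HashTable describes indexed and associative arrays
zend_object_value describes objects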
The variable's type is described by the zend_uchar type field, using these constants
#define IS_NULL 0
#define IS_LONG 1
#define IS_DOUBLE 2
#define IS_BOOL 3
#define IS_ARRAY 4
#define IS_OBJECT 5
#define IS_STRING 6
#define IS_RESOURCE 7
#define IS_CONSTANT 8
#define IS_CONSTANT_ARRAY 9
For example, $a = 3 has the following structure (pseudocode)
struct {
    value.lval = 3;
    refcount__gc = 1;
    type = IS_LONG;
    is_ref__gc = 0;
}
$a acts like a pointer to the structure above
Copy-on-write (COW) in PHP assignment by value
The _zval_struct data structure also has the following two members
- zend_uint refcount__gc is the number of references to the value; each new reference adds 1
- zend_uchar is_ref__gc indicates whether this is an ordinary variable or a reference variable
The code below helps in understanding the reference mechanism
I am using PHP 5.4 here; xdebug must be installed to view a variable's reference information
Note that when testing with PHP 7.2 the reference count is always 0, because PHP 7 no longer reference-counts simple values such as integers
Compile xdebug.so
yum -y install php-devel
tar xf xdebug-2.8.0alpha1.tgz
cd xdebug-2.8.0alpha1
phpize
find /usr/ -name "php-config"
./configure --with-php-config=/usr/bin/php-config
make && make install
ls /usr/lib64/php/modules/
Configure xdebug
php --ini
echo 'zend_extension=/usr/lib64/php/modules/xdebug.so' >> /etc/php.ini
systemctl restart php72-php-fpm.service
php -m | grep xdebug
Write test code
$a = 3;
xdebug_debug_zval('a');
Output
a: (refcount=1, is_ref=0)=3
refcount=1: the reference count is 1
is_ref=0: 0 indicates an ordinary variable
=3: the value is 3
Assign the variable to another variable
$a = 3;
$b = $a;
xdebug_debug_zval('a');
xdebug_debug_zval('b');
Output
a: (refcount=2, is_ref=0)=3
b: (refcount=2, is_ref=0)=3
Assign a new value
$a = 3;
$b = $a;
$b = 5;
xdebug_debug_zval('a');
xdebug_debug_zval('b');
Output
a: (refcount=1, is_ref=0)=3
b: (refcount=1, is_ref=0)=5
Assign by reference
$a = 3;
$b = &$a;
xdebug_debug_zval('a');
xdebug_debug_zval('b');
Output
a: (refcount=2, is_ref=1)=3
b: (refcount=2, is_ref=1)=3
is_ref changes to 1: the variable turns from an ordinary variable into a reference variable
Assign a new value
$a = 3;
$b = &$a;
$c = $a;
$b = 5;
xdebug_debug_zval('a');
xdebug_debug_zval('b');
xdebug_debug_zval('c');
Output
a: (refcount=2, is_ref=1)=5
b: (refcount=2, is_ref=1)=5
c: (refcount=1, is_ref=0)=3
To sum up
Assignment between variables shares the value by reference internally, so no new memory is allocated, which saves resources
When one of the variables is then changed, the value is copied to hold the new value and the sharing ends; this is called copy on write (COW)
Reference variables do not trigger COW
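The summary above can be observed without xdebug by watching memory usage. This is a rough sketch, not a precise measurement: the array size is arbitrary, the variable names are illustrative, and the exact byte counts depend on the PHP version:

```php
<?php
$a = range(1, 100000);          // a large array
$before = memory_get_usage();

$b = $a;                        // assignment: both names share one value
$afterAssign = memory_get_usage();

$b[] = 0;                       // first write to $b: the array is copied now
$afterWrite = memory_get_usage();

$assignCost = $afterAssign - $before;
$writeCost  = $afterWrite - $afterAssign;
printf("assignment: %d bytes, first write: %d bytes\n", $assignCost, $writeCost);

// With a reference there is no copy: both names always see the same value.
$x = 3;
$y = &$x;
$y = 5;
printf("x = %d, y = %d\n", $x, $y);
```

The assignment itself should cost almost nothing, while the first write pays for the copy; $a keeps its original 100000 elements throughout.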
PHP garbage collection mechanism
What is garbage
As they ask in Shanghai: what kind of garbage are you?
If no variable references a zval, that zval is garbage
?: (refcount=0, is_ref=0)=5
Why clean up garbage?
Some people say PHP destroys all variables and closes all resource handles when the thread ends anyway, so why clean up?
- If PHP processes several large files (such as 1 GB movies) in a short time and moves on to the next one without reclaiming memory, it will run out of memory
- If the PHP script is a daemon or long-running, garbage that is never reclaimed slowly accumulates until memory runs out
How to clean up garbage
- Find the garbage
- Remove it
- Finding garbage
get_defined_vars shows all defined variables
Underneath, zend_globals.h defines two hash tables that store all variables
struct _zend_executor_globals {
    ...
    HashTable *active_symbol_table; // symbol table for local variables
    HashTable symbol_table;         // symbol table for global variables
    ...
};
After finding all defined variables, find the ones whose reference count is 0
struct _zval_struct {
    ...
    zend_uint refcount__gc;
    zend_uchar is_ref__gc;
    ...
};
- Removing garbage
As above, variables whose refcount__gc is 0 are cleared; this was the approach used by PHP 5.2 and earlier
Since PHP 5.3, the Concurrent Cycle Collection in Reference Counted Systems algorithm is used for removal
In fact, the new algorithm also collects based on refcount__gc, so why is a new algorithm needed?
We know that a zval whose refcount__gc is 0 must be garbage
But not all garbage has a refcount__gc of 0
A zval whose refcount__gc is not 0 can also be garbage, as the following experiment shows
An example
$a = ['a'];
$a[] = &$a; // references itself
xdebug_debug_zval('a');
Output
a: (refcount=2, is_ref=1)=array (
0 => (refcount=1, is_ref=0)='a',
1 => (refcount=2, is_ref=1)=...
)
The second element: ... denotes recursion; its reference count is 2 and it is a reference variable
[Figure from the official documentation]
Now delete $a
$a = ['a'];
$a[] = &$a;
unset($a);
xdebug_debug_zval('a');
Output
a: no such symbol
Since $a has been deleted, xdebug cannot print it; in theory the structure is now as follows
(refcount=1, is_ref=1)=array (
0 => (refcount=1, is_ref=0)='a',
1 => (refcount=1, is_ref=1)=...
)
At this point no symbol references this zval, but because it references itself its refcount is 1, so it is a peculiar piece of garbage
In this case the memory stays occupied until the PHP script ends, when it is cleared automatically
So the garbage collection approach of PHP 5.2 and earlier cannot handle this case
The cycle collection algorithm (Concurrent Cycle Collection in Reference Counted Systems)
Continuing with the code above as the example
Description of the new algorithm:
Treat $a as a suspected garbage variable and simulate deleting it (refcount--), then simulate restoring it; the condition for restoring is that some other variable still references the value (refcount++)
A value that cannot be restored in this way is garbage and can be deleted.
The peculiar garbage from the example above:
(refcount=1, is_ref=1)=array (
0 => (refcount=1, is_ref=0)='a',
1 => (refcount=1, is_ref=1)=...
)
After the simulated deletion it becomes:
(refcount=0, is_ref=1)=array (
0 => (refcount=0, is_ref=0)='a',
1 => (refcount=0, is_ref=1)=...
)
Then simulate restoring:
Because no symbol such as $a points to this zval any more, it cannot be restored
When garbage is cleared
Possible garbage found by the algorithm above is stored in a buffer area (RCPs), and it is only cleared once that area is full. Note that this requires garbage collection to be enabled
Garbage collection can be enabled in two ways
In php.ini (enabled by default):
zend.enable_gc = On
At runtime, gc_enable() and gc_disable() turn garbage collection on and off
gc_collect_cycles() can be called to force a collection cycle
Finally, after all this, only the principles need to be understood. The whole process does not require the PHP developer to take part: with gc_enable() in effect, or with a call to gc_collect_cycles(), garbage is reclaimed automatically
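The self-referencing array from earlier can be used to watch the collector work. A minimal sketch, assuming PHP 5.3 or later; the $collected and $enabled variable names are illustrative, and the exact count returned may vary by version:

```php
<?php
gc_enable();                       // make sure the collector is on

$a = ['a'];
$a[] = &$a;                        // the array now contains a reference to itself
unset($a);                         // no symbol points to it, but refcount is still 1

$collected = gc_collect_cycles();  // force a collection run now
$enabled = gc_enabled();

echo "cycles collected: $collected\n";
```

gc_collect_cycles() returns the number of collected cycles, so a value above zero shows that the orphaned array was found and freed without waiting for the buffer to fill.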
Analysis of the underlying PHP array implementation
First, a review of array properties
Characteristics of PHP array keys
$arr = [
1 => 'a',
'1' => 'b',
1.5 => 'c',
true => 'd',
];
print_r($arr);
Array
(
[1] => d
)
A key can be either an integer or a string
A value can be of any type
Keys have the following characteristics
- A numeric string is converted to an integer: '1' => 1
- Floats and booleans are converted to integers: 1.3 => 1
- null is treated as an empty string: null => ''
- Objects and arrays cannot be used as keys
- For identical keys, the later value overwrites the earlier one
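The key rules above can be checked directly. A small sketch with illustrative values; float and null keys are left out here only because newer PHP versions may emit deprecation notices for them:

```php
<?php
$arr = [
    '1'  => 'a',   // numeric string: stored under integer key 1
    '01' => 'b',   // not a canonical decimal integer: stays the string '01'
    true => 'c',   // boolean: cast to integer 1, overwriting 'a'
];

var_dump(array_keys($arr));   // shows the actual key types
var_dump($arr[1]);
```

Only two elements remain, because '1' and true both end up as the integer key 1.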
Accessing array elements
- $arr[key]
- $arr{key} (deprecated in PHP 7.4 and removed in PHP 8.0)
Since PHP 5.4, the array returned by a function can be indexed directly
function getArr(){ return [1,2,3,4]; }
echo getArr()[2];
Deleting array elements
$a = [1,2,3,4];
foreach ($a as $k => $v) {
unset($a[$k]);
}
$a[] = 5;
print_r($a);
Array
(
[4] => 5
)
- Deleting elements does not reset the index
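If consecutive keys are wanted after deleting, array_values() can rebuild the list. A short sketch with illustrative variable names:

```php
<?php
$a = [1, 2, 3, 4];
unset($a[0], $a[1]);            // keys 2 and 3 remain
$a[] = 5;                       // appended at key 4, not at key 0

$keysBefore = array_keys($a);   // [2, 3, 4]
$a = array_values($a);          // copy the values into a fresh 0-based list
$keysAfter = array_keys($a);    // [0, 1, 2]

print_r($a);
```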
Array traversal
- for
- foreach
- array_walk
- array_map
- current and next
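The traversal styles listed above, side by side. A minimal sketch; the sample array and variable names are illustrative, and reset() and key() are used alongside current() and next() to drive the internal pointer:

```php
<?php
$arr = ['x' => 1, 'y' => 2, 'z' => 3];

// foreach: the usual way, works with any keys
$sum = 0;
foreach ($arr as $key => $value) {
    $sum += $value;
}

// for: needs consecutive integer positions, so go through array_keys()
$keys = array_keys($arr);
$viaFor = [];
for ($i = 0, $n = count($keys); $i < $n; $i++) {
    $viaFor[] = $arr[$keys[$i]];
}

// array_walk: callback per element; a by-reference parameter changes it in place
$doubled = $arr;
array_walk($doubled, function (&$value, $key) {
    $value *= 2;
});

// array_map: returns a new array and leaves the input unchanged
$squared = array_map(function ($value) {
    return $value * $value;
}, $arr);

// current / next: move the array's internal pointer by hand
$viaPointer = [];
reset($arr);
while (key($arr) !== null) {
    $viaPointer[] = current($arr);
    next($arr);
}
```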
Internal implementation of arrays
The implementation uses two structures: HashTable and bucket
- What is a HashTable
A hash table is a data structure that accesses a memory storage location directly by key.
The key is run through a hash function to obtain its position in the table, so lookup, insertion, modification, and deletion all complete in O(1)
The following code is from Zend/zend_types.h (the PHP 7 structure)
typedef struct _zend_array HashTable;
struct _zend_array {
    zend_refcounted_h gc;
    union {
        struct {
            ZEND_ENDIAN_LOHI_4(
                zend_uchar flags,
                zend_uchar nApplyCount,
                zend_uchar nIteratorsCount,
                zend_uchar consistency)
        } v;
        uint32_t flags;
    } u;
    uint32_t nTableMask;
    Bucket *arData;
    uint32_t nNumUsed;
    uint32_t nNumOfElements;
    uint32_t nTableSize;
    uint32_t nInternalPointer;
    zend_long nNextFreeElement;
    dtor_func_t pDestructor;
};
The older structure (PHP 5)
typedef struct _hashtable {
    uint nTableSize;
    uint nTableMask;
    uint nNumOfElements;
    ulong nNextFreeElement;
    Bucket *pInternalPointer;
    Bucket *pListHead;
    Bucket *pListTail;
    Bucket **arBuckets;
    unsigned char nApplyCount;
} HashTable;
Member | Explanation |
---|---|
nTableSize | Size of the bucket array; minimum 8, grows by doubling |
nTableMask | nTableSize - 1, used to speed up index calculation |
nNumOfElements | Number of elements; count() returns this value directly |
nNextFreeElement | The next free numeric index, used when appending with $arr[] = ... |
pInternalPointer | The current traversal pointer (one reason foreach is faster than for); used by functions such as reset and current |
pListHead | Pointer to the first element of the array |
pListTail | Pointer to the last element of the array |
arBuckets | The actual storage container |
arData | Bucket data |
nApplyCount | Counts recursive visits, to prevent infinite recursion |
typedef struct bucket
{
    ulong h;
    uint nKeyLength;
    void *pData;
    void *pDataPtr;
    struct bucket *pListNext;
    struct bucket *pListLast;
    struct bucket *pNext;
    struct bucket *pLast;
    const char *arKey;
} Bucket;
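Member | Explanation |
---|---|
h | Hash of char *key, or the numeric index specified by the user |
nKeyLength | Length of the hash key; 0 for a numeric index |
pData | Points to the value, generally a copy of the user data; for pointer data it points to the pointer |
pDataPtr | For pointer data, this pointer points to the real value and pData points to this field |
pListNext | Next element in the whole hash table |
pListLast | Previous element in the whole hash table |
pNext | Next element with the same hash |
pLast | Previous element with the same hash |
arKey | The string of the current key |
foreach starts from the HashTable's pListHead and follows pListNext
pNext and pLast are the pointers between different buckets that share the same hash; they are used for hash collisions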
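The roles of the mask, the collision chain, and the insertion-order list described above can be imitated in userland. This is a toy model only, not the Zend implementation: ToyHashTable and its methods are hypothetical names, crc32() is a stand-in hash function, and the table never grows:

```php
<?php
// Toy model of a chained hash table; names are illustrative, not Zend's code.
class ToyHashTable
{
    private $tableSize = 8;   // plays the role of nTableSize
    private $slots = [];      // slot index => chain of [key, value] pairs (the pNext chain)
    private $order = [];      // keys in insertion order (the pListNext walk)

    // Map a key to a slot: hash it, then mask with (size - 1).
    public function slotOf($key)
    {
        $h = is_int($key) ? $key : crc32($key);   // crc32 is a stand-in hash function
        return $h & ($this->tableSize - 1);       // plays the role of nTableMask
    }

    public function set($key, $value)
    {
        $slot = $this->slotOf($key);
        if (!isset($this->slots[$slot])) {
            $this->slots[$slot] = [];
        }
        foreach ($this->slots[$slot] as $i => $pair) {
            if ($pair[0] === $key) {              // existing key: overwrite in place
                $this->slots[$slot][$i][1] = $value;
                return;
            }
        }
        $this->slots[$slot][] = [$key, $value];   // new key: append to the chain
        $this->order[] = $key;
    }

    public function get($key)
    {
        $slot = $this->slotOf($key);
        foreach ($this->slots[$slot] ?? [] as $pair) {
            if ($pair[0] === $key) {
                return $pair[1];
            }
        }
        return null;
    }

    public function keys()
    {
        return $this->order;
    }
}

$t = new ToyHashTable();
$t->set(1, 'a');
$t->set(9, 'b');      // 9 & 7 === 1: collides with key 1, chained in the same slot
$t->set('x', 'c');
$t->set(1, 'z');      // overwrites the value, keeps the original position
```

Keys 1 and 9 land in the same slot yet stay retrievable through the chain, and keys() still reports insertion order, which is the behaviour the two sets of bucket pointers provide.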
PHP array function classification
Try out the functions below. There is no need to memorize them; just get an impression, so that when you need one you will recall it instead of implementing it yourself
Traversal
- prev
- next
- current
- end
- reset
- each (deprecated in PHP 7.2 and removed in PHP 8.0)
Sorting
- sort
- rsort
- asort
- ksort
- krsort
- uasort
- uksort
Searching
- in_array
- array_search
- array_key_exists
Splitting and joining
- array_slice
- array_splice
- implode
- explode
- array_combine
- array_chunk
- array_keys
- array_values
- array_column
Set operations
- array_merge
- array_diff
- array_diff_*
- array_intersect
- array_intersect_*
Queue/stack
- array_push
- array_pop
- array_shift
Other
- array_fill
- array_flip
- array_sum
- array_reverse
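A few of the functions above in action, as a quick way to get that impression. The sample data and variable names are made up for illustration:

```php
<?php
$users = [
    ['id' => 3, 'name' => 'Ann'],
    ['id' => 1, 'name' => 'Bob'],
    ['id' => 2, 'name' => 'Cid'],
];

$names    = array_column($users, 'name');          // one column as a list
$byId     = array_column($users, 'name', 'id');    // same column, keyed by id
ksort($byId);                                      // sort by key, in place
$firstTwo = array_slice($names, 0, 2);             // first two elements
$pairs    = array_combine(['a', 'b'], [1, 2]);     // keys from one list, values from another
$chunks   = array_chunk([1, 2, 3, 4, 5], 2);       // groups of two
$common   = array_intersect([1, 2, 3], [2, 3, 4]); // values in both; keys of the first are kept
$only     = array_diff([1, 2, 3], [2, 3, 4]);      // values only in the first
$found    = array_search('Bob', $names);           // key of the first match
$total    = array_sum([1, 2, 3]);

print_r($byId);
```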
When reproducing this article, please credit the source: https://www.cnblogs.com/demonxian3/p/11327522.html