An analysis of the principles and performance of PHP functions

Foreword

In any language, functions are the most basic building blocks. What are the characteristics of PHP functions? How is a function call implemented? How well do PHP functions perform, and what recommendations follow from that? This article starts from the implementation principles and uses actual performance tests to try to answer these questions, so that you understand the implementation while writing better PHP programs. It also introduces some common PHP functions.

Classification of PHP functions

In PHP, functions fall into two broad categories: user functions and internal (built-in) functions. The former are the functions and methods that users define in their programs; the latter are the library functions that PHP itself provides (such as sprintf, array_push, etc.). Users can also write library functions of their own as extensions, as described later. User functions can be further divided into functions and methods (class methods). This article tests and analyzes all three kinds.
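
The classification above can be observed from PHP code itself through reflection. The following is a minimal sketch, not part of the original tests; the function gr_demo_add and the class GrDemoCalculator are hypothetical names introduced only for illustration.

```php
<?php
// Hypothetical user function, defined only for this illustration.
function gr_demo_add(int $a, int $b): int
{
    return $a + $b;
}

// Hypothetical class with one method, also illustrative.
class GrDemoCalculator
{
    public function add(int $a, int $b): int
    {
        return $a + $b;
    }
}

// Reflection reports whether a function is internal (built-in) or user-defined.
$builtin = new ReflectionFunction('sprintf');
$user    = new ReflectionFunction('gr_demo_add');
$method  = new ReflectionMethod('GrDemoCalculator', 'add');

var_dump($builtin->isInternal());    // built-in function
var_dump($user->isUserDefined());    // user function
var_dump($method->isUserDefined());  // user-defined class method
```

Reflection is used here only to make the categories visible; it says nothing about how each kind is executed.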

Implementation of PHP functions

How is a PHP function ultimately executed, and what does that process look like?

To answer this question, let us first look at the flow through which PHP code is executed.

As the chart shows, PHP follows the typical execution flow of a dynamic language: it takes a piece of code, runs it through lexical analysis and parsing, translates the source into instructions (opcodes), and then the Zend virtual machine executes those instructions in sequence to complete the operation. PHP itself is implemented in C, so everything ultimately ends up calling C functions; in effect, PHP can be regarded as a piece of software written in C.
From this description it is easy to see that a PHP function is also translated into opcodes in order to be called; each function call actually executes one or more instructions.

Zend describes every function with the following data structures:

typedef union _zend_function {
    zend_uchar type;    /* MUST be the first element of this struct! */
    struct {
        zend_uchar type;  /* never used */
        char *function_name;
        zend_class_entry *scope;
        zend_uint fn_flags;
        union _zend_function *prototype;
        zend_uint num_args;
        zend_uint required_num_args;
        zend_arg_info *arg_info;
        zend_bool pass_rest_by_reference;
        unsigned char return_reference;
    } common;

    zend_op_array op_array;
    zend_internal_function internal_function;
} zend_function;


typedef struct _zend_function_state {
    HashTable *function_symbol_table;
    zend_function *function;
    void *reserved[ZEND_MAX_RESERVED_RESOURCES];
} zend_function_state;
Here, type identifies the kind of function: user function, built-in function, or overloaded function. common holds the basic information about the function, including the function name, the parameter information, the function flags (ordinary function, static method, abstract method), and so on. In addition, a user function has a function symbol table that records its local variables and the like, as described later. Zend maintains a global function_table, which is one large hash table. When a function is called, Zend first looks up the corresponding zend_function in this table by function name. The virtual machine then decides how to perform the call according to the type; different types of functions are implemented on different principles.
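
Name-based lookup is visible from user code as well. The sketch below only shows observable behavior that is consistent with the description above; it does not inspect function_table itself. The function gr_demo_greet is a hypothetical name used for illustration.

```php
<?php
// Hypothetical user function used only for this illustration.
function gr_demo_greet(string $name): string
{
    return 'Hello, ' . $name;
}

// A function can be located purely by a name held in a string.
$callee  = 'gr_demo_greet';
$viaName = $callee('Zend');

// Function names are case-insensitive, so these resolve to the same entries.
$viaUpper     = GR_DEMO_GREET('Zend');
$builtinKnown = function_exists('SPRINTF');

// A name that was never defined has no entry to find.
$missingKnown = function_exists('gr_demo_no_such_function');

var_dump($viaName, $viaUpper, $builtinKnown, $missingKnown);
```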

Built-in functions

A built-in function is in essence a real C function. After PHP is compiled, each built-in function is expanded into a function named zif_xxxx; for example, the familiar sprintf corresponds to zif_sprintf underneath. During execution, if Zend finds that the callee is a built-in function, it simply forwards the call.

Zend provides a series of APIs for these functions, covering parameter retrieval, array operations, memory allocation, and so on. A built-in function obtains its parameters through zend_parse_parameters. For arrays, strings, and similar parameters Zend makes only a shallow copy, so this is very efficient. It is fair to say that a PHP built-in function is almost as efficient as the corresponding C function, the only extra cost being one forwarded call.

Built-in functions in PHP are loaded dynamically as shared objects (.so files). Users can also write their own .so as needed, which is what we usually call an extension. Zend provides a range of APIs for writing extensions.

User Functions

Compared with a built-in function, a user function is executed and implemented in a completely different way. As noted earlier, PHP code is translated into opcodes for execution, and user functions are no exception: each function corresponds to a set of opcodes, and this instruction set is stored in the zend_function. Calling a user function therefore ultimately means executing a set of opcodes.

  • Preservation of local variables and implementation of recursion 
    We know that recursion is accomplished through a stack. PHP uses a similar approach. Zend assigns each PHP function an active symbol table (active_sym_table) that records the state of all local variables of the current function. All symbol tables are maintained as a stack: whenever a function is called, a new symbol table is allocated and pushed; when the call ends, the current symbol table is popped. This is how state is preserved and recursion is made possible.

Zend optimizes the maintenance of this stack. It preallocates a static array of length N to simulate the stack. Simulating a dynamic data structure with a static array is a technique we often use in our own programs, and it avoids the memory allocation and destruction that each call would otherwise incur. When the current function call ends, Zend only needs to clear the data in the symbol table at the top of the stack.
Because the static array has length N, once the call depth exceeds N the program does not overflow the stack; instead Zend allocates and destroys a symbol table for each such call, which causes a considerable drop in performance. Inside Zend, N is currently 32. So when writing PHP programs it is best to keep the function call depth within 32. Of course, for a web application the call depth is largely determined by the application itself.

  • Parameter passing 
    Unlike built-in functions, which call zend_parse_parameters to obtain their parameters, a user function receives its parameters through instructions: a function with several parameters has a corresponding number of instructions. In terms of implementation, each one is an ordinary variable assignment.
    As the above analysis shows, compared with a built-in function, a user function has to maintain its own symbol table stack, and each of its instructions is itself executed by a C function, so user functions perform considerably worse; a detailed comparison follows later. Therefore, if a built-in function already provides a given feature, avoid rewriting it as your own PHP function.
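
The effect of one symbol table per call can be seen in a small recursive function. This is an illustrative sketch only; gr_demo_depth is a hypothetical name, and the code shows the observable behavior (each call keeps its own local variable), not Zend's internal stack.

```php
<?php
// Hypothetical recursive function; every call has its own $local.
function gr_demo_depth(int $n, array &$trace): int
{
    $local = $n * 10;               // local to this call only
    if ($n > 0) {
        gr_demo_depth($n - 1, $trace);
    }
    // After the nested calls return, $local still holds this call's value.
    $trace[] = $local;
    return $local;
}

$trace  = [];
$result = gr_demo_depth(3, $trace);

// The trace is filled as the calls unwind, innermost call first.
var_dump($trace, $result);
```

The $trace array is passed by reference purely so that the nested calls can record their values in one place.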

Class Methods

Class methods are implemented on the same principle as user functions: they are translated into opcodes that are executed in sequence. Classes are implemented in Zend with the zend_class_entry data structure, which holds the basic information associated with the class. This entry is already built when PHP compiles the code.

In the common member of zend_function there is a field named scope, which points to the zend_class_entry of the class that the method belongs to. The object-oriented implementation in PHP is not covered in more detail here; a future article will be devoted to it. As far as functions are concerned, methods work on exactly the same principle as functions, so in theory their performance should be similar. A detailed performance comparison follows.

Performance Comparison

Effect of function name length on performance

  • Test method 
    Compare functions whose names are 1, 2, 4, 8, and 16 characters long by measuring how many times each can be executed per second, to determine how name length affects performance.
  • Test results are shown below

 

  • Analysis of results 
    As the figure shows, the length of a function name does affect performance. For an empty function call, a name of length 1 and a name of length 16 differ in performance by about a factor of two. The reason is easy to find in the source code: as described earlier, on a function call Zend first looks up the function's information by name in the global function_table, which is a hash table. Inevitably, the longer the name, the longer the lookup takes. So in practice an overly long name is not recommended for a function that is called many times.

The length of the function name has some effect on performance, but how large is it really? That depends on the actual situation: if the function itself is fairly complex, the effect on overall performance is not significant.
One suggestion is to give reasonably concise names to functions that are called many times and are themselves relatively simple.

Effect of the number of functions on performance

  • Test method 
    Test function calls in the following three environments: 1. the program contains only 1 function; 2. the program contains 100 functions; 3. the program contains 1000 functions.
    Measure the number of function calls per second in each of the three cases.
  • Test results are shown below

 

  • Analysis of results 
    As the test results show, performance is almost the same in all three cases; the degradation as the number of functions grows is minimal and can be ignored.
    From the implementation, the only difference among the three cases is the step that looks up the function. As described earlier, all functions are stored in one hash table, and lookup efficiency stays close to O(1) regardless of the number of entries, so the performance difference is small.

Cost of different types of function calls

  • Test method 
    Select one user function, one class method, one static method, and one built-in function. Each function does nothing and returns immediately, so the test mainly measures the cost of an empty function call. The result is the number of executions per second. 
    To remove other influences, all function names have the same length.
  • Test results are shown below

 

  • Analysis of results 
    As the test results show, user-written PHP functions are about equally efficient regardless of type, at roughly 2.8 million calls per second. As expected, the built-in function is much more efficient even for an empty call, reaching about 7.8 million calls per second, nearly three times the former. Clearly, the call overhead of a built-in function is much lower than that of a user function. From the earlier analysis, the main gap lies in initializing the symbol table and receiving the parameters when a user function is called.
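
A test of this kind can be sketched as follows. This is not the original test harness, which is not shown in the article; all names (nop, GrBench, nos, gr_calls_per_second) are hypothetical, abs is chosen arbitrarily as a cheap built-in, and the absolute numbers depend heavily on the PHP version and on whether an opcode cache is active.

```php
<?php
// Hypothetical empty callees; all names are three characters long.
function nop() {}

class GrBench
{
    public function nop() {}
    public static function nos() {}
}

// Convert a timed interval into calls per second.
function gr_calls_per_second(float $start, float $end, int $iterations): float
{
    $elapsed = max($end - $start, 1e-9); // guard against a zero interval
    return $iterations / $elapsed;
}

$iterations = 200000;
$object     = new GrBench();
$results    = [];

$start = microtime(true);
for ($i = 0; $i < $iterations; $i++) { nop(); }
$results['user function'] = gr_calls_per_second($start, microtime(true), $iterations);

$start = microtime(true);
for ($i = 0; $i < $iterations; $i++) { $object->nop(); }
$results['class method'] = gr_calls_per_second($start, microtime(true), $iterations);

$start = microtime(true);
for ($i = 0; $i < $iterations; $i++) { GrBench::nos(); }
$results['static method'] = gr_calls_per_second($start, microtime(true), $iterations);

$start = microtime(true);
for ($i = 0; $i < $iterations; $i++) { abs(0); }
$results['built-in'] = gr_calls_per_second($start, microtime(true), $iterations);

foreach ($results as $label => $perSecond) {
    printf("%-14s %12.0f calls/s\n", $label, $perSecond);
}
```

The loop overhead is included in every measurement, so only the relative differences between the four rows are meaningful.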

Performance comparison of built-in functions and user functions

  • Test method 
    To compare the performance of built-in functions and user functions, we pick a few common built-in functions and implement the same functionality in PHP.
    We selected typical string, math, and array operations: substring extraction (substr), decimal-to-binary conversion (decbin), minimum (min), and returning the keys of an array (array_keys).
  • Test results are shown below 

 

 

  • Analysis of results 
    As the test results show, and as expected, built-in functions perform far better overall than ordinary user functions. For string operations in particular, the gap reaches an order of magnitude. So as a rule, if a built-in function provides the functionality you need, use it instead of writing your own PHP function.
    For functions that involve heavy string processing, consider implementing them as extensions to improve performance. Rich-text filtering is a common example.
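
As an illustration of one such pair, the following sketch compares decbin with a user-land equivalent. The function gr_user_decbin is a hypothetical reimplementation written for this example (non-negative integers only), not the code used in the original test, and the timings it prints will vary by machine and PHP version.

```php
<?php
// Hypothetical user-land reimplementation of decbin for non-negative integers.
function gr_user_decbin(int $number): string
{
    if ($number === 0) {
        return '0';
    }
    $bits = '';
    while ($number > 0) {
        $bits   = ($number % 2) . $bits; // prepend the lowest bit
        $number = intdiv($number, 2);
    }
    return $bits;
}

// Both versions must agree before timing them is meaningful.
foreach ([0, 1, 2, 10, 255, 1024] as $value) {
    if (gr_user_decbin($value) !== decbin($value)) {
        throw new RuntimeException("Mismatch for $value");
    }
}

$iterations = 50000;

$start = microtime(true);
for ($i = 0; $i < $iterations; $i++) { decbin($i); }
$builtinSeconds = microtime(true) - $start;

$start = microtime(true);
for ($i = 0; $i < $iterations; $i++) { gr_user_decbin($i); }
$userSeconds = microtime(true) - $start;

printf("built-in: %.4fs, user: %.4fs\n", $builtinSeconds, $userSeconds);
```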

Performance comparison with C functions

  • Test method 
    We selected three kinds of functions covering string operations and arithmetic, and implemented them with a PHP extension. The three are a simple arithmetic operation, a string comparison, and multiple arithmetic operations.
    Besides testing the functions themselves, we also measured performance with the empty-call overhead removed: on the one hand to compare the two kinds of function (C and PHP built-in) themselves, and on the other to confirm the cost of an empty function call. 
    The measurement is the time taken to perform 100,000 operations.
  • Test results are shown below

 

  • Analysis of results 
    Once the overhead of the empty PHP function call is removed, the gap between built-in functions and C functions is small; as the function becomes more complex, the two approach the same performance. This follows readily from the earlier analysis of the implementation: after all, built-in functions are implemented in C.
    The more complex the function, the smaller the performance gap between C and PHP. 
    Relative to C, PHP's function call overhead is much larger, which still has some effect on the performance of simple functions. Functions in PHP should therefore not be wrapped and nested too deeply.

Pseudo-functions and their performance

PHP has a number of functions that are used exactly like standard functions but whose underlying implementation is completely different from a real function call. They belong to none of the three kinds of function mentioned above; in essence each is a single opcode. This article calls them pseudo-functions or instruction functions.

As mentioned, pseudo-functions are used no differently from standard functions and appear to have the same characteristics. But when executed, Zend turns each into a corresponding instruction (opcode), so their implementation is closer to that of if, for, arithmetic operations, and the like.

  • Pseudo-functions in PHP 
    isset 
    empty 
    unset 
    eval
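
One way to see from user code that these are not real functions is to ask whether they are registered as functions. The sketch below relies on the documented behavior that function_exists returns false for language constructs; the variable names are illustrative.

```php
<?php
// Language constructs are not registered as functions...
$constructs = ['isset', 'empty', 'unset', 'eval'];
$registered = [];
foreach ($constructs as $name) {
    $registered[$name] = function_exists($name);
}

// ...whereas a real built-in function is.
$registered['array_key_exists'] = function_exists('array_key_exists');

var_dump($registered);
```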

As this description shows, pseudo-functions are translated directly into instructions and avoid the overhead that an ordinary function call brings, so they perform better. The following test compares them: both array_key_exists and isset can determine whether a key exists in an array, so let us look at their performance.

 

As the figure shows, isset performs much better than array_key_exists, at roughly four times the speed; it is even about twice as fast as an empty function call. This confirms once more, from another angle, that PHP function call overhead is relatively large.

Introduction to common PHP functions and their implementation

count

count is a function we use often; it returns the number of elements in an array.

What is the complexity of count? 
A common claim is that count loops through the array to count the elements, so its complexity is O(n). Is that really the case?
Going back to the source code of count, we find that for an array the call path is zif_count -> php_count_recursive -> zend_hash_num_elements, and all zend_hash_num_elements does is return ht->nNumOfElements. This is an O(1) operation, not an O(n) one. In fact, a PHP array is a hash table underneath, and Zend's hash table has a dedicated member, nNumOfElements, that records the current number of elements, so an ordinary count essentially returns this value directly. We conclude that count has O(1) complexity, independent of the size of the array.

How does count behave for a variable that is not an array?
For an unset variable it returns 0, while for types such as int, double, and string it returns 1. (This is the older behavior; since PHP 7.2 counting a non-countable value raises a warning, and since PHP 8.0 it throws a TypeError.)
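
A short usage sketch of count on arrays follows. It shows the default mode and the optional COUNT_RECURSIVE mode; the variable names are illustrative, and the example makes no claim about timing.

```php
<?php
$nested = [1, 2, [3, 4]];

// Default mode: only the outer elements are counted.
$topLevel = count($nested);

// Recursive mode: nested arrays are counted and also descended into.
$everything = count($nested, COUNT_RECURSIVE);

$emptyCount = count([]);

var_dump($topLevel, $everything, $emptyCount);
```

In recursive mode the inner array counts as one element and its two members count as two more, which is why the total differs from the default mode.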

strlen

strlen returns the length of a string. How is it implemented?
We all know that strlen in C is an O(n) function: it traverses the string in order until it meets \0 and then reports the length. Does PHP do the same? The answer is no. A PHP string is described by a composite structure that holds both a pointer to the actual data and the string length (similar to string in C++), so strlen returns the stored length directly, which is a constant-time operation.
Note also that when strlen is called on a variable that is not a string, the variable is first cast to a string and the length is then taken.
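
Both points can be checked with a small sketch. The explicit (string) cast below mirrors the conversion that PHP performs implicitly in its default, non-strict mode; the variable names are illustrative.

```php
<?php
// In a double-quoted string, \0 is a real NUL byte.
$withNul    = "ab\0cd";
$byteLength = strlen($withNul);       // the NUL does not end the string

// A number is converted to a string before its length is taken.
$digits = strlen((string) 12345);

var_dump($byteLength, $digits);
```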

isset and array_key_exists

The most common use of these two is to determine whether a key exists in an array. The former can also be used to determine whether a variable has been set.
As mentioned earlier, isset is not a real function, so it is much more efficient than the latter, and it is recommended in place of array_key_exists. Note that the two are not fully interchangeable: isset returns false for a key whose value is null, whereas array_key_exists returns true.
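
Before swapping one for the other, it helps to see where their results differ. A minimal sketch, with a hypothetical $config array:

```php
<?php
// Hypothetical configuration array; 'proxy' is present but null.
$config = ['host' => 'localhost', 'proxy' => null];

$results = [
    'host_isset'   => isset($config['host']),
    'host_exists'  => array_key_exists('host', $config),
    'proxy_isset'  => isset($config['proxy']),            // value is null
    'proxy_exists' => array_key_exists('proxy', $config), // key is present
    'port_isset'   => isset($config['port']),
    'port_exists'  => array_key_exists('port', $config),
];

var_dump($results);
```

The two agree for 'host' and for the missing 'port'; they disagree only for 'proxy', whose value is null.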

array_push and array[]

Both append elements to the end of an array. The difference is that the former can push several elements at once. Their biggest difference is that one is a function and the other is a language construct, so the latter is more efficient. For simply appending an element, array[] is therefore recommended.
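
The two forms side by side, as a minimal sketch with illustrative variable names:

```php
<?php
// Language construct: one element per statement.
$viaBrackets   = [];
$viaBrackets[] = 'a';
$viaBrackets[] = 'b';

// Function call: several elements at once; returns the new element count.
$viaPush   = [];
$newLength = array_push($viaPush, 'a', 'b');

var_dump($viaBrackets === $viaPush, $newLength);
```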

rand and mt_rand

Both functions generate random numbers. The former uses the standard libc rand. The latter uses the Mersenne Twister generator, which on average produces random values four times faster than the rand() provided by libc. So where performance requirements are high, consider using mt_rand in place of the former. (Since PHP 7.1, rand() is an alias of mt_rand(), so the distinction applies to older versions.)
As we all know, rand generates pseudo-random numbers, and in C you must call srand explicitly to set the seed. In PHP, however, rand calls srand for you by default, so under normal circumstances there is no need to call it explicitly.
Note that if you do need to seed the generator in a special case, the calls must be matched: srand pairs with rand, and mt_srand pairs with mt_rand. They must not be mixed, otherwise the seed has no effect.
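
A matched seed and generator pair can be sketched as follows. The seed value 42 is arbitrary, and the example deliberately shows no concrete random values, since those depend on the PHP version.

```php
<?php
// Seed with mt_srand, then draw with mt_rand: a matched pair.
mt_srand(42);
$first = [mt_rand(), mt_rand(), mt_rand()];

// Re-seeding with the same value reproduces the same sequence.
mt_srand(42);
$second = [mt_rand(), mt_rand(), mt_rand()];

// A bounded draw, for example a die roll.
$bounded = mt_rand(1, 6);

var_dump($first === $second, $bounded);
```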

sort and usort

Both are used for sorting; the difference is that the latter lets you specify the sort strategy, similar to qsort in C and sort in C++.
Both are implemented with a standard quicksort. For ordinary sorting needs, unless the circumstances are special, simply call these functions that PHP provides; there is no need to implement a sort yourself, which would be far less efficient. For the reason, see the earlier comparison of user functions and built-in functions.
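
A brief usage sketch of both follows. The $people records are hypothetical data for illustration, and the comparison callback uses the <=> operator available since PHP 7.

```php
<?php
// sort: default ascending order.
$numbers = [5, 3, 9, 1];
sort($numbers);

// usort: the caller supplies the comparison strategy.
// Hypothetical records for illustration.
$people = [
    ['name' => 'Chen', 'age' => 31],
    ['name' => 'Wang', 'age' => 25],
    ['name' => 'Li',   'age' => 42],
];
usort($people, function (array $left, array $right): int {
    return $left['age'] <=> $right['age']; // ascending by age
});
$namesByAge = array_column($people, 'name');

var_dump($numbers, $namesByAge);
```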

urlencode and rawurlencode

Both are used to URL-encode a string. All non-alphanumeric characters except -_. are replaced with a percent sign (%) followed by two hexadecimal digits. The only difference is the space: urlencode encodes it as +, while rawurlencode encodes it as %20.
Apart from search engines, our strategy under normal circumstances is to encode spaces as %20, so the latter is used in most cases.
Note that the encode and decode functions of each series must be used as matching pairs.
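
The difference, and the need to pair each encoder with its own decoder, can be seen in a few lines. The input string is an arbitrary example.

```php
<?php
$raw = 'php functions & performance';

$form = urlencode($raw);     // spaces become +
$path = rawurlencode($raw);  // spaces become %20

// Matching pairs restore the original string.
$roundTrip = urldecode($form) === $raw && rawurldecode($path) === $raw;

// A mismatched pair does not: rawurldecode leaves + untouched.
$mixed = rawurldecode($form);

var_dump($form, $path, $roundTrip, $mixed);
```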

The strcmp family of functions

This family includes strcmp, strncmp, strcasecmp, and strncasecmp, which provide the same functionality as their C counterparts. There are differences, though: because PHP strings are allowed to contain \0, the underlying comparison uses the memcmp family rather than strcmp, which is in theory faster.
In addition, because PHP can obtain the string length directly, the lengths can be checked first, which in many cases makes the comparison much more efficient.
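
The handling of embedded \0 bytes is easy to observe. In the sketch below the two strings differ only after the NUL byte, so a comparison that stopped at \0 would report them as equal; the variable names are illustrative.

```php
<?php
// The strings are identical up to and including the NUL byte.
$left  = "abc\0x";
$right = "abc\0y";

$order           = strcmp($left, $right);            // compares past the NUL
$prefixEqual     = strncmp($left, $right, 4) === 0;  // first 4 bytes match
$caseInsensitive = strcasecmp('PHP', 'php') === 0;

var_dump($order < 0, $prefixEqual, $caseInsensitive);
```

Only the sign of the strcmp result is checked, because the exact integer returned differs between PHP versions.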

is_int and is_numeric

These two functions look similar but are not identical; pay attention to the differences when using them.
is_int: determines whether a variable's type is integer. A PHP variable has a dedicated field identifying its type, so the type can be checked directly; this is strictly an O(1) operation. 
is_numeric: determines whether a variable is a number or a numeric string. That is, besides returning true for integer variables, it also returns true for strings of the form "1234", "1e4", and so on. In that case it has to traverse the string to decide.
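
A few sample values make the difference concrete. The sample list is arbitrary; each row of $report holds the is_int result followed by the is_numeric result.

```php
<?php
$samples = [1234, '1234', '1e4', 12.5, '12abc', 'abc'];

$report = [];
foreach ($samples as $value) {
    // [type check only, numeric value or numeric string]
    $report[] = [is_int($value), is_numeric($value)];
}

var_dump($report);
```

Only the real integer passes is_int, while is_numeric also accepts the float and the two numeric strings but rejects '12abc' and 'abc'.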

Summary and recommendations

Through the analysis of the implementation principles and the performance tests, we draw the following conclusions:

1. PHP function call overhead is relatively large.

2. Function information is stored in one large hash table, which is searched by function name on every call, so the length of the function name also has some effect on performance.

3. Returning a reference from a function makes little practical difference.

4. Built-in functions perform much better than PHP user functions, particularly for string operations.

5. Class methods, ordinary functions, and static methods are almost equally efficient.

6. Once the overhead of the empty function call is removed, a built-in function performs almost the same as a C function with the same functionality.

7. All parameters are passed as reference-counted shallow copies, which is cheap.

8. The number of functions has a negligible effect on performance.

Based on these conclusions, here are some suggestions for using PHP functions:

1. If a built-in function can do the job, use it rather than writing your own PHP function.

2. If a feature has high performance requirements, consider implementing it as an extension.

3. PHP function call overhead is large, so do not over-wrap. If a piece of functionality is called very many times and can be implemented in one or two lines of code, wrapping it in a function is not recommended.

4. Do not be over-obsessed with design patterns; as described above, excessive wrapping degrades performance. Weigh the trade-off between the two. PHP has its own characteristics, so do not blindly imitate others or follow Java patterns too closely.

5. Functions should not be nested too deeply; use recursion with caution.

6. Pseudo-functions perform well, so prefer them when they offer the same functionality. For example, use isset instead of array_key_exists.

7. Returning a reference from a function does not make much sense and has no practical effect; it is best not to consider it.

8. Class methods are no less efficient than ordinary functions, so there is no need to worry about a performance loss. Consider static methods more often; both readability and safety are better.

9. Unless there is a special need, pass parameters by value rather than by reference. Of course, if the parameter is a very large array that needs to be modified, passing by reference can be considered.
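
The difference between the two ways of passing a parameter (suggestion 9) can be sketched as follows. Both functions are hypothetical names introduced for illustration; the example shows only the observable semantics, not the reference counting behind them.

```php
<?php
// Hypothetical function: receives the array by value.
function gr_append_by_value(array $items): array
{
    $items[] = 'added';   // modifies a private copy
    return $items;
}

// Hypothetical function: receives the array by reference.
function gr_append_by_reference(array &$items): void
{
    $items[] = 'added';   // modifies the caller's array
}

$original = ['a', 'b'];

$copy           = gr_append_by_value($original);
$afterValueCall = $original;          // unchanged by the call above

gr_append_by_reference($original);    // now the caller's array changes

var_dump($copy, $afterValueCall, $original);
```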

 

 

 

 


Origin blog.csdn.net/Alen_xiaoxin/article/details/104888627