PHP source code: an analysis of the intval function
- Source: https://github.com/suhanyujie/learn-computer/
- Author: suhanyujie
- Based on PHP 7.3.3
intval in PHP
- The signature of intval, as given in the official documentation:
intval ( mixed $var [, int $base = 10 ] ) : int
- It converts a variable to an integer value. Its second argument, $base, is rarely used; it specifies the base (radix) used for the conversion and defaults to 10 (decimal).
- The following simple example shows how to use it:
$var1 = '123';
$var2 = '-123';
$var3 = [1, 2, ];
$var4 = [-1, 2, ];
var_dump(
    intval($var1),
    intval($var2),
    intval($var3),
    intval($var4)
);
// Output:
// int(123)
// int(-123)
// int(1)
// int(1)
- I did not pick this function at random. While practicing on LeetCode I came across a problem about converting a string to a number, which made me wonder: PHP has intval, so how is it implemented under the hood?
The intval source code
- The intval function is located in php-7.3.3/ext/standard/type.c.
- The function is short, so its source is reproduced here in full:
PHP_FUNCTION(intval)
{
    zval *num;
    zend_long base = 10;
    ZEND_PARSE_PARAMETERS_START(1, 2)
        Z_PARAM_ZVAL(num)
        Z_PARAM_OPTIONAL
        Z_PARAM_LONG(base)
    ZEND_PARSE_PARAMETERS_END();
    if (Z_TYPE_P(num) != IS_STRING || base == 10) {
        RETVAL_LONG(zval_get_long(num));
        return;
    }
    if (base == 0 || base == 2) {
        char *strval = Z_STRVAL_P(num);
        size_t strlen = Z_STRLEN_P(num);
        while (isspace(*strval) && strlen) {
            strval++;
            strlen--;
        }
        /* Length of 3+ covers "0b#" and "-0b" (which results in 0) */
        if (strlen > 2) {
            int offset = 0;
            if (strval[0] == '-' || strval[0] == '+') {
                offset = 1;
            }
            if (strval[offset] == '0' && (strval[offset + 1] == 'b' || strval[offset + 1] == 'B')) {
                char *tmpval;
                strlen -= 2; /* Removing "0b" */
                tmpval = emalloc(strlen + 1);
                /* Place the unary symbol at pos 0 if there was one */
                if (offset) {
                    tmpval[0] = strval[0];
                }
                /* Copy the data from after "0b" to the end of the buffer */
                memcpy(tmpval + offset, strval + offset + 2, strlen - offset);
                tmpval[strlen] = 0;
                RETVAL_LONG(ZEND_STRTOL(tmpval, NULL, 2));
                efree(tmpval);
                return;
            }
        }
    }
    RETVAL_LONG(ZEND_STRTOL(Z_STRVAL_P(num), NULL, base));
}
- Seen from PHP userland, the parameter $var in the intval prototype has type mixed, which means the argument may be of any PHP type: integer, string, array, object, and so on. This is why the source receives the argument directly as a zval: zval *num;
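Most of the intval body shown above deals with the "0b" prefix of binary strings. As a rough illustration, the sketch below reproduces that branch using only the C standard library. The name sketch_parse_binary is hypothetical, malloc/free and strtol stand in for emalloc/efree and ZEND_STRTOL, and only base 2 is modelled (the real branch also runs when base is 0).

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of PHP: parse an optionally signed binary
 * string such as "0b101" or "-0B11", mirroring the base-2 branch above. */
long sketch_parse_binary(const char *s)
{
    size_t len = strlen(s);

    /* Skip leading whitespace. */
    while (len && isspace((unsigned char)*s)) {
        s++;
        len--;
    }

    if (len > 2) {
        size_t offset = (s[0] == '-' || s[0] == '+') ? 1 : 0;

        if (s[offset] == '0' && (s[offset + 1] == 'b' || s[offset + 1] == 'B')) {
            char *tmp;
            long result;

            len -= 2;                   /* drop the "0b" prefix */
            tmp = malloc(len + 1);
            if (tmp == NULL) {
                return 0;               /* assumption: treat allocation failure as 0 */
            }
            if (offset) {
                tmp[0] = s[0];          /* keep the sign at position 0 */
            }
            /* Copy everything after "0b" behind the optional sign. */
            memcpy(tmp + offset, s + offset + 2, len - offset);
            tmp[len] = '\0';
            result = strtol(tmp, NULL, 2);
            free(tmp);
            return result;
        }
    }
    /* No prefix: hand the string to strtol unchanged. */
    return strtol(s, NULL, 2);
}
```

With this sketch, "0b101" parses to 5 and "-0B11" to -3, while "-0b" (a prefix with no digits) yields 0, as the comment in the original source notes.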
Decimal case
- Most of the source deals with non-decimal bases. Let's look at the base-10 case first. When a value is converted to a base-10 integer, the source does the following:
if (Z_TYPE_P(num) != IS_STRING || base == 10) {
    RETVAL_LONG(zval_get_long(num));
    return;
}
static zend_always_inline zend_long zval_get_long(zval *op) {
    return EXPECTED(Z_TYPE_P(op) == IS_LONG) ? Z_LVAL_P(op) : zval_get_long_func(op);
}
ZEND_API zend_long ZEND_FASTCALL zval_get_long_func(zval *op) /* {{{ */
{
    return _zval_get_long_func_ex(op, 1);
}
- Whenever the incoming value is not already an integer, the source eventually calls _zval_get_long_func_ex(op, 1);. Inside that function, each PHP userland type is handled in its own case:
switch (Z_TYPE_P(op)) {
    case IS_UNDEF:
    case IS_NULL:
    case IS_FALSE:
        return 0;
    case IS_TRUE:
        return 1;
    case IS_RESOURCE:
        return Z_RES_HANDLE_P(op);
    case IS_LONG:
        return Z_LVAL_P(op);
    case IS_DOUBLE:
        return zend_dval_to_lval(Z_DVAL_P(op));
    case IS_STRING:
        // omitted ...
    case IS_ARRAY:
        return zend_hash_num_elements(Z_ARRVAL_P(op)) ? 1 : 0;
    case IS_OBJECT:
        // omitted ...
    case IS_REFERENCE:
        op = Z_REFVAL_P(op);
        goto try_again;
    EMPTY_SWITCH_DEFAULT_CASE()
}
- The branches of the switch statement handle each type differently:
- If the value is null, intval returns 0 directly;
- If it is true, intval returns 1;
- An empty array gives 0; a non-empty array gives 1;
- A string is processed further;
- ...
- Since the original aim of this article is to see how a string is converted to an integer, we focus on the string case:
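The dispatch described in the list above can be sketched with a small tagged union. This is only an illustration under simplifying assumptions: the enum, the struct and sketch_get_long are hypothetical stand-ins for zval and _zval_get_long_func_ex, an array is reduced to its element count, and the double case uses a plain cast instead of zend_dval_to_lval.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for zval and its type tags. */
enum sketch_type { SK_NULL, SK_FALSE, SK_TRUE, SK_LONG, SK_DOUBLE, SK_ARRAY };

struct sketch_value {
    enum sketch_type type;
    union {
        long   lval;        /* used when type == SK_LONG */
        double dval;        /* used when type == SK_DOUBLE */
        size_t num_elems;   /* used when type == SK_ARRAY: element count only */
    } u;
};

/* Convert a tagged value to a long, one case per type. */
long sketch_get_long(const struct sketch_value *v)
{
    switch (v->type) {
    case SK_NULL:
    case SK_FALSE:
        return 0;
    case SK_TRUE:
        return 1;
    case SK_LONG:
        return v->u.lval;
    case SK_DOUBLE:
        return (long)v->u.dval;         /* assumes the value fits in a long */
    case SK_ARRAY:
        return v->u.num_elems ? 1 : 0;  /* empty array: 0, otherwise 1 */
    }
    return 0;
}
```

A value tagged SK_ARRAY with two elements converts to 1, matching the intval([1, 2]) result in the example at the start of the article.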
{
    zend_uchar type;
    zend_long lval;
    double dval;
    if (0 == (type = is_numeric_string(Z_STRVAL_P(op), Z_STRLEN_P(op), &lval, &dval, silent ? 1 : -1))) {
        if (!silent) {
            zend_error(E_WARNING, "A non-numeric value encountered");
        }
        return 0;
    } else if (EXPECTED(type == IS_LONG)) {
        return lval;
    } else {
        /* Previously we used strtol here, not is_numeric_string,
         * and strtol gives you LONG_MAX/_MIN on overflow.
         * We use use saturating conversion to emulate strtol()'s
         * behaviour.
         */
        return zend_dval_to_lval_cap(dval);
    }
}
static zend_always_inline zend_uchar is_numeric_string(const char *str, size_t length, zend_long *lval, double *dval, int allow_errors) {
    return is_numeric_string_ex(str, length, lval, dval, allow_errors, NULL);
}
static zend_always_inline zend_uchar is_numeric_string_ex(const char *str, size_t length, zend_long *lval, double *dval, int allow_errors, int *oflow_info)
{
    if (*str > '9') {
        return 0;
    }
    return _is_numeric_string_ex(str, length, lval, dval, allow_errors, oflow_info);
}
ZEND_API zend_uchar ZEND_FASTCALL _is_numeric_string_ex(const char *str, size_t length, zend_long *lval, double *dval, int allow_errors, int *oflow_info)
{
    // ...
}
- In this logic, the algorithm that actually turns a string into an integer is hidden behind the call is_numeric_string(Z_STRVAL_P(op), Z_STRLEN_P(op), &lval, &dval, silent ? 1 : -1), that is, in the function _is_numeric_string_ex.
- The rules for converting a string to an integer are generally as follows:
- Strip leading whitespace characters, including spaces, line breaks, tabs, etc.
- Handle a leading +/- sign correctly.
- Handle leading '0' characters. For example, the string '001a' converts to the integer 1; the leading '0' characters are dropped.
- Take the run of digit characters at the start of what remains and discard the non-digit characters after it. Digit characters are the characters '0' through '9'.
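The four rules above can be put together in a few lines of C. This is a minimal sketch, not PHP's implementation: sketch_str_to_long is a hypothetical name, and overflow, decimal points and exponents are deliberately ignored.

```c
/* Hypothetical helper, not PHP's code: convert the leading integer in `s`
 * following the four rules listed above. Overflow is not handled. */
long sketch_str_to_long(const char *s)
{
    long value = 0;
    int neg = 0;

    /* Rule 1: skip leading whitespace. */
    while (*s == ' ' || *s == '\t' || *s == '\n' ||
           *s == '\r' || *s == '\v' || *s == '\f') {
        s++;
    }

    /* Rule 2: remember a leading sign. */
    if (*s == '-') {
        neg = 1;
        s++;
    } else if (*s == '+') {
        s++;
    }

    /* Rule 3: skip leading zeros. */
    while (*s == '0') {
        s++;
    }

    /* Rule 4: accumulate digits and stop at the first non-digit. */
    while (*s >= '0' && *s <= '9') {
        value = value * 10 + (*s - '0');
        s++;
    }

    return neg ? -value : value;
}
```

For the inputs mentioned in this article, the sketch turns "001a" into 1 and "231aa" into 231; a string with no leading digits, such as "abc", gives 0.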
Handling whitespace
- The source code that handles it is as follows:
while (*str == ' ' || *str == '\t' || *str == '\n' || *str == '\r' || *str == '\v' || *str == '\f') {
    str++;
    length--;
}
- \n, \t and \r are the commonly used ones. \v is a vertical tab; \f is a form feed. These whitespace characters carry no meaning for the number, so they are simply skipped: the pointer arithmetic str++ moves on to the next character.
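As a side note, the six characters compared in the loop above are the same set that the standard isspace() accepts in the default "C" locale, which is the function the base-2 branch of intval itself uses. The sketch below checks this for 7-bit ASCII; both function names are hypothetical, and the program is assumed to run without a setlocale() call.

```c
#include <ctype.h>

/* Hypothetical predicate mirroring the six comparisons in the loop above. */
int sketch_is_ws(int c)
{
    return c == ' ' || c == '\t' || c == '\n' ||
           c == '\r' || c == '\v' || c == '\f';
}

/* Returns 1 if sketch_is_ws agrees with isspace() for every 7-bit ASCII
 * code, 0 otherwise. */
int sketch_ws_matches_isspace(void)
{
    int c;

    for (c = 0; c < 128; c++) {
        if (sketch_is_ws(c) != (isspace(c) != 0)) {
            return 0;
        }
    }
    return 1;
}
```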
Handling the plus and minus signs
- The sign is significant for a numeric value and therefore has to be kept, although the + sign of a positive number may be omitted:
if (*ptr == '-') {
    neg = 1;
    ptr++;
} else if (*ptr == '+') {
    ptr++;
}
Skipping any number of leading 0 characters
- Leading zeros in front of a decimal value are meaningless, so they are skipped:
while (*ptr == '0') {
    ptr++;
}
- After these three cases have been handled, the remaining characters are converted to an integer one at a time. The string is traversed starting from the most significant digit, so each time a new digit is read, the value accumulated so far must first be multiplied by 10. For example:
- For the string 231aa, the first character read is '2'; it is stored in a temporary variable tmp.
- On the second pass the character is '3', so the previous value is multiplied by 10: tmp * 10 + 3, which makes tmp equal to 23.
- On the third pass the character is '1': tmp * 10 + 1, which makes tmp equal to 231.
- The source therefore checks whether a character is a digit with ZEND_IS_DIGIT(*ptr) and then accumulates the value as described above.
- The ZEND_IS_DIGIT macro is implemented as ((c) >= '0' && (c) <= '9'): the characters between '0' and '9' are exactly the digit characters we are looking for.
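To make the 231aa walk-through above concrete, the sketch below records the running value after each digit. The function sketch_accumulate and its trace buffer are hypothetical and exist only for illustration; overflow is not handled.

```c
#include <stddef.h>

/* Hypothetical helper: accumulate the leading digits of `s`, storing the
 * running value after each digit into `trace` (at most `cap` entries).
 * Returns the number of digits consumed. Overflow is not handled. */
size_t sketch_accumulate(const char *s, long *trace, size_t cap)
{
    long tmp = 0;
    size_t n = 0;

    while (n < cap && *s >= '0' && *s <= '9') {
        /* Shift the digits read so far one place up, then add the new one. */
        tmp = tmp * 10 + (*s - '0');
        trace[n++] = tmp;
        s++;
    }
    return n;
}
```

Called on "231aa", it consumes three digits and leaves 2, 23 and 231 in the trace buffer, the same sequence of tmp values as in the example.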
Handling the decimal point
- _is_numeric_string_ex is called by several of PHP's underlying functions, including floatval. What should happen when a decimal point is encountered while traversing the string? My own view was that, since the goal here is intval, the decimal point could simply be treated as a non-digit character; for example, intval of "3.14abc" would give 3 directly. In practice, however, _is_numeric_string_ex does not work this way, because it is a general-purpose function. A decimal point gets special treatment:
- When a decimal point is encountered, the C code uses goto to jump to the label process_double:
process_double:
    type = IS_DOUBLE;
    /* If there's a dval, do the conversion; else continue checking
     * the digits if we need to check for a full match */
    if (dval) {
        local_dval = zend_strtod(str, &ptr);
    } else if (allow_errors != 1 && dp_or_e != -1) {
        dp_or_e = (*ptr++ == '.') ? 1 : 2;
        goto check_digits;
    }
- At the end, _is_numeric_string_ex returns the floating-point result:
if (dval) {
    *dval = local_dval;
}
return IS_DOUBLE;
- The floating-point number is written through the dval pointer, and the type tag IS_DOUBLE is returned.
- Execution then returns up the call stack to _zval_get_long_func_ex, which continues with return zend_dval_to_lval_cap(dval);. This function is defined as follows:
static zend_always_inline zend_long zend_dval_to_lval_cap(double d)
{
    if (UNEXPECTED(!zend_finite(d)) || UNEXPECTED(zend_isnan(d))) {
        return 0;
    } else if (!ZEND_DOUBLE_FITS_LONG(d)) {
        return (d > 0 ? ZEND_LONG_MAX : ZEND_LONG_MIN);
    }
    return (zend_long)d;
}
- In other words, the float-to-integer result is ultimately produced by a C cast: (zend_long)d.
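The path described in this section (parse a double, then convert it with a cap) can be approximated with the standard library. The sketch below is not PHP's code: both function names are hypothetical, strtod stands in for zend_strtod, long long stands in for zend_long, and the range check is my own reading of what ZEND_DOUBLE_FITS_LONG is meant to guarantee.

```c
#include <limits.h>
#include <math.h>
#include <stdlib.h>

/* Hypothetical stand-in for the capped conversion above, using long long. */
long long sketch_double_to_ll_cap(double d)
{
    if (!isfinite(d)) {
        return 0;                   /* NaN and +/-infinity */
    }
    /* (double)LLONG_MAX rounds up to 2^63, so compare with >= here. */
    if (d >= (double)LLONG_MAX) {
        return LLONG_MAX;
    }
    if (d < (double)LLONG_MIN) {
        return LLONG_MIN;
    }
    return (long long)d;            /* the cast truncates toward zero */
}

/* Parse a leading floating-point number, then convert it with the cap. */
long long sketch_float_string_to_ll(const char *s)
{
    return sketch_double_to_ll_cap(strtod(s, NULL));
}
```

For "3.14abc" the sketch parses 3.14 and the cast truncates it to 3, which is the value the article expects from intval for that string. Values far outside the integer range, such as "1e100", saturate at LLONG_MAX instead of invoking an out-of-range cast.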
Epilogue
- PHP's internals wrap much of their logic in small functions, which greatly improves code reuse. It also adds cost to maintaining and learning the source: a single type-conversion function involves more than 10 function calls.
- Next, I plan to follow up with some hands-on practice related to the internals of intval. Stay tuned.
- If you have a better idea, comments and suggestions are welcome.