PHP source code - intval function source code analysis

intval in PHP

intval ( mixed $var [, int $base = 10 ] ) : int
  • It converts a variable to an integer. Its second argument, $base, is rarely used; it specifies the base (radix) used for the conversion and defaults to 10 (decimal)
  • The following simple example shows how to use it:
$var1 = '123';
$var2 = '-123';
$var3 = [1, 2, ];
$var4 = [-1, 2, ];
var_dump(
    intval($var1),
    intval($var2),
    intval($var3),
    intval($var4)
);
// Output:
// int(123)
// int(-123)
// int(1)
// int(1)
  • I did not pick this function out of a list of a hundred others. While working through LeetCode problems I happened on a string-to-integer conversion question, and it made me wonder: PHP already has intval, so how is it implemented under the hood?

intval implementation source

  • The intval function is located in php-7.3.3/ext/standard/type.c
  • The function is short, so its source is reproduced here in full:
PHP_FUNCTION(intval)
{
    zval *num;
    zend_long base = 10;

    ZEND_PARSE_PARAMETERS_START(1, 2)
        Z_PARAM_ZVAL(num)
        Z_PARAM_OPTIONAL
        Z_PARAM_LONG(base)
    ZEND_PARSE_PARAMETERS_END();

    if (Z_TYPE_P(num) != IS_STRING || base == 10) {
        RETVAL_LONG(zval_get_long(num));
        return;
    }


    if (base == 0 || base == 2) {
        char *strval = Z_STRVAL_P(num);
        size_t strlen = Z_STRLEN_P(num);

        while (isspace(*strval) && strlen) {
            strval++;
            strlen--;
        }

        /* Length of 3+ covers "0b#" and "-0b" (which results in 0) */
        if (strlen > 2) {
            int offset = 0;
            if (strval[0] == '-' || strval[0] == '+') {
                offset = 1;
            }

            if (strval[offset] == '0' && (strval[offset + 1] == 'b' || strval[offset + 1] == 'B')) {
                char *tmpval;
                strlen -= 2; /* Removing "0b" */
                tmpval = emalloc(strlen + 1);

                /* Place the unary symbol at pos 0 if there was one */
                if (offset) {
                    tmpval[0] = strval[0];
                }

                /* Copy the data from after "0b" to the end of the buffer */
                memcpy(tmpval + offset, strval + offset + 2, strlen - offset);
                tmpval[strlen] = 0;

                RETVAL_LONG(ZEND_STRTOL(tmpval, NULL, 2));
                efree(tmpval);
                return;
            }
        }
    }

    RETVAL_LONG(ZEND_STRTOL(Z_STRVAL_P(num), NULL, base));
}
  • Seen from PHP userland, the intval prototype declares the parameter $var as mixed, which means the argument can be any PHP type: integer, string, array, object, and so on. That is why the source receives the argument directly as a zval: zval *num;
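  • The base 0 / base 2 branch in the source above can be sketched in plain C roughly as follows. This is only an illustration under simplifying assumptions: parse_binary_literal is a hypothetical name, it uses the standard strtol instead of ZEND_STRTOL, and it skips the temporary buffer that the PHP source allocates with emalloc.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of PHP: parse an optionally signed
 * binary literal such as "0b101" or "-0B11". */
long parse_binary_literal(const char *s)
{
    assert(s != NULL);

    /* Skip leading whitespace; the cast avoids undefined behaviour
     * when char is signed. */
    while (isspace((unsigned char)*s)) {
        s++;
    }

    size_t len = strlen(s);
    size_t offset = (s[0] == '-' || s[0] == '+') ? 1 : 0;

    /* Need the sign (if any), "0b", and at least one more character. */
    if (len > offset + 2 && s[offset] == '0' &&
        (s[offset + 1] == 'b' || s[offset + 1] == 'B')) {
        /* Simplification of this sketch: anything other than a binary
         * digit right after the prefix yields 0. */
        if (s[offset + 2] != '0' && s[offset + 2] != '1') {
            return 0;
        }
        long value = strtol(s + offset + 2, NULL, 2);
        return (s[0] == '-') ? -value : value;
    }

    /* No "0b" prefix: let strtol read the string as plain base 2. */
    return strtol(s, NULL, 2);
}
```

  • With this sketch, "0b101" gives 5 and "-0B11" gives -3, while a bare "-0b" falls through to strtol and gives 0, in line with the comment in the PHP source.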

Decimal case

  • Most of the source deals with non-decimal bases. Let's focus on the base-10 case. When a value is converted to an integer in base 10, the source does the following:
if (Z_TYPE_P(num) != IS_STRING || base == 10) {
    RETVAL_LONG(zval_get_long(num));
    return;
}

static zend_always_inline zend_long zval_get_long(zval *op) {
    return EXPECTED(Z_TYPE_P(op) == IS_LONG) ? Z_LVAL_P(op) : zval_get_long_func(op);
}

ZEND_API zend_long ZEND_FASTCALL zval_get_long_func(zval *op) /* {{{ */
{
    return _zval_get_long_func_ex(op, 1);
}
  • Whenever the incoming value is not already an integer, the source eventually calls _zval_get_long_func_ex(op, 1);. This function handles each of the PHP userland types in turn:
switch (Z_TYPE_P(op)) {
    case IS_UNDEF:
    case IS_NULL:
    case IS_FALSE:
        return 0;
    case IS_TRUE:
        return 1;
    case IS_RESOURCE:
        return Z_RES_HANDLE_P(op);
    case IS_LONG:
        return Z_LVAL_P(op);
    case IS_DOUBLE:
        return zend_dval_to_lval(Z_DVAL_P(op));
    case IS_STRING:
        // omitted ...
    case IS_ARRAY:
        return zend_hash_num_elements(Z_ARRVAL_P(op)) ? 1 : 0;
    case IS_OBJECT:
        // omitted ...
    case IS_REFERENCE:
        op = Z_REFVAL_P(op);
        goto try_again;
    EMPTY_SWITCH_DEFAULT_CASE()
}
  • The branches of the switch statement handle each type differently:
    • If the incoming value is null, intval returns 0 directly
    • If it is true, it returns 1
    • If it is an array, it returns 0 for an empty array and 1 for a non-empty array
    • If it is a string, it is processed further
    • ……
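  • As a rough illustration of this kind of type dispatch, here is a minimal tagged-union sketch in C. The value struct, its T_* tags and value_to_long are hypothetical stand-ins invented for this example; they are far simpler than PHP's real zval and cover only a few of the types.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for PHP's zval type tags. */
typedef enum { T_NULL, T_FALSE, T_TRUE, T_LONG, T_DOUBLE, T_ARRAY } value_type;

typedef struct {
    value_type type;
    union {
        long   lval;          /* used when type == T_LONG   */
        double dval;          /* used when type == T_DOUBLE */
        size_t num_elements;  /* used when type == T_ARRAY  */
    } u;
} value;

/* Convert a tagged value to a long by switching on its type tag. */
long value_to_long(const value *v)
{
    assert(v != NULL);
    switch (v->type) {
        case T_NULL:
        case T_FALSE:
            return 0;
        case T_TRUE:
            return 1;
        case T_LONG:
            return v->u.lval;
        case T_DOUBLE:
            /* Plain cast; this sketch assumes the double fits in a long. */
            return (long)v->u.dval;
        case T_ARRAY:
            /* Only emptiness matters, not the contents. */
            return v->u.num_elements ? 1 : 0;
    }
    return 0; /* not reached for valid tags */
}
```

  • The array case makes it clear why [1, 2] and [-1, 2] both convert to 1 in the PHP example earlier: only the element count is inspected.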
  • The original aim of this article was to see how a string is converted to an integer, so let's focus on the string case:
{
    zend_uchar type;
    zend_long lval;
    double dval;
    if (0 == (type = is_numeric_string(Z_STRVAL_P(op), Z_STRLEN_P(op), &lval, &dval, silent ? 1 : -1))) {
        if (!silent) {
            zend_error(E_WARNING, "A non-numeric value encountered");
        }
        return 0;
    } else if (EXPECTED(type == IS_LONG)) {
        return lval;
    } else {
        /* Previously we used strtol here, not is_numeric_string,
         * and strtol gives you LONG_MAX/_MIN on overflow.
         * We use use saturating conversion to emulate strtol()'s
         * behaviour.
         */
         return zend_dval_to_lval_cap(dval);
    }
}
static zend_always_inline zend_uchar is_numeric_string(const char *str, size_t length, zend_long *lval, double *dval, int allow_errors) {
    return is_numeric_string_ex(str, length, lval, dval, allow_errors, NULL);
}

static zend_always_inline zend_uchar is_numeric_string_ex(const char *str, size_t length, zend_long *lval, double *dval, int allow_errors, int *oflow_info)
{
    if (*str > '9') {
        return 0;
    }
    return _is_numeric_string_ex(str, length, lval, dval, allow_errors, oflow_info);
}

ZEND_API zend_uchar ZEND_FASTCALL _is_numeric_string_ex(const char *str, size_t length, zend_long *lval, double *dval, int allow_errors, int *oflow_info) { /* ... */ }
  • In this logic, the algorithm that actually turns a string into an integer is hidden behind the call is_numeric_string(Z_STRVAL_P(op), Z_STRLEN_P(op), &lval, &dval, silent ? 1 : -1), that is, in the function _is_numeric_string_ex
  • The rules for converting a string to an integer are generally as follows:
    • Skip leading whitespace characters, including spaces, line breaks, tabs, etc.
    • Correctly handle a leading + or - sign
    • Skip leading '0' characters; for example, the string '001a' converts to 1 once the leading '0' characters are dropped
    • Convert the run of digit characters that follows and discard everything from the first non-digit character onward. A digit character is any character from '0' to '9'

Handling whitespace

  • The source code that handles it is as follows:
while (*str == ' ' || *str == '\t' || *str == '\n' || *str == '\r' || *str == '\v' || *str == '\f') {
    str++;
    length--;
}
  • \n, \t and \r are the commonly used ones. \v is a vertical tab and \f is a form feed. These whitespace characters are not processed; they are simply skipped, using the pointer increment str++ to move on to the next character

Handling the plus and minus signs

  • The sign is meaningful for a numeric value, so it has to be recorded; a leading + can simply be dropped:
if (*ptr == '-') {
    neg = 1;
    ptr++;
} else if (*ptr == '+') {
    ptr++;
}

Skipping any number of leading '0' characters

  • Leading zeros carry no meaning in a decimal value, so they need to be skipped:
while (*ptr == '0') {
    ptr++;
}
  • Once these three cases have been handled, the remaining characters are converted to an integer one at a time. Because the traversal starts from the most significant digit, the value accumulated so far has to be multiplied by 10 before each new digit is added. For example:
    • For the string 231aa, the first character '2' is stored in a temporary variable tmp, so tmp is 2
    • On the second character '3', compute tmp * 10 + 3; tmp is now 23
    • On the third character '1', compute tmp * 10 + 1; tmp is now 231.
  • The source therefore tests whether each character is a digit with ZEND_IS_DIGIT(*ptr) and then accumulates the value as described above

    • The ZEND_IS_DIGIT macro is implemented as ((c) >= '0' && (c) <= '9'); the characters between '0' and '9' are exactly the digit characters we are looking for.

The decimal point case

  • _is_numeric_string_ex is a low-level function called by many PHP functions, including floatval. What happens when a decimal point is encountered while traversing the string? My own view was that, since the goal is to implement intval, the decimal point could be treated like any other non-numeric character; for example, intval applied to "3.14abc" would directly yield 3. In practice, however, _is_numeric_string_ex does not work that way, because it is a general-purpose function. It gives the decimal point special treatment:
  • When a decimal point is encountered, the C code uses goto to jump to process_double:
process_double:
    type = IS_DOUBLE;

    /* If there's a dval, do the conversion; else continue checking
     * the digits if we need to check for a full match */
    if (dval) {
        local_dval = zend_strtod(str, &ptr);
    } else if (allow_errors != 1 && dp_or_e != -1) {
        dp_or_e = (*ptr++ == '.') ? 1 : 2;
        goto check_digits;
    }
  • At the end, _is_numeric_string_ex returns the floating-point number it obtained:
if (dval) {
    *dval = local_dval;
}

return IS_DOUBLE;
  • The floating-point number is written through the dval pointer, and the type tag IS_DOUBLE is returned.
  • Execution then returns to _zval_get_long_func_ex and continues with return zend_dval_to_lval_cap(dval);. That function is defined as follows:
static zend_always_inline zend_long zend_dval_to_lval_cap(double d)
{
    if (UNEXPECTED(!zend_finite(d)) || UNEXPECTED(zend_isnan(d))) {
        return 0;
    } else if (!ZEND_DOUBLE_FITS_LONG(d)) {
        return (d > 0 ? ZEND_LONG_MAX : ZEND_LONG_MIN);
    }
    return (zend_long)d;
}
  • In other words, when the value is finite and fits in a zend_long, the floating-point to integer conversion is ultimately just a C cast: (zend_long)d.
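  • The same path (parse as a double, then convert with saturation) can be imitated with the C standard library. This is a sketch under assumptions: string_to_long_via_double is a hypothetical name, strtod stands in for zend_strtod, and the range checks are a simplified substitute for ZEND_DOUBLE_FITS_LONG.

```c
#include <assert.h>
#include <limits.h>
#include <math.h>
#include <stdlib.h>

/* Hypothetical helper: parse the leading number of a string as a
 * double, then convert it to long with saturation. */
long string_to_long_via_double(const char *s)
{
    assert(s != NULL);
    double d = strtod(s, NULL);

    /* NaN and infinities map to 0, as in zend_dval_to_lval_cap. */
    if (!isfinite(d)) {
        return 0;
    }

    /* Saturate instead of overflowing. (double)LONG_MAX may round up
     * on 64-bit platforms, hence >= rather than >. */
    if (d >= (double)LONG_MAX) {
        return LONG_MAX;
    }
    if (d <= (double)LONG_MIN) {
        return LONG_MIN;
    }

    /* The cast truncates toward zero: 3.14 becomes 3, -2.9 becomes -2. */
    return (long)d;
}
```

  • So "3.14abc" ends up as 3 after all, only by way of a double rather than by stopping at the decimal point, and an out-of-range input such as "1e100" is capped at LONG_MAX.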

Epilogue

  • PHP's internals wrap many small pieces of logic into separate functions, which greatly improves code reuse. It also adds some cost to maintaining and learning the source: a single type-conversion function ends up going through more than 10 function calls.
  • Next, I will work through some exercises related to the internals of intval. Stay tuned.
  • If you have a better idea, comments and suggestions are welcome.

Origin www.cnblogs.com/ishenghuo/p/11803114.html