How parseInt and parseFloat execute in JS, from the perspective of the ES specification and engine implementations
In fact, parseInt() and parseFloat(), two commonly used APIs, still have many "pitfalls". Let's sort them out in this article. (This article is best suited to JS developers who often deal with numbers, or readers who are interested in how these two APIs work.)
(github: https://github.com/MichealWayne , personal blog address: https://blog.michealwayne.cn/ ), for reprinting, please contact [email protected]
Test yourself
In an earlier article on JS values (Q3. Tell me the results of the following numeric value conversions), I mentioned how parseInt and parseFloat work. In fact, these two APIs still have quite a few pitfalls, although they are rarely encountered in practice.
First, guess the result of each call below, then check it yourself:
/* parseInt */
parseInt('123.456.789');
parseInt('+123.456.789');
parseInt('123abc');
parseInt('abc123');
parseInt('1e6');
parseInt(' 1 ');
parseInt('');
parseInt('0');
parseInt('0x');
parseInt('0x11');
parseInt(new String('123'));
parseInt('a', 16);
parseInt(123.456, -1);
parseInt(123.456, 0);
parseInt(123.456, 1);
parseInt(123.456, 2);
parseInt(123.456, 40);
parseInt(123.456, 36);
parseInt(1e6);
parseInt(1n);
parseInt();
parseInt(null);
parseInt(false);
// A few "beyond the syllabus" questions
parseInt(0.00000001);
parseInt(123.456, -99999999999999999999999999);
parseInt(123.456, 99999999999999999999999999);
parseInt(123.456, 9999999999999999999999999);
parseInt(9999999999999999);
parseInt('11111111111111111');
parseInt('11111111111111111111');
parseInt(1e21);
parseInt('123aef', 12);
parseInt('123', NaN);
parseInt('0xf', NaN);
parseInt('123', Infinity);
parseInt('111', 2 ** 32 + 2.1);
parseInt(Symbol());
parseInt(parseInt);
const objTest1 = {
};
parseInt(objTest1);
objTest1.toString = () => 123;
parseInt(objTest1);
/* parseFloat */
parseFloat('123.456.789');
parseFloat('123abc.456.789');
parseFloat(' 123abc ');
parseFloat('1e6');
parseFloat('+Infinity');
parseFloat(Infinity);
parseFloat(123.456);
parseFloat(1e6);
parseFloat(0.00000001);
parseFloat(0.1 + 0.2);
parseFloat('0x1a');
parseFloat(1n);
parseFloat();
parseFloat(null);
parseFloat(false);
// A few "beyond the syllabus" questions
parseFloat(9999999999999999);
parseFloat('11111111111111111');
parseFloat('11111111111111111111');
parseFloat(Symbol());
parseFloat(parseFloat);
const objTest2 = {
};
parseFloat(objTest2);
objTest2.toString = () => 123;
parseFloat(objTest2);
ECMAScript Specification
Whatever the engine, browser or Node.js, it follows the ECMA specification in all the main points. So, to reason about the results of the calls listed above, focus on understanding the execution steps described in the ECMA specification.
parseInt()
Syntax:
parseInt(string, radix)
Parameters:
- string: The value to parse. If the argument is not a string, it is converted to one. For the standard input format, see "parseInt input string format" in the appendix.
- radix: Optional. An integer between 2 and 36 that tells parseInt() which base (for example, 2) the string (for example, 11) is written in. If radix is omitted, the string is parsed as decimal, unless it starts with 0x or 0X, in which case it is parsed as hexadecimal.
Also:
parseInt === Number.parseInt; // => true
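To make the radix parameter concrete, here is a small sketch that reads similar digit strings in several bases. The variable names radixSamples and radixResults are only illustrative.

```javascript
// Each entry is the argument list passed to parseInt().
const radixSamples = [
  ['11'],      // radix omitted: decimal
  ['11', 2],   // binary
  ['11', 16],  // hexadecimal
  ['0x11'],    // radix omitted, but the 0x prefix selects hexadecimal
  ['z', 36],   // largest supported radix
  ['11', 1],   // radix below 2
  ['11', 37],  // radix above 36
];

const radixResults = radixSamples.map((args) => parseInt(...args));
console.log(radixResults); // [ 11, 3, 17, 17, 35, NaN, NaN ]
```

A radix outside 2 to 36 (other than 0) does not throw; it simply yields NaN.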
ECMAScript (6.0) Specification
A brief paraphrase of the execution steps:
- 1. Define a variable inputString as the result of ToString(string). (ToString is an internal abstract operation that is not exposed to user code; see the specification or the appendix at the end of this article for details.)
- 2. If an exception occurs during that step, return it (ReturnIfAbrupt). ReturnIfAbrupt is actually fairly involved and relies on the specification's own terminology, so it is not covered in this article.
- 3. Define a variable S, a substring of inputString consisting of the first code unit that is not white space and all code units after it. In other words, leading white space is removed, so parseInt('123') and parseInt(' 123') have the same effect. If no such code unit is found, S is the empty string ("").
- 4. Define a variable sign as 1.
- 5. If S is not empty and its first code unit is 0x002D (HYPHEN-MINUS, the minus sign), set sign to -1.
- 6. If S is not empty and its first code unit is 0x002B (PLUS SIGN) or 0x002D (HYPHEN-MINUS), remove the first code unit from S; that is, strip the '+' or '-' sign.
- 7. Define a variable R as ToInt32(radix). The radix argument goes through the ToInt32 numeric conversion, so parseInt('123', 8) and parseInt('123', '8') have the same effect.
- 8. If an exception occurs during that step, return it (ReturnIfAbrupt).
- 9. Define a variable stripPrefix as true.
- 10. If R is not equal to 0, then:
  - If R is less than 2 or greater than 36, return NaN immediately.
  - If R is not equal to 16, set stripPrefix to false.
- 11. If R is equal to 0, set R to 10.
- 12. If stripPrefix is true, then:
  - If the length of S is at least 2 and its first two code units are 0x or 0X, remove those two code units and set R to 16.
- 13. Define a variable Z. If S contains a code unit that is not a radix-R digit, Z is the substring of S consisting of all code units before the first such code unit, so parseInt('789', 8) has the same effect as parseInt('7', 8); otherwise, Z is S.
- 14. If Z is empty, return NaN immediately.
- 15. Set mathInt to the mathematical integer value represented by Z in radix-R notation, using the letters A-Z and a-z for digits with values 10 through 35. (If R is 10 and Z contains more than 20 significant digits, every significant digit after the 20th may be replaced by 0, at the implementation's option; if R is not 2, 4, 8, 10, 16, or 32, mathInt may be an implementation-dependent approximation of the mathematical integer value represented by Z in radix-R notation.)
- 16. If mathInt is equal to 0, then:
  - If sign is equal to -1, return -0.
  - Otherwise, return +0.
- 17. Set the variable number to the Number value for mathInt.
- 18. Return sign * number.
It looks a bit complicated, so I drew a flow chart:
[Figure: parseInt() flow chart]
Note that parseInt() may interpret only the leading part of the string as an integer value; it ignores any code units that cannot be interpreted as part of the notation of an integer, and gives no indication that any such code units were ignored.
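The specification steps can also be read as code. Below is a minimal sketch, not how any engine actually implements it: specParseInt and digitValue are hypothetical names, the template literal stands in for ToString, `radix | 0` stands in for ToInt32, and BigInt is used to hold mathInt exactly (a real engine is allowed to approximate, as step 15 says).

```javascript
// Hypothetical helper: digit value of one character in radix R,
// or -1 when the character is not a radix-R digit.
function digitValue(ch, R) {
  const code = ch.charCodeAt(0);
  let value = -1;
  if (code >= 48 && code <= 57) value = code - 48;        // '0'..'9'
  else if (code >= 97 && code <= 122) value = code - 87;  // 'a'..'z'
  else if (code >= 65 && code <= 90) value = code - 55;   // 'A'..'Z'
  return value < R ? value : -1;
}

// Hypothetical re-implementation of the steps above, for study only.
function specParseInt(string, radix) {
  let S = `${string}`.trimStart();                   // steps 1-3
  let sign = 1;                                      // step 4
  if (S[0] === '-') sign = -1;                       // step 5
  if (S[0] === '+' || S[0] === '-') S = S.slice(1);  // step 6
  let R = radix | 0;                                 // step 7: ToInt32(radix)
  let stripPrefix = true;                            // step 9
  if (R !== 0) {                                     // step 10
    if (R < 2 || R > 36) return NaN;
    if (R !== 16) stripPrefix = false;
  } else {
    R = 10;                                          // step 11
  }
  if (stripPrefix && /^0x/i.test(S)) {               // step 12
    S = S.slice(2);
    R = 16;
  }
  let mathInt = 0n;                                  // steps 13 and 15 in one pass
  let sawDigit = false;
  for (const ch of S) {
    const d = digitValue(ch, R);
    if (d === -1) break;                             // first non-digit ends Z
    sawDigit = true;
    mathInt = mathInt * BigInt(R) + BigInt(d);
  }
  if (!sawDigit) return NaN;                         // step 14
  if (mathInt === 0n) return sign === -1 ? -0 : 0;   // step 16
  return sign * Number(mathInt);                     // steps 17-18
}

console.log(specParseInt('0x11'), specParseInt('789', 8), specParseInt('-0'));
```

Comparing specParseInt with the built-in parseInt using Object.is on the quiz inputs is a quick way to check your understanding of each step.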
parseFloat()
Syntax:
parseFloat(string)
Parameters:
- string: The value to parse. If the argument is not a string, it is converted to one. For the standard input format, see "parseFloat input string format" in the appendix.
Also:
parseFloat === Number.parseFloat; // => true
ECMAScript (6.0) Specification
The steps, paraphrased from the official description:
- 1. Define a variable inputString as the result of ToString(string).
- 2. If an exception occurs during that step, return it (ReturnIfAbrupt).
- 3. Define a variable trimmedString, a substring of inputString consisting of the first code unit that is not white space and all code units after it. In other words, leading white space is removed, so parseFloat('123.456') and parseFloat(' 123.456') have the same effect. If no such code unit is found, trimmedString is the empty string ("").
- 4. If neither trimmedString nor any prefix of trimmedString satisfies the syntax of a StrDecimalLiteral, return NaN. The StrDecimalLiteral grammar in the specification is not very intuitive, so I drew a railroad diagram (see "parseFloat input string format" in the appendix).
- 5. Define a variable numberString as the longest prefix of trimmedString (possibly trimmedString itself) that satisfies the syntax of a StrDecimalLiteral.
- 6. Define a variable mathFloat as the MV (mathematical value) of numberString: the MV is derived from the text and then rounded (the 20-digit threshold applies here too). This step is very different from parseInt(). As for the MV itself, it is basically what we learned in college; see the documentation.
- 7. If mathFloat is equal to 0, then:
  - If the first character of trimmedString is "-", return -0.
  - Otherwise, return +0.
- 8. Return the Number value for mathFloat.
I also drew a flow chart:
[Figure: parseFloat() flow chart]
Note that parseFloat() may interpret only the leading part of the string as a Number value; it ignores any code units that cannot be interpreted as part of the notation of a decimal literal, and gives no indication that any such code units were ignored.
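As with parseInt, the parseFloat steps can be sketched in a few lines. This is only an approximation under stated assumptions: specParseFloat and STR_DECIMAL_LITERAL are hypothetical names, the regular expression is my own rendering of the StrDecimalLiteral grammar, and the MV computation and rounding are delegated to Number() rather than spelled out.

```javascript
// Approximates StrDecimalLiteral: optional sign, then either "Infinity"
// or decimal digits with an optional fraction and an optional exponent.
const STR_DECIMAL_LITERAL =
  /^[+-]?(?:Infinity|(?:\d+\.?\d*|\.\d+)(?:[eE][+-]?\d+)?)/;

// Hypothetical re-implementation of the steps above, for study only.
function specParseFloat(string) {
  const trimmedString = `${string}`.trimStart();          // steps 1-3
  const match = STR_DECIMAL_LITERAL.exec(trimmedString);
  if (match === null) return NaN;                          // step 4
  const numberString = match[0];                           // step 5: longest valid prefix
  return Number(numberString);                             // steps 6-8, delegated to Number()
}

console.log(specParseFloat('123.456.789')); // 123.456
console.log(specParseFloat('0x1a'));        // 0
console.log(specParseFloat('+Infinity'));   // Infinity
```

Note that the regular expression has no hexadecimal branch, which is why '0x1a' stops after the leading 0.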
TypeScript
The declaration file (lib.es5.d.ts) is simple:
/**
* Converts a string to an integer.
* @param string A string to convert into a number.
* @param radix A value between 2 and 36 that specifies the base of the number in `string`.
* If this argument is not supplied, strings with a prefix of '0x' are considered hexadecimal.
* All other strings are considered decimal.
*/
declare function parseInt(string: string, radix?: number): number;
/**
* Converts a string to a floating-point number.
* @param string A string that contains a floating-point number.
*/
declare function parseFloat(string: string): number;
Note that NaN is also a "number".
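Because both declarations return number, a failed parse cannot be detected from the type alone. A short sketch of what that means in practice (the variable name parsed is only illustrative):

```javascript
const parsed = parseInt('abc123');

console.log(typeof parsed);         // "number"
console.log(parsed === parsed);     // false: NaN is never equal to itself
console.log(Number.isNaN(parsed));  // true
```

So a check such as Number.isNaN() is needed before using the result.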
Engine implementation
Taking WebKit as a typical example, you can see the code implementation of parseInt() and parseFloat() (in JavaScriptCore) and the corresponding unit tests (in the V8 repository, version: tags/9.9.56).
parseInt
Source code
(Main file: /Source/JavaScriptCore/runtime/ParseInt.h)
Main code:
// Entry point: method definition
ALWAYS_INLINE static double parseInt(StringView s, int radix)
{
    if (s.is8Bit())
        return parseInt(s, s.characters8(), radix);
    return parseInt(s, s.characters16(), radix);
}
// ES5.1 15.1.2.2
template <typename CharType>
ALWAYS_INLINE
static double parseInt(StringView s, const CharType* data, int radix)
{
    // 1. Let inputString be ToString(string).
    // 2. Let S be a newly created substring of inputString consisting of the first character that is not a
    //    StrWhiteSpaceChar and all characters following that character. (In other words, remove leading white
    //    space.) If inputString does not contain any such characters, let S be the empty string.
    int length = s.length();
    int p = 0;
    while (p < length && isStrWhiteSpace(data[p]))
        ++p;
    // 3. Let sign be 1.
    // 4. If S is not empty and the first character of S is a minus sign -, let sign be -1.
    // 5. If S is not empty and the first character of S is a plus sign + or a minus sign -, then remove the first character from S.
    double sign = 1;
    if (p < length) {
        if (data[p] == '+')
            ++p;
        else if (data[p] == '-') {
            sign = -1;
            ++p;
        }
    }
    // 6. Let R = ToInt32(radix).
    // 7. Let stripPrefix be true.
    // 8. If R != 0, then
    //    b. If R != 16, let stripPrefix be false.
    // 9. Else, R == 0
    //    a. Let R = 10.
    // 10. If stripPrefix is true, then
    //    a. If the length of S is at least 2 and the first two characters of S are either "0x" or "0X",
    //       then remove the first two characters from S and let R = 16.
    // 11. If S contains any character that is not a radix-R digit, then let Z be the substring of S
    //     consisting of all characters before the first such character; otherwise, let Z be S.
    if ((radix == 0 || radix == 16) && length - p >= 2 && data[p] == '0' && (data[p + 1] == 'x' || data[p + 1] == 'X')) {
        radix = 16;
        p += 2;
    } else if (radix == 0)
        radix = 10;
    // 8.a If R < 2 or R > 36, then return NaN.
    if (radix < 2 || radix > 36)
        return PNaN;
    // 13. Let mathInt be the mathematical integer value that is represented by Z in radix-R notation, using the letters
    //     A-Z and a-z for digits with values 10 through 35. (However, if R is 10 and Z contains more than 20 significant
    //     digits, every significant digit after the 20th may be replaced by a 0 digit, at the option of the implementation;
    //     and if R is not 2, 4, 8, 10, 16, or 32, then mathInt may be an implementation-dependent approximation to the
    //     mathematical integer value that is represented by Z in radix-R notation.)
    // 14. Let number be the Number value for mathInt.
    int firstDigitPosition = p;
    bool sawDigit = false;
    double number = 0;
    while (p < length) {
        int digit = parseDigit(data[p], radix);
        if (digit == -1)
            break;
        sawDigit = true;
        number *= radix;
        number += digit;
        ++p;
    }
    // 12. If Z is empty, return NaN.
    if (!sawDigit)
        return PNaN;
    // Alternate code path for certain large numbers.
    if (number >= mantissaOverflowLowerBound) {
        if (radix == 10) {
            size_t parsedLength;
            number = parseDouble(s.substring(firstDigitPosition, p - firstDigitPosition), parsedLength);
        } else if (radix == 2 || radix == 4 || radix == 8 || radix == 16 || radix == 32)
            number = parseIntOverflow(s.substring(firstDigitPosition, p - firstDigitPosition), radix);
    }
    // 15. Return sign x number.
    return sign * number;
}
There is nothing fancy in the code; from the logic to the comments, it follows the specification closely.
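One branch worth trying from the JS side is the prefix handling: the engine code strips 0x or 0X only when the radix is 0 or 16, and only after the sign has been consumed. A small sketch (prefixSamples and prefixResults are illustrative names):

```javascript
// Each entry is the argument list passed to parseInt().
const prefixSamples = [
  ['0x11'],      // radix omitted (treated as 0): prefix stripped, parsed as hex
  ['0x11', 16],  // radix 16: prefix stripped
  ['0X11', 16],  // upper-case X works too
  ['0x11', 10],  // prefix not stripped: parsing stops at 'x', leaving "0"
  ['-0x11'],     // the sign is handled before the prefix
  ['0x'],        // nothing is left after stripping the prefix
];

const prefixResults = prefixSamples.map((args) => parseInt(...args));
console.log(prefixResults); // [ 17, 17, 17, 0, -17, NaN ]
```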
parseInt unit test
(File: chromium / v8 / v8 / 9.9.56 / . / test / webkit / parseInt-expected.txt)
PASS parseInt('123') is 123
PASS parseInt('123x4') is 123
PASS parseInt('-123') is -123
PASS parseInt('0x123') is 0x123
PASS parseInt('0x123x4') is 0x123
PASS parseInt('-0x123x4') is -0x123
PASS parseInt('-') is Number.NaN
PASS parseInt('0x') is Number.NaN
PASS parseInt('-0x') is Number.NaN
PASS parseInt('123', undefined) is 123
PASS parseInt('123', null) is 123
PASS parseInt('123', 0) is 123
PASS parseInt('123', 10) is 123
PASS parseInt('123', 16) is 0x123
PASS parseInt('0x123', undefined) is 0x123
PASS parseInt('0x123', null) is 0x123
PASS parseInt('0x123', 0) is 0x123
PASS parseInt('0x123', 10) is 0
PASS parseInt('0x123', 16) is 0x123
PASS parseInt(Math.pow(10, 20)) is 100000000000000000000
PASS parseInt(Math.pow(10, 21)) is 1
PASS parseInt(Math.pow(10, -6)) is 0
PASS parseInt(Math.pow(10, -7)) is 1
PASS parseInt(-Math.pow(10, 20)) is -100000000000000000000
PASS parseInt(-Math.pow(10, 21)) is -1
PASS parseInt(-Math.pow(10, -6)) is -0
PASS parseInt(-Math.pow(10, -7)) is -1
PASS parseInt('0') is 0
PASS parseInt('-0') is -0
PASS parseInt(0) is 0
PASS parseInt(-0) is 0
PASS parseInt(2147483647) is 2147483647
PASS parseInt(2147483648) is 2147483648
PASS parseInt('2147483647') is 2147483647
PASS parseInt('2147483648') is 2147483648
PASS state = null; try { parseInt('123', throwingRadix); } catch (e) {} state; is "throwingRadix"
PASS state = null; try { parseInt(throwingString, throwingRadix); } catch (e) {} state; is "throwingString"
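Expectations such as parseInt(Math.pow(10, 21)) being 1 look odd until you remember step 1: a number argument is first converted with ToString, and large or tiny numbers are rendered in exponent notation. The sketch below prints the intermediate string next to the parsed result; String() is used here as a stand-in for the internal ToString, which is equivalent for numbers.

```javascript
const numericInputs = [1e20, 1e21, 1e-6, 1e-7];

const rows = numericInputs.map((value) => ({
  asString: String(value),   // what parseInt actually sees
  parsed: parseInt(value),
}));

for (const row of rows) {
  console.log(JSON.stringify(row.asString), '->', row.parsed);
}
// "100000000000000000000" -> 100000000000000000000
// "1e+21" -> 1
// "0.000001" -> 0
// "1e-7" -> 1
```

In the exponent-notation cases parsing stops at the "e", so only the leading digit survives.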
parseFloat
Source code
(Main file: /Source/JavaScriptCore/runtime/JSGlobalObjectFunctions.cpp)
static double parseFloat(StringView s)
{
    unsigned size = s.length();
    if (size == 1) {
        UChar c = s[0];
        if (isASCIIDigit(c))
            return c - '0';
        return PNaN;
    }
    if (s.is8Bit()) {
        const LChar* data = s.characters8();
        const LChar* end = data + size;
        // Skip leading white space.
        for (; data < end; ++data) {
            if (!isStrWhiteSpace(*data))
                break;
        }
        // Empty string.
        if (data == end)
            return PNaN;
        return jsStrDecimalLiteral(data, end);
    }
    const UChar* data = s.characters16();
    const UChar* end = data + size;
    // Skip leading white space.
    for (; data < end; ++data) {
        if (!isStrWhiteSpace(*data))
            break;
    }
    // Empty string.
    if (data == end)
        return PNaN;
    return jsStrDecimalLiteral(data, end);
}
// See ecma-262 6th 11.8.3
template <typename CharType>
static double jsStrDecimalLiteral(const CharType*& data, const CharType* end)
{
    RELEASE_ASSERT(data < end);
    size_t parsedLength;
    double number = parseDouble(data, end - data, parsedLength);
    if (parsedLength) {
        data += parsedLength;
        return number;
    }
    // Check for [+-]?Infinity
    switch (*data) {
    case 'I':
        if (isInfinity(data, end)) {
            data += SizeOfInfinity;
            return std::numeric_limits<double>::infinity();
        }
        break;
    case '+':
        if (isInfinity(data + 1, end)) {
            data += SizeOfInfinity + 1;
            return std::numeric_limits<double>::infinity();
        }
        break;
    case '-':
        if (isInfinity(data + 1, end)) {
            data += SizeOfInfinity + 1;
            return -std::numeric_limits<double>::infinity();
        }
        break;
    }
    // Not a number.
    return PNaN;
}
By comparison, the comments in parseInt are more complete.
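Two details of the parseFloat source are easy to observe from JS: the single-character fast path, and the explicit check for an optionally signed, case-sensitive "Infinity". A small sketch (floatSamples and floatResults are illustrative names):

```javascript
const floatSamples = [
  '7',            // single ASCII digit: fast path
  'x',            // single non-digit: NaN
  '.',            // a lone dot is not a decimal literal
  'Infinity',
  '+Infinity',
  '-Infinity',
  'Infinityabc',  // trailing characters are ignored
  'infinity',     // the check is case-sensitive
  'Inf',          // no abbreviations
];

const floatResults = floatSamples.map((text) => parseFloat(text));
console.log(floatResults);
// [ 7, NaN, NaN, Infinity, Infinity, -Infinity, Infinity, NaN, NaN ]
```

parseInt has no such branch, so parseInt('Infinity') is NaN.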
parseFloat unit test
(File: chromium / v8 / v8 / 9.9.56 / . / test / webkit / parseFloat-expected.txt)
PASS parseFloat() is NaN
PASS parseFloat('') is NaN
PASS parseFloat(' ') is NaN
PASS parseFloat(' 0') is 0
PASS parseFloat('0 ') is 0
PASS parseFloat('x0') is NaN
PASS parseFloat('0x') is 0
PASS parseFloat(' 1') is 1
PASS parseFloat('1 ') is 1
PASS parseFloat('x1') is NaN
PASS parseFloat('1x') is 1
PASS parseFloat(' 2.3') is 2.3
PASS parseFloat('2.3 ') is 2.3
PASS parseFloat('x2.3') is NaN
PASS parseFloat('2.3x') is 2.3
PASS parseFloat('0x2') is 0
PASS parseFloat('1' + nonASCIINonSpaceCharacter) is 1
PASS parseFloat(nonASCIINonSpaceCharacter + '1') is NaN
PASS parseFloat('1' + illegalUTF16Sequence) is 1
PASS parseFloat(illegalUTF16Sequence + '1') is NaN
PASS parseFloat(tab + '1') is 1
PASS parseFloat(nbsp + '1') is 1
PASS parseFloat(ff + '1') is 1
PASS parseFloat(vt + '1') is 1
PASS parseFloat(cr + '1') is 1
PASS parseFloat(lf + '1') is 1
PASS parseFloat(ls + '1') is 1
PASS parseFloat(ps + '1') is 1
PASS parseFloat(oghamSpaceMark + '1') is 1
PASS parseFloat(mongolianVowelSeparator + '1') is NaN
PASS parseFloat(enQuad + '1') is 1
PASS parseFloat(emQuad + '1') is 1
PASS parseFloat(enSpace + '1') is 1
PASS parseFloat(emSpace + '1') is 1
PASS parseFloat(threePerEmSpace + '1') is 1
PASS parseFloat(fourPerEmSpace + '1') is 1
PASS parseFloat(sixPerEmSpace + '1') is 1
PASS parseFloat(figureSpace + '1') is 1
PASS parseFloat(punctuationSpace + '1') is 1
PASS parseFloat(thinSpace + '1') is 1
PASS parseFloat(hairSpace + '1') is 1
PASS parseFloat(narrowNoBreakSpace + '1') is 1
PASS parseFloat(mediumMathematicalSpace + '1') is 1
PASS parseFloat(ideographicSpace + '1') is 1
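The white space cases in this test file can be reproduced with explicit escapes. The sketch below picks a handful of them; the property names mirror the test's variable names, but the code points are filled in by me (for example, U+180E for the Mongolian vowel separator), and the results assume a reasonably recent engine.

```javascript
const whiteSpaceSamples = {
  tab: '\t',
  nbsp: '\u00A0',
  ls: '\u2028',
  oghamSpaceMark: '\u1680',
  ideographicSpace: '\u3000',
  mongolianVowelSeparator: '\u180E', // no longer classified as a space separator
};

const whiteSpaceResults = {};
for (const [name, ch] of Object.entries(whiteSpaceSamples)) {
  whiteSpaceResults[name] = parseFloat(ch + '1');
}
console.log(whiteSpaceResults);
```

Every entry parses to 1 except mongolianVowelSeparator, which gives NaN because the character is not skipped as leading white space.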
Finally
From the ECMA specification and the code of a typical engine, we can see that parseInt and parseFloat handle many boundary cases, and this is the main source of the pitfalls.
At this point, you can go back to the questions at the beginning; most of them can now be explained. As for the "beyond the syllabus" questions, if you are interested, take a look at the number and type conversion parts of the ECMA specification.
Appendix
ToString
MV (mathematical value)
parseFloat input string format
Diagram(
ZeroOrMore('Space'),
Optional(
Choice(0,
'+',
'-',
), 'skip'
),
Choice(1,
'Infinity',
Sequence(
Choice(0,
Sequence(
ZeroOrMore('0-9'),
Optional('.', 'skip'),
OneOrMore('0-9'),
),
),
Optional(
Sequence(
Choice(0,
'e',
'E',
),
Optional(
Choice(0,
'+',
'-',
), 'skip'
),
OneOrMore('0-9'),
)
, 'skip')
)
),
ZeroOrMore('Space'),
)
parseInt input string format
Diagram(
ZeroOrMore('Space'),
Optional(
Choice(0,
'+',
'-',
), 'skip'
),
ZeroOrMore('0-R'), // R is the largest digit allowed by the radix
ZeroOrMore('Space'),
)
Related Links
- https://262.ecma-international.org/6.0/
- https://262.ecma-international.org/6.0/#sec-tostring-applied-to-the-number-type
- https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/parseInt
- https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/parseFloat
- https://webkit.org/
- https://github.com/WebKit/WebKit
- https://github.com/MichealWayne/study-js-from-questions/blob/master/1.1%20MemoryHeap.md