Python string interning

Reference: What is string interning, and the Python string intern mechanism
In computer science, string interning is a method of storing only one copy of each distinct string value, which must be immutable. Interning strings makes some string processing tasks more time- or space-efficient at the cost of requiring more time when the string is created or interned. The distinct values are stored in a string intern pool. (Quoted from Wikipedia.)
That is, only one string object is kept for each distinct value, and that object is shared. This is also why strings must be immutable.
Strings in Python are interned automatically through the intern mechanism:

      
      
>>> a = 'kzc'
>>> b = 'k'+'zc'
>>> id(a)
55704656
>>> id(b)
55704656

As you can see, they are the same object. (In Java, strings assigned directly from literals can also be compared with ==, but strings instantiated with new must be compared with equals().)
The benefit of the intern mechanism is that when a string with the same value is needed (an identifier, for example), it is taken directly from the pool. This avoids frequent creation and destruction, improves efficiency, and saves memory. The drawback is that concatenating or modifying strings costs performance: because strings are immutable, a modification is never an in-place operation and always creates a new object. This is why join() is recommended over + when concatenating many strings. join() first computes the total length of all the strings and then copies them one by one, creating only one new object.
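The difference between the two concatenation styles can be sketched as follows. This is only an illustration: the helper names concat_with_plus and concat_with_join are made up for this example, and the actual speed difference depends on the interpreter version and on the input.

```python
def concat_with_plus(parts):
    # Each + builds a brand-new string object, because strings are immutable.
    result = ''
    for part in parts:
        result = result + part
    return result


def concat_with_join(parts):
    # join() sizes the result once and copies every piece into it.
    return ''.join(parts)


parts = ['k', 'z', 'c'] * 1000
# Both approaches produce the same value; only the amount of copying differs.
print(concat_with_plus(parts) == concat_with_join(parts))
```

Both functions return equal strings, so switching from + to join() in existing code changes only the cost, not the result.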
A pitfall to be careful of: not every string is interned. Only strings consisting solely of underscores, digits, and letters are interned automatically.
Since the built-in function intern() can explicitly intern any string, the restriction is not there because interning arbitrary strings would be hard to implement.
The answer can be found in a comment in the source file stringobject.h:
/* ... This is generally restricted to strings that "look like" Python identifiers, although the intern() builtin can be used to force interning of any string ... */
In other words, only strings that look like Python identifiers are interned automatically.
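Explicit interning can be sketched like this. One assumption to note: the example targets Python 3, where the built-in intern() described above lives in the sys module as sys.intern(). The strings contain a space, so they do not look like identifiers.

```python
import sys

# Two equal strings built at run time; they contain a space, so they
# do not "look like" identifiers and are not interned automatically.
a = ' '.join(['hello', 'world'])
b = ' '.join(['hello', 'world'])
print(a == b)   # equal values
print(a is b)   # identity is an implementation detail; usually False in CPython

# sys.intern() returns the single pooled copy for a given value.
ia = sys.intern(a)
ib = sys.intern(b)
print(ia is ib)
```

After interning, both names refer to the one pooled object, so an identity check can stand in for a value comparison.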

Another pitfall is shown below.
Example 1.

      
      
>>> 'kz'+'c' is 'kzc'
True

Example 2.

>>> s1 = 'kz'
>>> s2 = 'kzc'
>>> s1+'c' is 'kzc'
False

Why is the second example False? The string contains only letters, so shouldn't it be interned automatically?
The reason is that in the first example, 'kz' + 'c' is evaluated at compile time and replaced with 'kzc'.
In the second example, s1 + 'c' is concatenated at run time, and the result is not interned automatically.
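One way to observe the compile-time evaluation is to inspect the constants stored in a compiled code object. This is a sketch that assumes CPython, whose compiler folds constant expressions; other implementations may behave differently.

```python
# Assumption: CPython, which folds constant expressions when compiling.
folded = compile("'kz' + 'c'", "<example>", "eval")
runtime = compile("s1 + 'c'", "<example>", "eval")

# The folded result is stored directly among the code object's constants.
print('kzc' in folded.co_consts)

# The run-time version stores only 'c'; 'kzc' is built when the code runs.
print('kzc' in runtime.co_consts)

# Both expressions still evaluate to equal values.
print(eval(folded) == eval(runtime, {'s1': 'kz'}))
```

The first expression never performs a concatenation at run time, which is why its result is the same object as the literal 'kzc'.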

Reference 1: a tricky problem with the is operator
Reference: comparing strings in Python with is, == and cmp()
"is" operator behaves unexpectedly with integers

      
      
>>> a = 256
>>> b = 256
>>> id(a)
9987148
>>> id(b)
9987148
>>> a = 257
>>> b = 257
>>> id(a)
11662816
>>> id(b)
11662828

The reason:

This is a hardcoded limit in the current CPython implementation.
The interpreter preallocates the integers from -5 to 256, so every use of one of these values refers to the same object.

The relevant source code in intobject.c:
#ifndef NSMALLPOSINTS
#define NSMALLPOSINTS 257
#endif
#ifndef NSMALLNEGINTS
#define NSMALLNEGINTS 5
#endif
#if NSMALLNEGINTS + NSMALLPOSINTS > 0
/* References to small integers are saved in this array so that they
   can be shared.
   The integers that are saved are those in the range
   -NSMALLNEGINTS (inclusive) to NSMALLPOSINTS (not inclusive).
*/
static PyIntObject *small_ints[NSMALLNEGINTS + NSMALLPOSINTS];
#endif
#ifdef COUNT_ALLOCS
int quick_int_allocs, quick_neg_int_allocs;
#endif
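The effect of this array can be probed from Python. This is a sketch that assumes CPython. The helper same_object is made up for this example; it builds each integer at run time from a string so that the compiler cannot merge equal literals.

```python
def same_object(n):
    # Build the value twice at run time so no literal sharing is involved.
    a = int(str(n))
    b = int(str(n))
    return a is b


# Values inside the preallocated range come from the shared array.
print(same_object(-5), same_object(256))

# Values just outside the range are fresh objects on each construction
# (CPython behaviour; an implementation detail, not a language guarantee).
print(same_object(-6), same_object(257))
```

Because this is an implementation detail, integer values should be compared with == rather than is.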

Origin www.cnblogs.com/sanxiandoupi/p/11699246.html