The memory mechanism of Python variables

Python is easy to use and comes with a massive set of libraries, so it puts a great deal of programming power in the palm of your hand. Programming is, after all, the craft of turning human thinking into something a computer can run. If you are not chasing the ultimate in runtime efficiency and are not tightly limited by memory, Python is arguably the most convenient language there is.

As qualified programmers, we should know not only that something works but also why. Besides using Python to get things done, we should explore how it operates internally, and the first thing to look at is something every Python program uses: variables and the mechanism behind them.

Note: The following examples were written on Linux, using Python 2.7.

Reference mechanism

Python's variable-memory model is closest to the reference mechanism in C++. A Python variable does not hold the data itself; it is a reference to an object in memory, and you access the data through the variable. For example:

>>> a=10
>>> b=a
>>> c=[1,2,3,4]
>>> d=c
>>> print "%x.%x" %(id(a),id(b))
>>> print "%x.%x" %(id(c),id(d))

Output:

b51080.b51080
7f28bf69b758.7f28bf69b758  

Here id() is a built-in Python function that returns the identity of an object; in CPython this is the address where the object starts in memory.

From the results, a and b have the same address, and so do c and d. In other words, when we use the variables a and b we are using the same object, and a and b are both references to it. We can check how many references an object has with the system function sys.getrefcount():

>>>import sys
>>>a=257
>>>print sys.getrefcount(a)
>>>b=a
>>>print sys.getrefcount(a)

Output:

2
3

Obviously this is not what we expected. Since a and b point to the same address, the results should be 1 and 2, so why are they 2 and 3?

This is because when sys.getrefcount() is called, a is passed in as an argument and is therefore referenced one more time, which gives 2 and 3.
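As a minimal sketch of the two points above (shared objects and reference counts), the following assumes CPython and uses print() with a single argument so that it also runs on Python 3. The names c, d and e are only illustrative.

```python
import sys

c = [1, 2, 3, 4]
d = c                      # d is a second name for the same list object
print(c is d)              # identity check, same as id(c) == id(d)

d.append(5)                # change the list through d ...
print(c)                   # ... and the change is visible through c

before = sys.getrefcount(c)
e = c                      # bind one more name to the same object
after = sys.getrefcount(c)
print(after - before)      # the count grows by exactly one
```

Comparing the count before and after, rather than reading one absolute value, sidesteps the extra reference that the call itself adds.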

Small data caching mechanism

Above we described what happens in memory when a Python variable is assigned. Is that the whole story?

Not at all!

Let us look at an example:

>>> a=10
>>> b=10
>>> print "%x.%x" %(id(a),id(b))

Output:

b51080.b51080

Seeing the result, I slowly took off my glasses, wiped them carefully three times with 95% medical alcohol, put them back on and looked again. No mistake! The two variables still refer to the same address. This time the two variables were initialized independently rather than assigned one from the other, so why do they refer to the same address?

The answer is:

Python has a mechanism that caches objects so that they can be reused. When we create several references equal to 1, we are actually making all of those references point to the same object, which saves resources.

So that's it!

But think carefully: that cannot be the whole picture, can it? If every piece of data were cached, wouldn't that waste an enormous amount of memory? Or does the memory reclamation mechanism clear out garbage memory every so often?
Let us look at an example:

>>> a=100
>>> b=100
>>> print "%d.%d" %(id(a),id(b))
>>> a=256
>>> b=256
>>> print "%d.%d" %(id(a),id(b))
>>> a=257
>>> b=257
>>> print "%d.%d" %(id(a),id(b))

Output:

5223836.5223836
5225932.5225932
5241840.5241864  

From the results, when the value is 256 or less it is cached and reused by the system, and when it is greater than 256 it is not. (Of course these are only three experiments; the author also tried many other values, which are not listed here.)
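The boundary can be probed with a small sketch. It assumes CPython, where this cache is an implementation detail, and the helper name same_object is hypothetical. The integers are built from text at run time because literals inside one compiled block may be merged by the compiler, which would hide the effect.

```python
def same_object(text):
    """Build the same integer twice at run time and compare identities."""
    x = int(text)
    y = int(text)
    return x is y          # True only if both names share one object


for text in ("-6", "-5", "256", "257"):
    print("%s %s" % (text, same_object(text)))
```

On CPython the two values inside the cached range (-5 and 256) should report True and the two just outside it (-6 and 257) False; other interpreters are free to behave differently.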

Let us verify this another way, using sys.getrefcount():

>>>import sys
>>>a=10
>>>print sys.getrefcount(a)
>>>a=257
>>>print sys.getrefcount(a)

The output is:

15
2

The result is clear: the value 10 is cached by the system and is referenced in many other places, while 257 has a count of only 2 (why it is 2 rather than 1 was explained above).

So the next question: what about other types of data? Let us look at a string:

>>> a="downey"
>>> b="downey"
>>> print "%d.%d" %(id(a),id(b))

The results are:

39422528.39422528

So short strings are cached as well.
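As a hedged sketch of how string sharing can be made explicit, the following uses sys.intern(), which is the Python 3 spelling (on Python 2.7 the same function is the built-in intern()). Which strings CPython shares automatically is an implementation detail, so the example only relies on what interning guarantees.

```python
import sys

a = "downey"
b = "".join(["dow", "ney"])   # same text, but assembled at run time

print(a == b)                  # the contents are equal

# Interning maps equal strings to a single shared object
a_shared = sys.intern(a)
b_shared = sys.intern(b)
print(a_shared is b_shared)
```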

Then the list:

>>> a=[1,2,3]
>>> b=[1,2,3]
>>> print "%d.%d" %(id(a),id(b))
39704576.39745176

Lists have no caching mechanism. From this you can see that Python's caching does not apply to every type.

Variable caching conclusion

The experiments and investigation above show that:

  • A Python variable is actually a reference to an object in heap memory and can be understood as a label attached to that object. When one variable is assigned to another (e.g., a = b), both variables refer to the same object.
  • Python caches the integers from -5 to 256 (256 included) and short strings, to save the overhead of allocating and destroying them over and over.

Having read this far, thoughtful readers may want to ask: does caching these integers and short strings really improve performance noticeably? Can Python code really contain that many integer variables?

The answer is: an integer variable corresponds to an integer object in memory, but an integer object is not referenced only by integer variables. The elements of a container can also be references to integer objects.

If you still have doubts, take a look at the following example:

>>> import sys
>>> a=1
>>> sys.getrefcount(a)
128
>>> b=[1,2,3]
>>> sys.getrefcount(a)
129

From the printed results: the integer variable a = 1 means that a points to the object 1 and counts as one reference to it. b[0] is also initialized to 1, so b[0] is likewise a reference to the object 1. All containers work this way, and with that, readers should now have a clear picture.
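The same effect can be sketched with an ordinary object. The names marker and box are hypothetical, and a plain object() stands in for the integer 1 because the counts reported for cached integers differ between interpreter versions.

```python
import sys

marker = object()                       # stand-in for any object
before = sys.getrefcount(marker)

box = [marker, marker, marker]          # three list slots, three references
added = sys.getrefcount(marker) - before

print(added)                            # one extra reference per slot
print(box[0] is marker)                 # each slot refers to the same object
```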

For more on how this variable mechanism behaves in function calls, see this blog post: Parameter passing in Python function calls

Garbage collection

Since we are talking about the memory mechanism, we must also cover how memory is allocated and reclaimed. Allocation is simple: memory is allocated when an object is defined and used. Reclamation is not so simple, because Python cannot perform other tasks while it is reclaiming memory. Frequent garbage collection therefore causes serious efficiency problems, while collecting too rarely wastes a lot of memory, so garbage collection is generally started only at specific moments.

While running, Python records the number of object allocations and the number of deallocations. Only when the difference between the two is greater than a certain value, i.e. when

number of allocations - number of deallocations > collection threshold

does Python run garbage collection. We can use the get_threshold() method to get the threshold:

>>>import gc
>>>print gc.get_threshold()

Output:

(700, 10, 10)

The 700 is the threshold that triggers memory reclamation. But what do the two 10s after it mean?

They belong to another reclamation mechanism called generational collection. Its basic assumption is that the longer an object has existed, the less likely it is to become garbage, so long-lived objects are given more trust.

Python divides all objects into three generations: 0, 1 and 2. All new objects are generation 0 objects. When an object survives a garbage collection of its generation, it is moved to the next generation. Whenever garbage collection starts, it scans all generation 0 objects. After generation 0 has been collected a certain number of times, a scan and cleanup of generation 0 and generation 1 is started. After generation 1 has been collected a certain number of times, a collection of generations 0, 1 and 2 is started, that is, all objects are scanned.

These two counts are the two 10s in the (700, 10, 10) returned by get_threshold() above. In other words, every 10 generation 0 collections are accompanied by one generation 1 collection, and every 10 generation 1 collections are accompanied by one generation 2 collection.
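The thresholds and the current counters can be inspected side by side. This is a minimal sketch using gc.get_threshold() and gc.get_count() from the standard gc module; the variable names are illustrative, and the actual numbers depend on the interpreter version and on what the program has done so far.

```python
import gc

# One threshold per generation, as discussed above
threshold0, threshold1, threshold2 = gc.get_threshold()

# Current collection counters for generations 0, 1 and 2
count0, count1, count2 = gc.get_count()

print("thresholds: %d %d %d" % (threshold0, threshold1, threshold2))
print("counts:     %d %d %d" % (count0, count1, count2))
```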

We can also adjust the collection thresholds manually. Clever readers can guess the method: since there is a get, there is bound to be a corresponding set:

import gc
gc.set_threshold(600,8,7)

Besides passively waiting for the system to collect, we can of course trigger memory reclamation manually:

import gc
gc.collect()  
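To see a manual collection actually reclaim something, here is a hedged sketch. The Node class is a hypothetical helper, and the example relies on gc.collect() returning the number of unreachable objects it found; the exact number varies between interpreter versions, so only a lower bound is checked.

```python
import gc


class Node(object):
    """Hypothetical helper: an object that can point at another object."""

    def __init__(self):
        self.other = None


gc.disable()                 # keep automatic collection out of the demo
gc.collect()                 # start from a clean state

a, b = Node(), Node()
a.other, b.other = b, a      # the two objects now reference each other
del a, b                     # no names are left, but each still holds the other

found = gc.collect()         # manual collection reclaims the pair
gc.enable()

print(found >= 2)
```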

In fact, whether it is Java or Python, the memory mechanism of a language fundamentally affects its efficiency, so there are many more intricate details in how memory is handled. Only a general framework is introduced here, and corrections and additions from passing experts are welcome.

Well, that is all for the memory mechanism of Python variables. If you have any questions or find anything wrong in this article, you are welcome to leave a message.

Personal E-mail: [email protected]
This is an original blog post; please credit the source when reposting!

May you soon walk through a thicket of projects without a single bug sticking to you.
(End)

Origin blog.csdn.net/qq_36387683/article/details/104988710