Why can Python swap two variables with the single statement "a,b=b,a"?

Ever since I first encountered Python, I have found its tuple unpacking very interesting: concise and easy to use.

The most obvious example is multiple assignment, where several variables are assigned values in a single statement:

>>> x, y = 1, 2
>>> print(x, y)  # Result: 1 2

In this example, the two numbers on the right-hand side of the assignment operator "=" are first packed into a tuple, (1, 2), which is then unpacked and assigned in turn to the two variables on the left of the "=".

This can be confirmed by writing x = 1, 2 directly and printing x, or by writing an explicit tuple to the right of the "=" sign:

>>> x = 1, 2
>>> print(x)     # Result: (1, 2)
>>> x, y = (1, 2)
>>> print(x, y)  # Result: 1 2

When blogs or official-account articles introduce this feature, they usually follow it with another example: swapping the values of two variables directly:

>>> x, y = 1, 2
>>> x, y = y, x
>>> print(x, y)  # Result: 2 1

In most languages, swapping two variables requires a third, temporary variable. The reason is simple: if you want to exchange the water in two cups, you naturally need a third container as a relay.

Python's form, however, needs no intermediate variable, and it looks exactly like the unpacking assignment shown earlier. Because of this similarity, many people mistakenly believe that Python's variable swap is also based on unpacking.
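For comparison, here is the classic three-step swap written out in Python. It is only a sketch of the "third cup" idea; the function name swap_with_temp is made up for this illustration.

```python
def swap_with_temp(x, y):
    """Swap two values the traditional way, using a temporary variable."""
    temp = x   # pour the first cup into the spare container
    x = y      # pour the second cup into the first
    y = temp   # pour the spare container into the second cup
    return x, y


print(swap_with_temp(1, 2))  # (2, 1)
```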

But is this the case?

I googled the question and found that some people have tried to answer it, but their answers are generally not comprehensive. (There are also quite a few wrong answers, and many more people who know that it works but never think to ask why.)

Let me give this article's answer up front: Python's variable swap is not always based on unpacking. Sometimes it is, and sometimes it is not!

Does this answer surprise you? Have you never heard it before?

What is actually happening? Let's start with the simplest case, the two variables in the title, and bring out the big gun, the dis module, to look at the compiled bytecode:

The original screenshot shows two windows side by side, which makes it easy to compare "a,b=b,a" with "a,b=1,2":

  • "a,b=b,a": the two LOAD_FAST instructions read the variables' references from the local scope and push them onto the stack. Next comes the key ROT_TWO instruction, which swaps the two references on top of the stack. Finally, the two STORE_FAST instructions write the values on the stack back into the local variables.
  • "a,b=1,2": the first step, LOAD_CONST, pushes the two numbers on the right of the "=" sign onto the stack as a single tuple. The second step, UNPACK_SEQUENCE, unpacks that sequence, and the unpacked values are then written into the local variables.
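The comparison described above can be reproduced with the standard dis module. The sketch below is only an illustration: the names swap_two, assign_constants and opnames are made up here, and the opcodes printed depend on the interpreter version. The ROT_TWO result discussed in this article is what CPython 3.8 and 3.9 produce; from CPython 3.11 on, the ROT_* instructions were replaced by a general SWAP instruction, so newer interpreters print something different.

```python
import dis


def swap_two(a, b):
    a, b = b, a          # the swap from the title
    return a, b


def assign_constants():
    a, b = 1, 2          # plain multiple assignment of constants
    return a, b


def opnames(func):
    """Return the opcode names of func's bytecode, in order."""
    return [ins.opname for ins in dis.get_instructions(func)]


# Print the opcode sequence of each function for a side-by-side comparison.
for func in (swap_two, assign_constants):
    print(func.__name__, opnames(func))
```

Both functions are ordinary functions, so the variables are locals and the loads and stores are the *_FAST variants, as in the article.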

Clearly, two statements that look alike actually perform different operations. In the variable swap there is no packing or unpacking step at all!

The ROT_TWO instruction is a shortcut that the CPython interpreter implements for the top two elements of the stack: it swaps the two object references stored there.

There are two similar instructions, ROT_THREE and ROT_FOUR, which rotate the top three and the top four stack items respectively. Their implementations are in CPython's ceval.c file (the article quotes the 3.9 branch, the latest at the time of writing) and are built on a set of predefined macros for manipulating the top of the stack.

The official documentation explains these instructions as follows; note that ROT_FOUR was added in version 3.8:

> - ROT_TWO
>
>   Swaps the two top-most stack items.
>
> - ROT_THREE
>
>   Lifts second and third stack item one position up, moves top down to position three.
>
> - ROT_FOUR
>
>   Lifts second, third and fourth stack items one position up, moves top down to position four.
>
>   New in version 3.8.
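To make the documentation's wording concrete, here is a toy model of the three instructions that uses a Python list as the stack, with the top of the stack at the end of the list. It is not CPython's implementation (that is written in C in ceval.c); the helper names rotate_top, rot_two, rot_three and rot_four are made up for this sketch. The last part replays the ROT_THREE, ROT_TWO pair that CPython 3.8 and 3.9 emit for a three-variable swap such as a, b, c = c, b, a.

```python
def rotate_top(stack, n):
    """Move the top item down to position n; the n-1 items below it move up."""
    top = stack.pop()
    stack.insert(len(stack) - (n - 1), top)


def rot_two(stack):
    rotate_top(stack, 2)


def rot_three(stack):
    rotate_top(stack, 3)


def rot_four(stack):
    rotate_top(stack, 4)


# a, b, c = c, b, a: the right-hand side is pushed from left to right,
# so the old value of a ends up on top of the stack.
stack = ["C", "B", "A"]   # old values of c, b, a
rot_three(stack)          # stack is now ['A', 'C', 'B']
rot_two(stack)            # stack is now ['A', 'B', 'C']

# The stores then pop from the top, one target at a time.
a = stack.pop()
b = stack.pop()
c = stack.pop()
print(a, b, c)            # C B A
```

Because the stores pop from the top while the loads push from left to right, the values have to be reversed in between, and for three items one ROT_THREE plus one ROT_TWO does exactly that.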

CPython presumably considers swaps of this size very common, so it provides dedicated optimized instructions, much as the small integers in [-5, 256] are created in advance and kept in the small-integer pool.

When more variables are swapped, the unpacking operation described earlier is used after all:

The BUILD_TUPLE instruction in the original screenshot creates a tuple from a given number of items on top of the stack. The tuple is then unpacked by the UNPACK_SEQUENCE instruction, and the results are assigned one by one.

It is worth noting why there is one more step, the build, than in the earlier "a,b=1,2". Each variable has to be pushed onto the stack separately with its own LOAD_FAST, so the values cannot be combined into a single constant tuple and pushed with one LOAD_CONST. In other words, as soon as a variable appears on the right of the "=" sign, the LOAD_CONST of a ready-made tuple described above no longer occurs.
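The difference between the two right-hand sides can be inspected with the dis module. This is a sketch under stated assumptions: the function names swap_five and assign_five_constants are made up, five variables are used only to stay clear of the small cases that get special treatment, and the exact instructions depend on the CPython version. On the 3.8 and 3.9 interpreters this article is based on, the first function should show one LOAD_FAST per variable followed by BUILD_TUPLE and UNPACK_SEQUENCE, and the second a single LOAD_CONST of a tuple followed by UNPACK_SEQUENCE.

```python
import dis


def swap_five(a, b, c, d, e):
    # Variables on the right: each one is loaded separately.
    a, b, c, d, e = e, d, c, b, a
    return a, b, c, d, e


def assign_five_constants():
    # Constants on the right: the tuple can be prepared at compile time.
    a, b, c, d, e = 1, 2, 3, 4, 5
    return a, b, c, d, e


# Print the full disassembly of each function for comparison.
for func in (swap_five, assign_five_constants):
    print(func.__name__)
    dis.dis(func)
```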

One last detail worth mentioning: which instructions are used depends on the number of elements on the stack, not on the number of variables whose values are actually swapped in the assignment statement.

At this point in the analysis, you should understand the conclusion stated earlier in this article, right?

Let's summarize briefly:
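As an illustration of this point, consider a statement in which only two of three variables actually change places. The statement and the function name swap_first_two are made up for this sketch and are not the example from the original article. Three values are still pushed onto the stack, so the bytecode is chosen for three stack items; the exact instructions shown by dis depend on the CPython version.

```python
import dis


def swap_first_two(a, b, c):
    # Only a and b change, but three values are pushed onto the stack.
    a, b, c = b, a, c
    return a, b, c


dis.dis(swap_first_two)
print(swap_first_two(1, 2, 3))  # (2, 1, 3)
```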

  • Python can perform multiple assignment in a single statement, which relies on sequence unpacking.
  • Python can swap variables in a single statement without an intermediate variable. When fewer than 4 variables are involved (fewer than 5 since version 3.8), CPython uses the ROT_* instructions to exchange elements on the stack. When more variables are involved, sequence unpacking is used.
  • Sequence unpacking is a major feature of Python, but as the examples in this article show, the CPython interpreter also provides several optimized instructions for small cases, which most people probably do not expect.

If you found this analysis worthwhile, you may also like these articles:

1. Why does Python use indentation to divide code blocks?

2. Is Python's indentation an anti-human design?

3. Why does Python not use a semicolon as a statement terminator?

4. Why does Python not have a main function? Why do I not recommend writing the main function?

5. Why does Python recommend snake_case naming?

6. Why does Python not support the i++ auto-increment syntax and the ++ operator?

Written at the end: This article belongs to the "Why Python" series (produced by Python Cat), which focuses on topics such as Python's syntax, design, and development, and tries to show the charm of Python by starting from "why" questions. Some topics also have a video version on Bilibili.

The official account [Python Cat] serializes a range of high-quality articles, including the Why Python series, the Cat Philosophy series, the Python advanced series, book recommendations, technical writing, and recommendations and translations of high-quality English articles. You are welcome to follow it.
