Can you spot the bug in this Python code?

[CSDN Editor's Note] Let's find bugs together.

Original link: https://dwrodri.gitlab.io/can-you-spot-the-bug-in-this-python-code/

This article has been authorized by the author and may not be reproduced without permission!

Author | Derek Rodriguez

Translator | Crescent Moon

Editor | Xia Meng

Produced by | CSDN (ID: CSDNnews)

Recently, I ran into a very interesting problem while parsing text. Before we get into it, let me give you a little backstory. My task was to parse some comma-separated data from a text file like this:

img

This text file contains variable-width hexadecimal values, with at least three fields per line. I only care about the first and third fields. As I saw it, the parsing could be divided into three steps:

  1. Read each row of data in a loop;
  2. Use commas to break the data into a list;
  3. Pick the first and third elements and convert them to integers.
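The three steps above can be sketched with a plain loop. This is only an illustration: the function name `parse_lines` and the sample lines are hypothetical, since the original data file is not reproduced here.

```python
def parse_lines(lines):
    """Return (first, third) integer pairs from comma-separated hex fields."""
    pairs = []
    for line in lines:                      # step 1: read each row in a loop
        fields = line.strip().split(",")    # step 2: split the row on commas
        # step 3: pick the first and third fields and parse them as base-16
        pairs.append((int(fields[0], 16), int(fields[2], 16)))
    return pairs


# Hypothetical sample rows with variable-width hex values
sample_lines = ["1a,ff,2b\n", "c0,0,7f,dead\n"]
parsed = parse_lines(sample_lines)
```

A file object opened with `open()` can be passed in place of the list, because iterating over it also yields one line at a time.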

It seemed simple: a few lines of code using a pandas DataFrame would be enough.

Below is the code I wrote:

img

Did you spot the bug? I certainly didn't. Now, let me explain this code in detail and dig into where I went wrong.

img

Detailed code explanation

CSV file is a list of lists

I naively thought of CSV data as a list of lists, so I could treat each row as a nested list. I found code for flattening a nested list in a post online and copied it:

nested_lists = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
flattened_list = [element for sublist in nested_lists for element in sublist]
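To make the iteration order explicit, the flattening comprehension can be compared with two ordinary nested loops. The name `flattened_loop` is introduced here only for illustration.

```python
nested_lists = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

# Comprehension form: the for clauses read left to right, outermost first
flattened_list = [element for sublist in nested_lists for element in sublist]

# Equivalent explicit loops, nested in the same left-to-right order
flattened_loop = []
for sublist in nested_lists:
    for element in sublist:
        flattened_loop.append(element)
```

Each `for` clause iterates over whatever follows its `in`, which is the detail that matters for the bug discussed later.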

I had worked with C and C++ before learning Python, so when I learned nested comprehensions, Python felt like pseudocode that the machine could understand. This nested comprehension compiles to the following bytecode:

img
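For readers who want to inspect the bytecode themselves, the standard `dis` module can disassemble the comprehension. The exact listing differs between CPython versions, so the sketch below only counts loop instructions instead of quoting output; the helper `collect_opnames` is a hypothetical name.

```python
import dis
import types


def collect_opnames(code):
    """Collect opcode names from a code object and any code objects nested in it."""
    names = [instruction.opname for instruction in dis.get_instructions(code)]
    for const in code.co_consts:
        # Older CPython versions compile a comprehension into its own code object
        if isinstance(const, types.CodeType):
            names.extend(collect_opnames(const))
    return names


source = "[element for sublist in nested_lists for element in sublist]"
code = compile(source, "<flatten>", "eval")
opnames = collect_opnames(code)

# Each for clause in the comprehension becomes one FOR_ITER loop
for_iter_count = opnames.count("FOR_ITER")
```

Calling `dis.dis(source)` prints the full human-readable listing for whichever interpreter is running.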

Then I extended it with some code of my own and ended up with the following:

img

img

The mistake

It turns out that Python can't combine iterable unpacking with comprehensions the way I imagined. You have to wrap the .split(",") call in another list:

img
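Since the original listings were images, here is a minimal sketch of the two forms, assuming hypothetical sample lines and variable names. It is not the author's exact code, only the same pattern.

```python
# Hypothetical sample rows; the first field of the first row has two characters
lines = ["1a,ff,2b,0\n", "c0,0,7f\n"]

# Working form: the one-element list makes the for clause yield the whole
# list of fields exactly once, so the targets unpack the fields themselves.
pairs = [
    (int(first, 16), int(third, 16))
    for line in lines
    for first, _, third, *_ in [line.strip().split(",")]
]

# Buggy form: the for clause iterates over the fields one by one, so the
# targets try to unpack the characters of each individual string.
error = None
try:
    bad_pairs = [
        (int(first, 16), int(third, 16))
        for line in lines
        for first, _, third, *_ in line.strip().split(",")
    ]
except ValueError as exc:
    # "1a" has only two characters, but the targets need at least three items
    error = exc
```

With fields of three or more characters the buggy form does not even raise; it quietly parses single characters instead of whole fields.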

This confused me, because .split(",") already returns a list. If I wrap it in another list, doesn't it become a doubly nested list? I did not quite understand, so I looked for the answer with Compiler Explorer. The image below shows the difference between the correct generator expression and the code I wrote:

img

Do you see the problem? The problem in my code is that the return value of .split() is iterated over before it is unpacked, so the loop targets receive each field one at a time instead of the whole list of fields. I'm not sure, but I believe it has to do with implementation details established when list comprehensions were first proposed.

Finally, I solved this problem with the help of CPython contributor Crowthebird, who demonstrated the problem by rewriting the code without comprehensions.

The wrong way:

img

The correct way:

img
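The original loop listings were images and are not reproduced here. As a rough sketch of what such a comprehension-free rewrite might look like, assuming hypothetical function names and sample data:

```python
def parse_wrong(lines):
    """Mirror the buggy comprehension: iterate over the fields, unpack each string."""
    result = []
    for line in lines:
        for first, _, third, *_ in line.strip().split(","):
            # first and third are single characters of one field here
            result.append((int(first, 16), int(third, 16)))
    return result


def parse_right(lines):
    """Unpack the list of fields itself, once per line."""
    result = []
    for line in lines:
        first, _, third, *_ = line.strip().split(",")
        result.append((int(first, 16), int(third, 16)))
    return result


# Hypothetical row whose fields all have three characters, so nothing raises
rows = ["abc,def,123\n"]
wrong = parse_wrong(rows)   # one pair per field, built from single characters
right = parse_right(rows)   # one pair per line, built from whole fields
```

Written as loops, the extra level of iteration in the wrong version is much easier to see than in the comprehension.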

img

Can this problem be solved?

The problem here was my own misunderstanding of the Python interpreter; there is nothing wrong with the interpreter itself. I don't think the language would be better if it were changed to work the way I expected, since in nested cases it is hard to tell when a container should be destructured and when it should be iterated over. On top of that, PEP 202 does not allow a list comprehension to produce unparenthesized tuples: you must write [(x, y) for ...], not [x, y for ...].
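PEP 202 requires a tuple in a comprehension's output expression to be parenthesized. A quick way to check this is to ask the compiler directly; the helper name `compiles` is hypothetical.

```python
def compiles(source):
    """Return True if the source parses as a Python expression."""
    try:
        compile(source, "<example>", "eval")
    except SyntaxError:
        return False
    return True


# Parenthesized tuple in the output expression: accepted
parenthesized_ok = compiles("[(x, x * 2) for x in range(3)]")

# Bare tuple in the output expression: rejected as a syntax error
bare_ok = compiles("[x, x * 2 for x in range(3)]")
```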


Origin blog.csdn.net/CODING_devops/article/details/132344502