Python Advanced Guide (Easy Advanced Programming): 8. Common Python pitfalls

Original: http://inventwithpython.com/beyond/chapter8.html

While Python is my favorite programming language, it's not without its flaws. Every language has shortcomings (some more than others), and Python is no exception. New Python programmers must learn to avoid some common "gotchas." Programmers usually pick up this kind of knowledge at random, from experience, but this chapter collects it in one place. Knowing the programming lore behind these pitfalls can help you understand why Python sometimes behaves strangely.

This chapter explains how mutable objects such as lists and dictionaries can behave unexpectedly when you modify their contents. You'll learn how the sort() method doesn't sort items in exact alphabetical order and how floating-point numbers can have rounding errors. The inequality operator != behaves in an unusual way when you chain several of them together. And you must use a trailing comma when you write a tuple that contains a single item. This chapter tells you how to avoid these common pitfalls.

Don't add or remove items while iterating over the list

Adding or removing items from a list while looping (that is, iterating) over it with a for or while loop is likely to cause bugs. Consider a scenario where you want to iterate over a list of strings describing items of clothing and ensure there is an even number of socks by inserting a matching sock each time you find one in the list. The task seems simple: iterate over the strings in the list, and when you find 'sock' in one of them, such as 'red sock', append another 'red sock' string to the list.

But this code doesn't work. It gets stuck in an infinite loop, and you have to press Ctrl+C to interrupt it:

>>> clothes = ['skirt', 'red sock']
>>> for clothing in clothes:  # Iterate over the list.
...    if 'sock' in clothing:  # Find strings with 'sock'.
...        clothes.append(clothing)  # Add the sock's pair.
...        print('Added a sock:', clothing)  # Inform the user.
...
Added a sock: red sock
Added a sock: red sock
Added a sock: red sock
--snip--
Added a sock: red sock
Traceback (most recent call last):
  File "<stdin>", line 3, in <module>
KeyboardInterrupt

You can see the visual execution of this code at autbor.com/addingloop.

The problem is that when you append 'red sock' to the clothes list, the list now has a new, third item that the loop must iterate over: ['skirt', 'red sock', 'red sock']. The for loop reaches the second 'red sock' on the next iteration, so it appends another 'red sock' string. This makes the list ['skirt', 'red sock', 'red sock', 'red sock'], giving Python yet another string to iterate over. This continues to happen, as shown in Figure 8-1, which is why we see the never-ending stream of 'Added a sock:' messages. The loop stops only when the computer runs out of memory and crashes the Python program or when you interrupt it by pressing Ctrl+C.

Figure 8-1: On each iteration of the for loop, a new 'red sock' is appended to the list, which clothing refers to on the next iteration. This cycle repeats forever.

The takeaway is: don't add items to a list while you're iterating over it. Instead, use a separate list for the contents of the new, modified list, such as newClothes in this example:

>>> clothes = ['skirt', 'red sock', 'blue sock']
>>> newClothes = []
>>> for clothing in clothes:
...    if 'sock' in clothing:
...        print('Appending:', clothing)
...        newClothes.append(clothing) # We change the newClothes list, not clothes.
...
Appending: red sock
Appending: blue sock
>>> print(newClothes)
['red sock', 'blue sock']
>>> clothes.extend(newClothes)  # Appends the items in newClothes to clothes.
>>> print(clothes)
['skirt', 'red sock', 'blue sock', 'red sock', 'blue sock']

The visual execution of this code is at autbor.com/addingloopfixed.

Our for loop iterated over the items in the clothes list but didn't modify clothes inside the loop. Instead, it changed a separate list, newClothes. Then, after the loop, we modify clothes by extending it with the contents of newClothes. You now have a clothes list with matching socks.
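The same fix can be wrapped in a small function. The following is only a sketch: the name addMatchingSocks is made up for this illustration and isn't part of the original example. It collects the new items in a separate list and leaves the list passed to it untouched.

```python
def addMatchingSocks(clothes):
    """Return a new list in which every sock has a matching pair.

    addMatchingSocks is a hypothetical helper name used only for this sketch.
    """
    newClothes = []
    for clothing in clothes:  # clothes is never modified inside the loop.
        if 'sock' in clothing:
            newClothes.append(clothing)
    return clothes + newClothes  # Concatenation builds a brand-new list.


clothes = ['skirt', 'red sock', 'blue sock']
pairedClothes = addMatchingSocks(clothes)
print(pairedClothes)
print(clothes)  # The original list is unchanged.
```

Returning a new list instead of extending the argument is a design choice made here for clarity; the caller decides whether to replace the original.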

Likewise, you shouldn't delete items from a list while iterating over it. Consider code in which we want to remove any string that isn't 'hello' from a list. The naive approach is to iterate over the list, deleting the items that don't match 'hello':

>>> greetings = ['hello', 'hello', 'mello', 'yello', 'hello']
>>> for i, word in enumerate(greetings):
...    if word != 'hello':  # Remove everything that isn't 'hello'.
...        del greetings[i]
...
>>> print(greetings)
['hello', 'hello', 'yello', 'hello']

The visual execution of this code is at autbor.com/deletingloop.

But it seems that 'yello' is left in the list. The reason is that when the for loop was examining index 2, it deleted 'mello' from the list. But this shifted all the remaining items in the list down one index, moving 'yello' from index 3 to index 2. The next iteration of the loop examines index 3, which is now the last 'hello', as shown in Figure 8-2. The 'yello' string slipped by unexamined! Don't remove items from a list while iterating over that list.

Figure 8-2: When the loop deletes 'mello', the items in the list shift down one index, causing i to skip over 'yello'.
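To make the skipped item visible, the buggy loop can be instrumented to record what it examines. This is a sketch for illustration only; the visited list is an addition that doesn't appear in the original example.

```python
greetings = ['hello', 'hello', 'mello', 'yello', 'hello']
visited = []  # Records every (index, word) pair the loop actually examines.

for i, word in enumerate(greetings):
    visited.append((i, word))
    if word != 'hello':
        del greetings[i]  # Shifts every later item down by one index.

print(visited)    # [(0, 'hello'), (1, 'hello'), (2, 'mello'), (3, 'hello')]
print(greetings)  # ['hello', 'hello', 'yello', 'hello']
```

The string 'yello' never shows up in visited, which confirms that the loop didn't examine it at all.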

Instead, create a new list, copy all items except the one you want to remove, and replace the original list. For the error-free equivalent of the previous example, enter the following code in an interactive shell.

>>> greetings = ['hello', 'hello', 'mello', 'yello', 'hello']
>>> newGreetings = []
>>> for word in greetings:
...    if word == 'hello':  # Copy everything that is 'hello'.
...        newGreetings.append(word)
...
>>> greetings = newGreetings  # Replace the original list.
>>> print(greetings)
['hello', 'hello', 'hello']

The visual execution of this code is at autbor.com/deletingloopfixed.

Remember, because this code is just a simple loop that creates a list, you can replace it with a list comprehension. A list comprehension doesn't run faster or use less memory, but it's shorter without losing much readability. Enter the following into the interactive shell, which is equivalent to the code in the previous example:

>>> greetings = ['hello', 'hello', 'mello', 'yello', 'hello']
>>> greetings = [word for word in greetings if word == 'hello']
>>> print(greetings)
['hello', 'hello', 'hello']

Not only is the list comprehension more succinct, it also avoids the gotchas that occur when changing a list while iterating over it.


References, memory usage, and sys.getsizeof()

It might seem like creating a new list instead of modifying the original one wastes memory. But remember that, just as variables technically contain references to values rather than the actual values, lists also contain references to values. The newGreetings.append(word) line shown earlier doesn't copy the string in the word variable, just the reference to the string, which is much smaller.

You can see this by using the sys.getsizeof() function, which returns the number of bytes that the object passed to it takes up in memory. In this interactive shell example, we can see that the short string 'cat' takes up 52 bytes, while a longer string takes up 85 bytes:

>>> import sys
>>> sys.getsizeof('cat')
52
>>> sys.getsizeof('a much longer string than just "cat"')
85

(In the version of Python I'm using, the overhead of the string object takes 49 bytes, and each actual character in the string takes 1 byte.) But a list containing any of these strings takes 72 bytes, no matter how long the string is:

>>> sys.getsizeof(['cat'])
72
>>> sys.getsizeof(['a much longer string than just "cat"'])
72

The reason is that, technically, a list doesn't contain the strings but rather just references to the strings, and a reference is the same size no matter how large the referred data is. Code like newGreetings.append(word) doesn't copy the string in word, just a reference to the string. If you want to find out how much memory an object and all the objects it refers to take up, Python core developer Raymond Hettinger has written a function for this, which you can access at code.activestate.com/recipes/577504-compute-memory-footprint-of-an-object-and-its-cont.

So you shouldn't feel like it's a waste of memory to create a new list instead of modifying the original while iterating. Even if your list-modifying code appears to work, it can be the source of subtle bugs that take a long time to find and fix. Wasting a programmer's time is far more expensive than wasting a computer's memory.
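The point about references can be checked directly with the is operator, which tests whether two expressions refer to the same object. This is a minimal sketch; the exact byte counts that sys.getsizeof() reports vary between Python versions, so it compares sizes rather than printing them.

```python
import sys

greetings = ['hello', 'hello', 'mello', 'yello', 'hello']
newGreetings = []
for word in greetings:
    if word == 'hello':
        newGreetings.append(word)  # Copies a reference, not the string.

# The new list refers to the very same string object as the original list.
sameObject = newGreetings[0] is greetings[0]
print(sameObject)  # True

# A one-item list is the same size no matter how long its one string is.
shortListSize = sys.getsizeof(['cat'])
longListSize = sys.getsizeof(['a much longer string than just "cat"'])
print(shortListSize == longListSize)  # True
```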


Although you shouldn't add or remove items from a list (or any iterable object) while iterating over it, it's fine to modify the list's contents. For example, say we have a list of numbers as strings: ['1', '2', '3', '4', '5']. We can convert this list of strings into a list of integers [1, 2, 3, 4, 5] while iterating over the list:

>>> numbers = ['1', '2', '3', '4', '5'] 
>>> for i, number in enumerate(numbers):
...    numbers[i] = int(number)
...
>>> numbers 
[1, 2, 3, 4, 5]

The visual execution of this code is at autbor.com/covertstringnumbers. Modifying the items in the list is fine; it's changing the number of items in the list that is prone to bugs.

Another possible way to add or delete items in a list safely is to iterate backward from the end of the list to the beginning. This way, you can delete items from the list as you iterate over it, or add items to it as long as you add them to the end of the list. For example, enter the following code, which deletes even integers from the someInts list. The first attempt iterates forward and fails; the second iterates backward and works.

>>> someInts = [1, 7, 4, 5]
>>> for i in range(len(someInts)):
...    if someInts[i] % 2 == 0:
...        del someInts[i]
...
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
IndexError: list index out of range
>>> someInts = [1, 7, 4, 5]
>>> for i in range(len(someInts) - 1, -1, -1):
...    if someInts[i] % 2 == 0:
...        del someInts[i]
...
>>> someInts
[1, 7, 5]

This code works because none of the items that the loop will iterate over in the future ever have their index changed. But the repeated shifting up of the values after the deleted value makes this technique inefficient for long lists. The visual execution of this code is at autbor.com/iteratebackwards1. You can see the difference between iterating forward and backward in Figure 8-3.

Figure 8-3: Removing even numbers from a list when iterating forward (left) and backward (right)

Similarly, you can add items to the end of the list as you iterate backward over it. Enter the following into the interactive shell, which appends a copy of any even integers in the someInts list to the end of the list:

>>> someInts = [1, 7, 4, 5]
>>> for i in range(len(someInts) - 1, -1, -1):
...    if someInts[i] % 2 == 0:
...        someInts.append(someInts[i])
...
>>> someInts
[1, 7, 4, 5, 4]

The visual execution of this code is at autbor.com/iteratebackwards2. By iterating backward, we can append items to or delete items from the list. But this can be tricky to do correctly, because slight changes to this basic technique could end up introducing bugs. It's much simpler to create a new list than to modify the original one. As Python core developer Raymond Hettinger put it:

  Q: What are the best practices for modifying a list while looping over it?
  A: Don't.

Don't copy mutable values without using copy.copy() and copy.deepcopy()

Variables are best thought of as labels or name tags that refer to objects, rather than as boxes that contain objects. This mental model is especially useful when it comes to modifying mutable objects: objects such as lists, dictionaries, and sets whose value can mutate (that is, change). A common gotcha occurs when you copy one variable that refers to a mutable object to another variable and think that the actual object is being copied. In Python, assignment statements never copy objects; they only copy a reference to an object. (Python developer Ned Batchelder gave a great talk on this idea at PyCon 2015, titled "Facts and Myths about Python Names and Values." Watch it at youtu.be/_AEJHKGk9ns.)

For example, enter the following code into the interactive shell, and notice that even though we change only the spam variable, the cheese variable changes as well:

>>> spam = ['cat', 'dog', 'eel']
>>> cheese = spam
>>> spam 
['cat', 'dog', 'eel']
>>> cheese 
['cat', 'dog', 'eel']
>>> spam[2] = 'MOOSE'
>>> spam 
['cat', 'dog', 'MOOSE']
>>> cheese
['cat', 'dog', 'MOOSE']
>>> id(cheese), id(spam)
(2356896337288, 2356896337288)

The visual execution of this code is at autbor.com/listcopygotcha1. If you think that cheese = spam copied the list object, you might be surprised that cheese seems to have changed even though we modified only spam. But assignment statements never copy objects, only references to objects. The assignment statement cheese = spam makes cheese refer to the same list object in the computer's memory as spam. It doesn't duplicate the list object. This is why changing spam also changes cheese: both variables refer to the same list object.

The same principle applies to mutable objects passed to a function call. Enter the following into the interactive shell, and note that the global variable eggs and the local parameter theList (remember, parameters are variables defined in the function's def statement) both refer to the same object:

>>> def printIdOfParam(theList):
...    print(id(theList))
...
>>> eggs = ['cat', 'dog', 'eel']
>>> print(id(eggs))
2356893256136
>>> printIdOfParam(eggs)
2356893256136

The visual execution of this code is at autbor.com/listcopygotcha2. Notice that the identities returned by id() for eggs and theList are the same, meaning these variables refer to the same list object. The eggs variable's list object wasn't copied to theList; rather, the reference was copied, which is why both variables refer to the same list. A reference is only a few bytes in size, but imagine if Python copied the entire list instead of just the reference. If eggs contained a billion items instead of just three, passing it to the printIdOfParam() function would require copying this giant list. A simple function call would eat up gigabytes of memory! That's why Python assignment copies only references and never copies objects.
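A practical consequence is that a function can change a list that belongs to its caller. The sketch below uses a made-up function, addMoose, purely to illustrate this; it isn't part of the original example.

```python
def addMoose(theList):
    """Append 'MOOSE' to the list the caller passed in.

    addMoose is a hypothetical function used only for this sketch.
    """
    theList.append('MOOSE')  # Mutates the caller's list; nothing is copied.


eggs = ['cat', 'dog', 'eel']
addMoose(eggs)
print(eggs)  # ['cat', 'dog', 'eel', 'MOOSE']
```

The function returns nothing, yet eggs has changed, because theList and eggs refer to the same list object for the duration of the call.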

One way to prevent this gotcha is to make a copy of the list object (not just the reference) with the copy.copy() function. Enter the following into the interactive shell:

>>> import copy
>>> bacon = [2, 4, 8, 16]
>>> ham = copy.copy(bacon)
>>> id(bacon), id(ham)
(2356896337352, 2356896337480)
>>> bacon[0] = 'CHANGED'
>>> bacon
['CHANGED', 4, 8, 16]
>>> ham
[2, 4, 8, 16]
>>> id(bacon), id(ham)
(2356896337352, 2356896337480)

The visual execution of this code is at autbor.com/copycopy1. The ham variable refers to a copied list object rather than the original list object referred to by bacon, so it doesn't suffer from this gotcha.

But just as variables are like labels or name tags rather than boxes that contain objects, lists also contain labels or name tags that refer to objects rather than the actual objects. If your list contains other lists, copy.copy() copies only the references to those inner lists. Enter the following into the interactive shell to see the problem:

>>> import copy
>>> bacon = [[1, 2], [3, 4]]
>>> ham = copy.copy(bacon)
>>> id(bacon), id(ham)
(2356896466248, 2356896375368)
>>> bacon.append('APPENDED')
>>> bacon
[[1, 2], [3, 4], 'APPENDED']
>>> ham
[[1, 2], [3, 4]]
>>> bacon[0][0] = 'CHANGED'
>>> bacon
[['CHANGED', 2], [3, 4], 'APPENDED']
>>> ham
[['CHANGED', 2], [3, 4]]
>>> id(bacon[0]), id(ham[0])
(2356896337480, 2356896337480)

The visual execution of this code is at autbor.com/copycopy2. Although bacon and ham are two different list objects, they refer to the same [1, 2] and [3, 4] inner lists, so changes to those inner lists are reflected in both variables, even though we used copy.copy(). The solution is to use copy.deepcopy(), which makes copies of any list objects inside the list object being copied (and of any list objects inside those list objects, and so on). Enter the following into the interactive shell:

>>> import copy
>>> bacon = [[1, 2], [3, 4]]
>>> ham = copy.deepcopy(bacon)
>>> id(bacon[0]), id(ham[0])
(2356896337352, 2356896466184)
>>> bacon[0][0] = 'CHANGED'
>>> bacon
[['CHANGED', 2], [3, 4]]
>>> ham
[[1, 2], [3, 4]]

The visual execution of this code is at autbor.com/copydeepcopy. Although copy.deepcopy() is slightly slower than copy.copy(), it's safer to use if you don't know whether the list being copied contains other lists (or other mutable objects such as dictionaries or sets). My general advice is to always use copy.deepcopy(): it might prevent subtle bugs, and the slowdown in your code probably won't be noticeable.
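The same distinction applies to other mutable containers. As a small sketch, the following compares copy.copy() and copy.deepcopy() on a dictionary whose values are lists; the pets data is made up for this illustration.

```python
import copy

pets = {'cats': ['Zophie'], 'dogs': ['Rex']}  # Made-up example data.
shallow = copy.copy(pets)
deep = copy.deepcopy(pets)

pets['cats'].append('Pooka')  # Modify an inner list through the original.

print(shallow['cats'])  # ['Zophie', 'Pooka'] (shares the inner list)
print(deep['cats'])     # ['Zophie'] (has its own inner list)
```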

Don't use mutable values as default arguments

Python lets you set default arguments for the parameters of the functions you define. If the caller doesn't explicitly pass an argument for a parameter, the function runs with the default argument. This is useful when most calls to the function use the same argument, because default arguments make the parameter optional. For example, passing None to the split() method makes it split on whitespace, but None is also the default argument: calling 'cat dog'.split() does the same thing as calling 'cat dog'.split(None). The function uses the default argument for a parameter unless the caller passes one in.

But you should never set a mutable object, such as a list or dictionary, as a default argument. To see how this causes bugs, look at the following example, which defines an addIngredient() function that adds an ingredient string to a list representing a sandwich. Because the first and last items of this list are usually 'bread', the mutable list ['bread', 'bread'] is used as the default argument:

>>> def addIngredient(ingredient, sandwich=['bread', 'bread']):
...    sandwich.insert(1, ingredient)
...    return sandwich
...
>>> mySandwich = addIngredient('avocado')
>>> mySandwich
['bread', 'avocado', 'bread']

But using a mutable object, such as a list like ['bread', 'bread'], as the default argument has a subtle problem: the list is created when the function's def statement executes, not each time the function is called. This means that only one ['bread', 'bread'] list object gets created, because we define the addIngredient() function only once. But each call to addIngredient() reuses this list. This leads to unexpected behavior, like the following:

>>> mySandwich = addIngredient('avocado')
>>> mySandwich
['bread', 'avocado', 'bread']
>>> anotherSandwich = addIngredient('lettuce')
>>> anotherSandwich
['bread', 'lettuce', 'avocado', 'bread']

The function returns ['bread', 'lettuce', 'avocado', 'bread'] instead of ['bread', 'lettuce', 'bread'] because addIngredient('lettuce') ends up using the same default argument list as the previous call, which already had 'avocado' added to it. The 'avocado' string appears again because the list for the sandwich parameter is the same one as in the last function call. Only one ['bread', 'bread'] list was created, because the function's def statement executes only once, not each time the function is called. The visual execution of this code is at autbor.com/sandwich.
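One way to observe the shared list directly is through the function object's __defaults__ attribute, a tuple in which Python stores the default values of a function's parameters. The following sketch snapshots that stored list before and after a call; the defaultsBefore and defaultsAfter names are introduced only for this illustration.

```python
def addIngredient(ingredient, sandwich=['bread', 'bread']):
    sandwich.insert(1, ingredient)
    return sandwich


defaultsBefore = list(addIngredient.__defaults__[0])  # Snapshot a copy.
mySandwich = addIngredient('avocado')
defaultsAfter = list(addIngredient.__defaults__[0])

print(defaultsBefore)  # ['bread', 'bread']
print(defaultsAfter)   # ['bread', 'avocado', 'bread']
# The returned list is the default list object itself, not a copy of it.
print(mySandwich is addIngredient.__defaults__[0])  # True
```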

If you need to use a list or dictionary as a default argument, the Pythonic solution is to set the default argument to None. Then add code that checks for None and supplies a new list or dictionary whenever the function is called. This ensures that the function creates a new mutable object each time it is called, rather than only once when the function is defined, as in the following example:

>>> def addIngredient(ingredient, sandwich=None):
...    if sandwich is None:
...        sandwich = ['bread', 'bread']
...    sandwich.insert(1, ingredient)
...    return sandwich
...
>>> firstSandwich = addIngredient('cranberries')
>>> firstSandwich
['bread', 'cranberries', 'bread']
>>> secondSandwich = addIngredient('lettuce')
>>> secondSandwich
['bread', 'lettuce', 'bread']
>>> id(firstSandwich) == id(secondSandwich)  # 1
False

Notice that firstSandwich and secondSandwich don't share the same list reference (# 1), because sandwich = ['bread', 'bread'] creates a new list object each time addIngredient() is called, rather than just once when addIngredient() is defined.

Mutable data types include lists, dictionaries, sets, and objects made from class statements. Don't use objects of these types as default arguments in a def statement.

Don't use string concatenation to build strings

In Python, strings are immutable objects. This means that string values can't change, and any code that seems to modify a string is actually creating a new string object. For example, each of the following operations changes the content of the spam variable, not by changing the string value, but by replacing it with a new string value that has a new identity:

>>> spam = 'Hello'
>>> id(spam), spam
(38330864, 'Hello')
>>> spam = spam + ' world!'
>>> id(spam), spam
(38329712, 'Hello world!')
>>> spam = spam.upper()
>>> id(spam), spam
(38329648, 'HELLO WORLD!')
>>> spam = 'Hi'
>>> id(spam), spam
(38395568, 'Hi')
>>> spam = f'{spam} world!'
>>> id(spam), spam
(38330864, 'Hi world!')

Notice that each call to id(spam) returns a different identity from the call before it, because the string object in spam isn't being changed: it's being replaced by a whole new string object with a different identity. (An identity can be reused once its object has been discarded, which is why the last identity happens to match the first.) Creating new strings by using f-strings, the format() string method, or %s format specifiers also creates new string objects, just like string concatenation. Normally, this technical detail doesn't matter. Python is a high-level language that handles many of these details for you so you can focus on creating your program.

But building a string through a large number of string concatenations can slow down your programs. Each iteration of the loop creates a new string object and discards the old one. In code, this looks like a concatenation inside a for or while loop, as in the following:

>>> finalString = ''
>>> for i in range(100000):
...    finalString += 'spam '
...
>>> finalString
spam spam spam spam spam spam spam spam spam spam spam spam --snip--

Because finalString += 'spam ' happens 100,000 times inside the loop, Python performs 100,000 string concatenations. The CPU has to create these intermediate string values by concatenating the current finalString with 'spam ', put them into memory, and then almost immediately discard them on the next iteration. This is a lot of wasted effort, because we care only about the final string.

The Pythonic way to build strings is to append the smaller strings to a list and then join the list together into one string. This method still creates 100,000 string objects, but it performs only one string concatenation, when it calls join(). For example, the following code produces the equivalent finalString but without the intermediate string concatenations:

>>> finalString = []
>>> for i in range(100000):
...    finalString.append('spam ')
...
>>> finalString = ''.join(finalString)
>>> finalString
spam spam spam spam spam spam spam spam spam spam spam spam --snip--

When I measured the runtime of these two pieces of code on my machine, the list-appending approach was 10 times faster than the string concatenation approach. (Chapter 13 describes how to measure how fast your programs run.) The more iterations the loop makes, the bigger the difference. But when you change range(100000) to range(100), the speed difference is negligible, although concatenation is still slower than list appending. You don't need to obsessively avoid string concatenation, f-strings, the format() string method, or %s format specifiers in every case. The speed improves significantly only when you're doing a large number of string concatenations.
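If you'd like to try this measurement yourself, the standard library's timeit module is one option. The sketch below is only illustrative: the function names buildByConcatenation and buildByJoin are made up here, and the timings depend heavily on your machine and Python version, so the ratio you see may differ from the tenfold figure above.

```python
import timeit


def buildByConcatenation(count):
    finalString = ''
    for i in range(count):
        finalString += 'spam '  # Repeated concatenation inside the loop.
    return finalString


def buildByJoin(count):
    pieces = []
    for i in range(count):
        pieces.append('spam ')
    return ''.join(pieces)  # One join at the end.


count = 100000
sameResult = buildByConcatenation(count) == buildByJoin(count)
print(sameResult)  # True: both approaches build the same string.

# Time 10 runs of each approach; the numbers vary by machine and version.
concatSeconds = timeit.timeit(lambda: buildByConcatenation(count), number=10)
joinSeconds = timeit.timeit(lambda: buildByJoin(count), number=10)
print('concatenation:', concatSeconds)
print('join:', joinSeconds)
```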

Python frees you from having to think about many low-level details. This allows programmers to write software quickly, and as mentioned earlier, programmer time is more valuable than CPU time. But there are cases where it pays to understand the details, like the difference between immutable strings and mutable lists, so you don't get bogged down in things like building strings by concatenation.

Don't expect sort() to sort alphabetically

Understanding sorting algorithms, which systematically arrange values in some established order, is an important foundation of a computer science education. But this isn't a computer science book; we don't need to know these algorithms, because we can just call Python's sort() method. However, you'll notice that sort() has some odd sorting behavior in which it puts an uppercase Z before a lowercase a:

>>> letters = ['z', 'A', 'a', 'Z']
>>> letters.sort()
>>> letters
['A', 'Z', 'a', 'z']

The American Standard Code for Information Interchange (ASCII, pronounced "ask-ee") is a mapping between numeric codes (called code points or ordinals) and text characters. The sort() method uses ASCII-betical sorting (a general term meaning sorted by ordinal number) rather than alphabetical sorting. In the ASCII system, A is represented by code point 65, B by 66, and so on, up to Z by 90. The lowercase a is represented by code point 97, b by 98, and so on, up to z by 122. When sorting by ASCII, uppercase Z (code point 90) comes before lowercase a (code point 97).

Although it was almost universal in Western computing before and throughout the 1990s, ASCII is an American standard only: there's a code point for the dollar sign, $ (code point 36), but there is no code point for the British pound sign, £. ASCII has largely been replaced by Unicode, which contains all of ASCII's code points and more than 100,000 others.

You can get the code point, or ordinal, of a character by passing it to the ord() function. You can do the reverse by passing an ordinal integer to the chr() function, which returns a string of the character. For example, enter the following into the interactive shell:

>>> ord('a')
97
>>> chr(97)
'a'

If you want to sort alphabetically, pass the str.lower method as the key argument. This sorts the list as if the lower() string method had been called on the values:

>>> letters = ['z', 'A', 'a', 'Z']
>>> letters.sort(key=str.lower)
>>> letters
['A', 'a', 'z', 'Z']

Note that the actual strings in the list aren't converted to lowercase; they're only sorted as if they were. Ned Batchelder provides more information about Unicode and code points in his talk "Pragmatic Unicode, or, How Do I Stop the Pain?" at nedbatchelder.com/text/unipain.html.
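The two orderings can be compared side by side. This sketch uses the built-in sorted() function, which returns a new sorted list instead of sorting in place the way the sort() method does, so the original letters list stays available for both comparisons.

```python
letters = ['z', 'A', 'a', 'Z']

codePoints = [ord(letter) for letter in letters]
print(codePoints)  # [122, 65, 97, 90]

# sorted() returns a new list and leaves letters unchanged.
byCodePoint = sorted(letters)
byLowercase = sorted(letters, key=str.lower)
print(byCodePoint)  # ['A', 'Z', 'a', 'z']
print(byLowercase)  # ['A', 'a', 'z', 'Z']
```

Where two strings have the same lowercase form, such as 'A' and 'a', they keep the relative order they had in the original list.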

Incidentally, the sorting algorithm that Python's sort() method uses is Timsort, which was designed by Tim Peters, a Python core developer and the author of "The Zen of Python." It's a hybrid of the merge sort and insertion sort algorithms and is described at en.wikipedia.org/wiki/Timsort.

Don't assume floating-point numbers are perfectly accurate

Computers can store only the digits of the binary number system, which are 1 and 0. To represent the decimal numbers we're familiar with, a computer needs to translate a number such as 3.14 into a series of binary ones and zeros. Computers do this according to the IEEE 754 standard, published by the Institute of Electrical and Electronics Engineers (IEEE, pronounced "eye-triple-ee"). For simplicity, these details are hidden from the programmer, allowing you to type numbers with a decimal point and ignore the decimal-to-binary conversion process:

>>> 0.3
0.3

The IEEE 754 representation of a floating-point number does not always match a decimal number exactly, although the details of the specific case are beyond the scope of this book. A well-known example is 0.1:

>>> 0.1 + 0.1 + 0.1
0.30000000000000004
>>> 0.3 == (0.1 + 0.1 + 0.1)
False

This bizarre, slightly inaccurate sum is the result of rounding errors caused by the way computers represent and process floating-point numbers. This isn't a Python gotcha; the IEEE 754 standard is a hardware standard implemented directly in a CPU's floating-point circuitry. You'll get the same results in C++, JavaScript, and every other language that runs on a CPU that uses IEEE 754 (which is effectively every general-purpose CPU).

The IEEE 754 standard, again for technical reasons beyond the scope of this book, also cannot represent every whole number value greater than 2 ** 53. For example, 2 ** 53 and 2 ** 53 + 1, as float values, both round to 9007199254740992.0:

>>> float(2**53) == float(2**53) + 1
True

As long as you use the floating-point data type, there's no workaround for these rounding errors. But don't worry. Unless you're writing software for a bank, a nuclear reactor, or a bank's nuclear reactor, rounding errors are small enough that they'll likely not be an important issue for your program. Often, you can resolve them by using integers with smaller denominations: for example, 133 cents instead of 1.33 dollars, or 200 milliseconds instead of 0.2 seconds. This way, 10 + 10 + 10 adds up to 30 cents or milliseconds, rather than 0.1 + 0.1 + 0.1 adding up to 0.30000000000000004 dollars or seconds.
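Here is a minimal sketch of the smaller-denomination approach. The variable names and the dollar formatting are illustrative choices, not something prescribed by the text: amounts are kept as integer cents and converted to dollars only for display.

```python
priceInDollars = 0.1 + 0.1 + 0.1   # Floats: subject to rounding error.
priceInCents = 10 + 10 + 10        # Integers: exact.

print(priceInDollars == 0.3)  # False
print(priceInCents == 30)     # True

# Convert to dollars only when displaying the amount.
dollars, cents = divmod(priceInCents, 100)
display = f'${dollars}.{cents:02d}'
print(display)  # $0.30
```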

But if you need exact precision, say for scientific or financial calculations, use Python's built-in decimal module, which is documented at docs.python.org/3/library/decimal.html. Although they're slower, Decimal objects are precise replacements for float values. For example, decimal.Decimal('0.1') creates an object that represents the exact number 0.1 without the imprecision that a 0.1 float value would have.

Passing the float value 0.1 to decimal.Decimal() creates a Decimal object that has the same imprecision as the float value, which is why the resulting Decimal object isn't exactly Decimal('0.1'). Instead, pass a string of the float value to decimal.Decimal(). To illustrate this, enter the following into the interactive shell:

>>> import decimal
>>> d = decimal.Decimal(0.1)
>>> d
Decimal('0.1000000000000000055511151231257827021181583404541015625')
>>> d = decimal.Decimal('0.1')
>>> d
Decimal('0.1')
>>> d + d + d
Decimal('0.3')

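Integers don't have rounding errors, so it's always safe to pass them to decimal.Decimal() or to combine them with Decimal objects in arithmetic. Float values, on the other hand, can't be mixed with Decimal objects. Enter the following into the interactive shell: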

>>> 10 + d
Decimal('10.1')
>>> d * 3
Decimal('0.3')
>>> 1 - d
Decimal('0.9')
>>> d + 0.1
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for +: 'decimal.Decimal' and 'float'

But Decimal objects don't have unlimited precision; they simply have a predictable, well-established level of precision. For example, consider the following operations:

>>> import decimal
>>> d = decimal.Decimal(1) / 3
>>> d
Decimal('0.3333333333333333333333333333')
>>> d * 3
Decimal('0.9999999999999999999999999999')
>>> (d * 3) == 1 # d is not exactly 1/3
False

The expression decimal.Decimal(1) / 3 evaluates to a value that isn't exactly one-third. But by default, it is precise to 28 significant digits. You can find out how many significant digits the decimal module uses by accessing the decimal.getcontext().prec attribute. (Technically, prec is an attribute of the Context object returned by getcontext(), but it's convenient to put it on one line.) You can change this attribute so that Decimal arithmetic performed afterward uses the new level of precision. The following interactive shell example lowers the precision from the original 28 significant digits to 2:

>>> import decimal
>>> decimal.getcontext().prec
28
>>> decimal.getcontext().prec = 2
>>> decimal.Decimal(1) / 3
Decimal('0.33')

The decimal module gives you fine-grained control over how numbers interact with each other. The module is fully documented at https://docs.python.org/3/library/decimal.html.
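Setting decimal.getcontext().prec changes the precision for all later Decimal arithmetic in the current thread. If you want a different precision only for one calculation, the decimal module also provides decimal.localcontext(), a context manager that restores the previous settings when its with block ends. The following is a small sketch of that approach; the variable names are illustrative.

```python
import decimal

with decimal.localcontext() as context:
    context.prec = 2  # Applies only inside this with block.
    lowPrecision = decimal.Decimal(1) / 3

# Outside the block, the surrounding precision is back in effect.
defaultPrecision = decimal.Decimal(1) / 3

print(lowPrecision)      # 0.33
print(defaultPrecision)  # 0.3333333333333333333333333333 at the default 28 digits
```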

Don't chain inequality != operators

Chaining comparison operators, as in 18 < age < 35, and chaining assignment operators, as in six = halfDozen = 6, are handy shortcuts for (18 < age) and (age < 35) and for six = 6; halfDozen = 6, respectively.

But don't chain the != comparison operator. You might think that the following code checks whether all three variables have different values from each other, because the expression evaluates to True:

>>> a = 'cat'
>>> b = 'dog'
>>> c = 'moose'
>>> a != b != c
True

But this chain is actually equivalent to (a != b) and (b != c). This means that a could still be the same as c and the a != b != c expression would still be True:

>>> a = 'cat'
>>> b = 'dog'
>>> c = 'cat'
>>> a != b != c
True

This bug is subtle and the code is misleading, so it's best to avoid chaining != operators altogether.
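If the intent really is to check that all three values differ from one another, the comparisons need to cover every pair. The sketch below shows two possible ways to write that check; the variable names chained, explicit, and usingSet are made up for this illustration.

```python
a = 'cat'
b = 'dog'
c = 'cat'

chained = a != b != c                      # Same as (a != b) and (b != c).
explicit = a != b and b != c and a != c    # Compares every pair.
usingSet = len({a, b, c}) == 3             # Works for hashable values only.

print(chained, explicit, usingSet)  # True False False
```

The set-based version relies on sets discarding duplicates, so it can't be used with unhashable values such as lists.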

Don't forget commas in single-item tuples

When writing tuple values in your code, keep in mind that you still need a trailing comma even if the tuple contains only a single item. While the value (42, ) is a tuple that contains the integer 42, the value (42) is simply the integer 42. The parentheses in (42) are similar to those used in the expression (20 + 1) * 2, which evaluates to the integer value 42. Forgetting the comma can lead to this:

>>> spam = ('cat', 'dog', 'moose')
>>> spam[0]
'cat'
>>> spam = ('cat')
>>> spam[0] # 1
'c'
>>> spam = ('cat', ) # 2
>>> spam[0]
'cat'

Without a comma, ('cat') evaluates to the string value, which is why spam[0] evaluates to the first character of the string, 'c' (# 1). The trailing comma is required for the parentheses to be recognized as a tuple value (# 2). In Python, the commas make a tuple more than the parentheses do.
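A quick way to check which values are tuples is to pass them to type(). This short sketch also shows that a tuple can be written with a comma and no parentheses at all; the variable names are illustrative.

```python
notATuple = (42)
oneItemTuple = (42,)
alsoATuple = 42,      # The parentheses are optional; the comma is not.
emptyTuple = ()       # The one case where parentheses alone make a tuple.

print(type(notATuple))     # <class 'int'>
print(type(oneItemTuple))  # <class 'tuple'>
print(type(alsoATuple))    # <class 'tuple'>
print(type(emptyTuple))    # <class 'tuple'>
```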

Summary

Miscommunication occurs in every language, even in programming languages. Python has several pitfalls for the unwary. Even if they occur rarely, it's good to know about them so you can quickly identify and debug problems they might cause.

Although it's possible to add or remove items from a list while iterating over it, doing so is a potential source of bugs. It's safer to iterate over a copy of the list and then make modifications to the original. When you copy a list (or any other mutable object), remember that an assignment statement copies only the reference to the object, not the actual object. You can use the copy.deepcopy() function to make a copy of an object (and copies of any objects it references).

You shouldn't use mutable objects as default arguments in def statements, because they're created once when the def statement runs rather than each time the function is called. A better idea is to make the default argument None and then add code that checks for None and creates a mutable object when the function is called.

A subtle gotcha is concatenating several smaller strings with the + operator in a loop. For a small number of iterations, this syntax is fine. But behind the scenes, Python is constantly creating and destroying string objects on each iteration. A better approach is to append the smaller strings to a list and then call the join() method to create the final string.

The sort() method sorts by numeric code points, which isn't the same as alphabetical order: an uppercase Z comes before a lowercase a. To fix this, you can call sort(key=str.lower).

Floating-point numbers have slight rounding errors as a side effect of the way they represent numbers. For most programs, this doesn't matter. But if it does matter for your program, you can use Python's decimal module.

Never chain != operators together, because expressions like 'cat' != 'dog' != 'cat' will, confusingly, evaluate to True.

Although this chapter describes the Python gotchas you're most likely to encounter, they don't occur very often in most real-world code. Python does a great job of minimizing the surprises you might find in your programs. In the next chapter, we'll cover some gotchas that are even rarer and downright bizarre. It's almost impossible to run into these Python language oddities unless you go looking for them, but it's fun to explore why they exist.

Origin blog.csdn.net/wizardforcel/article/details/130030457