Python Advanced Guide (Easy Advanced Programming): 5. Finding Code Smells

Original: http://inventwithpython.com/beyond/chapter5.html

The code that causes a program to crash is obviously wrong, but crashes are not the only means of discovering program problems. Other signs may point to more subtle bugs or unreadable code in the program. Just as the smell of gas can indicate a gas leak or the smell of smoke can indicate a fire, code smells are source code patterns that indicate potential bugs. A code smell doesn't necessarily mean that there is a problem, but it does mean that you should pay attention to your program.

This chapter lists several common code smells. Anticipating a bug takes far less time and effort than encountering, understanding, and fixing a bug later. Every programmer has a story of spending hours debugging only to discover that the fix only required changing one line of code. For this reason, even a small potential error should give you pause and remind you to double check and rule out potential problems with your code.

Of course, code smells don't have to be a problem. Ultimately, it's up to you whether to fix or ignore the code smell.

Duplicate code

The most common code smell is duplicate code. Duplicate code is any source code that you could have created by copying and pasting some other code into your program. For example, this short program contains duplicate code. Note that it asks the user how they feel three times:

print('Good morning!')
print('How are you feeling?')
feeling = input()
print('I am happy to hear that you are feeling ' + feeling + '.')
print('Good afternoon!')
print('How are you feeling?')
feeling = input()
print('I am happy to hear that you are feeling ' + feeling + '.')
print('Good evening!')
print('How are you feeling?')
feeling = input()
print('I am happy to hear that you are feeling ' + feeling + '.')

Duplicated code is a problem because it makes changing the code difficult; changes you make to one copy of the duplicated code must apply to every copy in the program. If you forget to make a change somewhere, or if you make different changes on different copies, your program will most likely end up with errors.

The solution to duplicate code is to deduplicate it; that is, to make it appear once in the program by placing the code inside a function or loop. In the example below, I've moved the repeated code into a function and then called that function repeatedly:

def askFeeling():
    print('How are you feeling?')
    feeling = input()
    print('I am happy to hear that you are feeling ' + feeling + '.')

print('Good morning!')
askFeeling()
print('Good afternoon!')
askFeeling()
print('Good evening!')
askFeeling()

In the next example, I moved the repeated code into a loop:

for timeOfDay in ['morning', 'afternoon', 'evening']:
    print('Good ' + timeOfDay + '!')
    print('How are you feeling?')
    feeling = input()
    print('I am happy to hear that you are feeling ' + feeling + '.')

You can also combine the two techniques, using functions and loops:

def askFeeling(timeOfDay):
    print('Good ' + timeOfDay + '!')
    print('How are you feeling?')
    feeling = input()
    print('I am happy to hear that you are feeling ' + feeling + '.')

for timeOfDay in ['morning', 'afternoon', 'evening']:
    askFeeling(timeOfDay)

Note that the code that produces the "Good morning/afternoon/evening!" messages is similar but not identical. In the third improvement to the program, I parameterized the code to deduplicate the identical parts, while the timeOfDay parameter and the timeOfDay loop variable replace the parts that differ. Now that I've deduplicated this code by removing the extra copies, I only need to make any necessary changes in one place.

As with all code smells, avoiding duplicate code is not a hard and fast rule that must always be followed. In general, the longer the repeated code segment, or the more duplicate copies appear in the program, the stronger the case for deduplicating it. I don't mind copying and pasting code once or even twice. However, when there are three or four copies in my program, I usually start to consider deduplicating the code.

Sometimes code isn't worth deduplicating. Compare the first code sample in this section with the most recent one. While the duplicated code is longer, it's simple and straightforward. The deduplicated example does the same thing, but involves a loop, a new timeOfDay loop variable, and a new function with a parameter that is also named timeOfDay.

Duplicate code is a code smell because it makes your code harder to change consistently. If the same code appears several times in your program, the solution is to put it in a function or loop so that it appears only once.

Magic numbers

It's no surprise that programming involves numbers. But some numbers that appear in your source code may confuse other programmers (or confuse you a few weeks after writing them). For example, consider the number 604800 in the following line:

expiration = time.time() + 604800

The time.time() function returns a number representing the current time (a float counting the seconds since the epoch). We can assume that the expiration variable will represent a point in time 604,800 seconds from now. But 604800 is rather mysterious: what is the significance of this expiration date? A comment can help clarify:

expiration = time.time() + 604800  # Expire in one week.

This is a good solution, but an even better solution is to replace these "magic" numbers with constants. Constants are variables whose names are written in capital letters to indicate that their value should not change after the initial assignment. Usually, constants are defined as global variables at the top of the source code file:

# Set up constants for different time amounts:
SECONDS_PER_MINUTE = 60
SECONDS_PER_HOUR   = 60 * SECONDS_PER_MINUTE
SECONDS_PER_DAY    = 24 * SECONDS_PER_HOUR
SECONDS_PER_WEEK   = 7  * SECONDS_PER_DAY

`--snip--`

expiration = time.time() + SECONDS_PER_WEEK  # Expire in one week.

Even if the magic numbers are the same, you should use different constants for the magic numbers for different purposes. For example, there are 52 cards in a deck of cards and there are 52 weeks in a year. But if you have these two quantities in your program, you should do something like this:

NUM_CARDS_IN_DECK = 52
NUM_WEEKS_IN_YEAR = 52

print('This deck contains', NUM_CARDS_IN_DECK, 'cards.')
print('The 2-year contract lasts for', 2 * NUM_WEEKS_IN_YEAR, 'weeks.')

When you run this code, the output will look like this:

This deck contains 52 cards.
The 2-year contract lasts for 104 weeks.

Using separate constants allows you to change them independently in the future. Note that constant variables should not change value while the program is running. But that doesn't mean programmers can never update them in the source code. For example, if a future version of your code includes a wild card in the deck, you can change the cards constant without affecting the weeks one:

NUM_CARDS_IN_DECK = 53
NUM_WEEKS_IN_YEAR = 52

The term magic number also applies to non-numeric values. For example, you can use string values as constants. Consider the following program, which asks the user to enter a direction and displays a warning if the direction is north. A 'nrth' typo prevents the program from displaying the warning:

while True:
    print('Set solar panel direction:')
    direction = input().lower()
    if direction in ('north', 'south', 'east', 'west'):
        break

print('Solar panel heading set to:', direction)
if direction == 'nrth': # 1
    print('Warning: Facing north is inefficient for this panel.')

This bug is hard to spot: because the 'nrth' typo is inside a string, the program is still syntactically correct Python code. The program doesn't crash, and the absence of a warning message is easy to overlook. But if we use constants and make the same typo, the error will crash the program because Python will notice that a NRTH constant does not exist:

# Set up constants for each cardinal direction:
NORTH = 'north'
SOUTH = 'south'
EAST = 'east'
WEST = 'west'

while True:
    print('Set solar panel direction:')
    direction = input().lower()
    if direction in (NORTH, SOUTH, EAST, WEST):
        break

print('Solar panel heading set to:', direction)
if direction == NRTH: # 1
    print('Warning: Facing north is inefficient for this panel.')

The NameError exception raised by the line of code with the NRTH typo makes the error immediately apparent when you run the program:

Set solar panel direction:
west
Solar panel heading set to: west
Traceback (most recent call last):
  File "panelset.py", line 14, in <module>
    if direction == NRTH:
NameError: name 'NRTH' is not defined

Magic numbers are a code smell because they don't communicate their purpose, making your code less readable, harder to update, and prone to undetectable typos. The solution is to use constant variables.

Commented out code and zombie code

Commenting out code so that it doesn't run is fine as a temporary measure. You may want to skip some lines to test other functionality, and commenting them out makes them easy to add back later. But if the commented-out code stays in place, it becomes a complete mystery why it was removed and under what circumstances it might be needed again. Take a look at the example below:

doSomething()
#doAnotherThing()
doSomeImportantTask()
doAnotherThing()

This code raises many unanswered questions: Why was doAnotherThing() commented out? Will we ever include it again? Why wasn't the second call to doAnotherThing() commented out? Were there originally two calls to doAnotherThing(), or was there one call that was moved after doSomeImportantTask()? Is there a reason we shouldn't remove the commented-out code? There are no ready-made answers to these questions.

Zombie code is code that is unreachable or logically can never run. For example, code inside a function but after a return statement, code in an if statement block with an always False condition, or code in a function that is never called is all zombie code. To see this in action, enter the following in an interactive shell:

>>> import random
>>> def coinFlip():
...    if random.randint(0, 1):
...        return 'Heads!'
...    else:
...        return 'Tails!'
...    return 'The coin landed on its edge!'
...
>>> print(coinFlip())
Tails!

The return 'The coin landed on its edge!' line is zombie code because the code in the if and else blocks returns before execution could ever reach that line. Zombie code is misleading because programmers reading it assume that it's an active part of the program, when in effect it is the same as commented-out code.

Stubs are an exception to these code smells. These are placeholders for future code, such as functions or classes that have not yet been implemented. In place of real code, a stub contains a pass statement, which does nothing (it is also called a no-op). The pass statement exists so that you can create stubs in places where the language syntax requires some code:

>>> def exampleFunction():
...    pass
...

When this function is called, it does nothing. Instead, it's just meant to indicate that code will eventually be added.

Alternatively, to avoid accidentally calling an unimplemented function, you can stub it with a single raise NotImplementedError statement. This will immediately indicate that the function is not ready to be called:

>>> def exampleFunction():
...    raise NotImplementedError
...
>>> exampleFunction()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 2, in exampleFunction
NotImplementedError

Raising a NotImplementedError will warn you whenever your program accidentally calls a stub function or method.

Commented out code and zombie code are both code smells because they mislead the programmer into thinking that the code is an executable part of the program. Instead, delete them and use a version control system, such as Git or Subversion, to track changes. Version control is covered in Chapter 12. With version control, you can remove code from your program and easily add it back later if needed.

Print debugging

Print debugging is the practice of placing temporary print() calls in a program to display the values of variables, and then rerunning the program. The process generally follows these steps:

Notice a bug in your program.
Add print() calls to view the values of some variables.
Rerun the program.
Add some more print() calls, because the earlier ones didn't show enough information.
Rerun the program.
Repeat the previous two steps a few more times before finally figuring out the bug.
Rerun the program.
Realize you forgot to remove some of the print() calls, and remove them.

Print debugging seems quick and easy. But it often requires running the program many times before it displays the information you need to fix the bug. The solution is to use a debugger or set up log files for your program. With a debugger, you can run your code one line at a time and inspect any variable. Using a debugger may seem slower than simply inserting a print() call, but it saves you time in the long run.

A log file can record a lot of information about your program so you can compare one run of it with previous runs. In Python, the built-in logging module provides the ability to easily create log files with just three lines of code:

import logging
logging.basicConfig(filename='log_filename.txt', level=logging.DEBUG, format='%(asctime)s - %(levelname)s - %(message)s')
logging.debug('This is a log message.')

After importing the logging module and setting up its basic configuration, you can call logging.debug() to write information to a text file instead of using print() to display it on the screen. Unlike print debugging, calling logging.debug() makes it obvious which output is debugging information and which output is the result of the program's normal operation. You can find out more about debugging in Chapter 11 of Automate the Boring Stuff with Python, which you can read online at autbor.com/2e/c11.
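As a rough sketch of how a temporary print() call could be swapped for logging, consider the example below. The averageScore() function and the log file name are made up for illustration, and the force=True argument (available in Python 3.8 and later) is an assumption added here so the configuration takes effect even if logging was already set up elsewhere.

```python
import logging
import os
import tempfile

# Hypothetical log file location, placed in the system's temp folder.
LOG_FILENAME = os.path.join(tempfile.gettempdir(), 'averageScore_debug.txt')

# force=True (Python 3.8+) replaces any logging setup done earlier.
logging.basicConfig(filename=LOG_FILENAME, filemode='w', level=logging.DEBUG,
                    format='%(asctime)s - %(levelname)s - %(message)s',
                    force=True)

def averageScore(scores):
    """Return the average of a list of numbers."""
    total = sum(scores)
    # Instead of print('total is', total), record the value in the log file:
    logging.debug('averageScore() total=%s, count=%s', total, len(scores))
    return total / len(scores)

# Only the program's real output appears on the screen.
print(averageScore([90, 80, 100]))
```

The debugging details end up in the log file, so there are no leftover print() calls to hunt down and remove afterward.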

Variables with numeric suffixes

When writing a program, you may need multiple variables that store the same kind of data. In these cases, you might be tempted to reuse a variable name by adding a numeric suffix to it. For example, if you were working on a registration form that required the user to enter their password twice to prevent typos, you could store those password strings in variables called password1 and password2. These numeric suffixes don't do a good job of describing what the variables contain or how they differ. Nor do they indicate how many of these variables there are: is there also a password3 or a password4? Try creating distinct names instead of lazily adding numeric suffixes. For this password example, better names would be password and confirm_password.

Let's look at another example: if you have a function that deals with start and end coordinates, you might have parameters named x1, y1, x2, and y2. But names with numeric suffixes don't convey as much information as the names start_x, start_y, end_x, and end_y. It's also clearer that the start_x and start_y variables are related to each other, compared to x1 and y1.

If your numeric suffixes go beyond 2, you probably want to use a list or set data structure to store your data as a collection. For example, you can store the values of pet1Name, pet2Name, pet3Name, and so on in a list called petNames.

This code smell doesn't apply to every variable that merely ends in a number. For example, it's perfectly fine to have a variable named enableIPv6, because the "6" is part of the "IPv6" proper name, not a numeric suffix. However, if you use numeric suffixes for a series of variables, consider replacing them with a data structure such as a list or dictionary.
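The pet name example might look something like the following sketch. The specific pet names are invented for illustration.

```python
# Before: one variable per pet, distinguished only by a number.
pet1Name = 'Zophie'
pet2Name = 'Pooka'
pet3Name = 'Fat-tail'

# After: a single list holds any number of names.
petNames = ['Zophie', 'Pooka', 'Fat-tail']

# The list can be looped over and counted, which separate variables can't.
for petName in petNames:
    print('Hello, ' + petName + '!')

print('There are', len(petNames), 'pets.')
```

Adding a fourth pet now means appending one value to the list rather than inventing a pet4Name variable and updating every place that uses the others.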

Classes that should just be functions or modules

Programmers using languages such as Java are accustomed to creating classes to organize their program code. For example, let's look at this example Dice class, which has a roll() method:

>>> import random
>>> class Dice:
...    def __init__(self, sides=6):
...        self.sides = sides
...    def roll(self):
...        return random.randint(1, self.sides)
...
>>> d = Dice()
>>> print('You rolled a', d.roll())
You rolled a 1

This may seem like well-organized code, but consider what we actually need: a random number between 1 and 6. We can replace the entire class with a simple function call:

>>> print('You rolled a', random.randint(1, 6))
You rolled a 6

Compared to other languages, Python takes a relaxed approach to organizing code because its code doesn't need to exist in classes or other boilerplate structures. If you find that you're creating objects just to make a single function call, or if you're writing classes that contain only static methods, these code smells indicate that you might be better off writing functions instead.

In Python, we use modules rather than classes to group functions together. Since classes have to be in a module anyway, putting code in classes just adds an unnecessary layer of organization to the code. Chapters 15 through 17 discuss these object-oriented design principles in more detail. Jack Diederich's PyCon 2012 talk "Stop Writing Classes" covers other possible ways to complicate Python code.
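To illustrate the static-methods case, here is a small sketch. The DiceTools class, the rollMany() function, and the dice.py module name mentioned in the comment are all hypothetical; they only extend the Dice example above.

```python
import random

# A class used only as a container for static methods (the code smell):
class DiceTools:
    @staticmethod
    def roll(sides=6):
        return random.randint(1, sides)

    @staticmethod
    def rollMany(count, sides=6):
        return [DiceTools.roll(sides) for i in range(count)]

# The same behavior as plain functions, which could live in a module
# such as a hypothetical dice.py file:
def roll(sides=6):
    """Return a random integer from 1 to sides."""
    return random.randint(1, sides)

def rollMany(count, sides=6):
    """Return a list of count random rolls."""
    return [roll(sides) for i in range(count)]

print('You rolled', rollMany(3))
```

Callers of the function version write roll() or dice.roll() instead of DiceTools.roll(), and no object ever has to be created.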

Nested list comprehensions

List comprehensions are a concise way to create complex list values. For example, to create a list of strings of digits for the numbers 0 through 99, excluding all multiples of 5, you would usually need a for loop:

>>> spam = []
>>> for number in range(100):
...    if number % 5 != 0:
...        spam.append(str(number))
...
>>> spam
['1', '2', '3', '4', '6', '7', '8', '9', '11', '12', '13', '14', '16', '17',
`--snip--`
'86', '87', '88', '89', '91', '92', '93', '94', '96', '97', '98', '99']

Alternatively, you can use the list comprehension syntax to create the same list in one line of code:

>>> spam = [str(number) for number in range(100) if number % 5 != 0]
>>> spam
['1', '2', '3', '4', '6', '7', '8', '9', '11', '12', '13', '14', '16', '17',
`--snip--`
'86', '87', '88', '89', '91', '92', '93', '94', '96', '97', '98', '99']

Python also has syntax for set comprehensions and dictionary comprehensions:

>>> spam = {str(number) for number in range(100) if number % 5 != 0} # 1
>>> spam
{'39', '31', '96', '76', '91', '11', '71', '24', '2', '1', '22', '14', '62',
`--snip--`
'4', '57', '49', '51', '9', '63', '78', '93', '6', '86', '92', '64', '37'}
>>> spam = {str(number): number for number in range(100) if number % 5 != 0} # 2
>>> spam
{'1': 1, '2': 2, '3': 3, '4': 4, '6': 6, '7': 7, '8': 8, '9': 9, '11': 11,
`--snip--`
'92': 92, '93': 93, '94': 94, '96': 96, '97': 97, '98': 98, '99': 99}

A set comprehension (# 1) uses curly braces instead of square brackets and produces a set value. A dictionary comprehension (# 2) produces a dictionary value and uses a colon to separate the key and the value in the comprehension.

These comprehensions are concise and can make your code more readable. Note, however, that a comprehension produces a list, set, or dictionary from an iterable object (in this case, the range object returned by the range(100) call). Lists, sets, and dictionaries are themselves iterable objects, which means you can nest comprehensions inside comprehensions, as in the following example:

>>> nestedIntList = [[0, 1, 2, 3], [4], [5, 6], [7, 8, 9]]
>>> nestedStrList = [[str(i) for i in sublist] for sublist in nestedIntList]
>>> nestedStrList
[['0', '1', '2', '3'], ['4'], ['5', '6'], ['7', '8', '9']]

But nested list comprehensions (or nested set and dictionary comprehensions) cram a lot of complexity into a small amount of code, making your code hard to read. It's better to expand the list comprehension into one or more for loops:

>>> nestedIntList = [[0, 1, 2, 3], [4], [5, 6], [7, 8, 9]]
>>> nestedStrList = []
>>> for sublist in nestedIntList:
...    nestedStrList.append([str(i) for i in sublist])
...
>>> nestedStrList
[['0', '1', '2', '3'], ['4'], ['5', '6'], ['7', '8', '9']]

Comprehensions can also contain multiple for expressions, although this also tends to produce unreadable code. For example, the following list comprehension produces a flat list from nested lists:

>>> nestedList = [[0, 1, 2, 3], [4], [5, 6], [7, 8, 9]]
>>> flatList = [num for sublist in nestedList for num in sublist]
>>> flatList
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

This list comprehension contains two for expressions, and it can be difficult for even experienced Python developers to understand. The expanded form uses two for loops to create the same flat list and is much easier to read:

>>> nestedList = [[0, 1, 2, 3], [4], [5, 6], [7, 8, 9]]
>>> flatList = []
>>> for sublist in nestedList:
...    for num in sublist:
...        flatList.append(num)
...
>>> flatList
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

List comprehensions are syntactic shortcuts that can produce concise code, but don't go overboard by nesting them inside one another.

Empty except blocks

Catching exceptions is one of the main ways to ensure that a program continues to run even when something goes wrong. When an exception occurs and there is no except block to handle it, the Python program stops running immediately and crashes. This may result in the loss of unsaved work or leave files in a half-finished state.

You can prevent crashes by supplying an except block that contains code for handling the error. But it can be hard to decide how to handle an error, and the programmer might simply leave the except block empty with a pass statement. For example, in the following code, we use pass to create an except block that does nothing:

>>> try:
...    num = input('Enter a number: ')
...    num = int(num)
... except ValueError:
...    pass
...
Enter a number: forty two
>>> num
'forty two'

This code doesn't crash when 'forty two' is passed to int(), because the ValueError that int() raises is handled by the except statement. But doing nothing in response to an error can be worse than crashing. Programs crash so they don't continue to run with bad data or in an incomplete state, which could lead to even worse bugs later on. Our code doesn't crash when non-digit characters are entered. But now the num variable contains a string instead of an integer, which can cause problems whenever num is used. Our except statement isn't so much handling the error as hiding it.

Handling exceptions with bad error messages is another code smell. Check out this example:

>>> try:
...    num = input('Enter a number: ')
...    num = int(num)
... except ValueError:
...    print('An incorrect value was passed to int()')
...
Enter a number: forty two
An incorrect value was passed to int()

It's good that this code doesn't crash, but it doesn't give the user enough information to know how to fix the problem. Error messages are meant to be read by users, not programmers. Not only does this error message contain technical details that a user wouldn't understand, such as a reference to the int() function, but it also doesn't tell the user how to fix the problem. Error messages should explain what happened as well as what the user should do about it.

It's easier for a programmer to quickly write a single, unhelpful description than detailed steps a user can take to solve a problem. But remember, if your program cannot handle all possible exceptions, then it is an incomplete program.
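One possible way to handle the ValueError, sketched below, is to explain the problem in the user's terms and ask again. The readWholeNumber() function and its inputFunc parameter are hypothetical names introduced here; inputFunc defaults to the built-in input() and exists only so the function can be tried out without typing at the keyboard.

```python
def readWholeNumber(prompt, inputFunc=input):
    """Keep asking until the user enters a whole number; return it as an int."""
    while True:
        response = inputFunc(prompt)
        try:
            return int(response)
        except ValueError:
            # Say what went wrong and what to do, without mentioning int().
            print('"' + response + '" is not a whole number. '
                  'Please type digits only, such as 42.')

# Simulate a user who first types words and then types digits:
fakeResponses = iter(['forty two', '42'])
num = readWholeNumber('Enter a number: ', lambda prompt: next(fakeResponses))
print('You entered', num)
```

With this approach, num is always an integer by the time the rest of the program uses it, and the user is told how to recover from the mistake.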

Code smell myths

Some code smells aren't really code smells at all. Programming is full of half-remembered bad advice that is taken out of context or persists long after it has lost its usefulness. I blame technical book authors who are too eager to lecture.

You've probably been told that some of these practices are code smells, but most of them are fine. I call them code smell myths: they are warnings that you can and should ignore. Let's take a look at a few of them.

Myth: Functions should have only one return statement at the end

The "one entry, one exit" idea comes from misunderstood advice from the days of assembly and FORTRAN language programming. These languages allowed you to enter a subroutine (a function-like construct) at any point, including in the middle of it, making it difficult to debug which parts of the subroutine had been executed. Functions don't have this problem (execution always starts at the beginning of the function). But the advice stuck around and became "functions and methods should have only one return statement, which should be at the end of the function or method."

Trying to achieve a single return statement per function or method often requires a convoluted series of if-else statements that is far more confusing than having multiple return statements. It's fine to have more than one return statement in a function or method.

Myth: Functions should have at most one try statement
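As an illustration, the sketch below writes the same hypothetical search function twice: once forced into a single return statement, and once with two. Both function names are invented for this example.

```python
def findFirstNegativeSingleReturn(numbers):
    # One return statement: extra variables must track the search state.
    result = None
    found = False
    for number in numbers:
        if not found and number < 0:
            result = number
            found = True
    return result

def findFirstNegative(numbers):
    # Two return statements: the function exits as soon as it has an answer.
    for number in numbers:
        if number < 0:
            return number
    return None  # No negative number was found.

print(findFirstNegative([4, -3, -7]))              # -3
print(findFirstNegativeSingleReturn([4, -3, -7]))  # -3
```

Both versions return the same values, but the second has fewer variables to keep track of and stops looping once the answer is known.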

"Functions and methods should do one thing" is usually good advice. But it's going too far to interpret this to mean that exception handling should be done in a separate function. For example, let's look at a function that indicates whether the file we want to delete no longer exists:

>>> import os
>>> def deleteWithConfirmation(filename):
...    try:
...        if (input('Delete ' + filename + ', are you sure? Y/N') == 'Y'):
...            os.unlink(filename)
...    except FileNotFoundError:
...        print('That file already did not exist.')
...

Proponents of this code smell argue that since functions should always do one thing, and error handling is one thing, we should split this function into two functions. They argue that if you use a try-except statement, it should be the first statement in the function and should wrap all of the function's code, like so:

>>> import os
>>> def handleErrorForDeleteWithConfirmation(filename):
...    try:
...        _deleteWithConfirmation(filename)
...    except FileNotFoundError:
...        print('That file already did not exist.')
...
>>> def _deleteWithConfirmation(filename):
...    if (input('Delete ' + filename + ', are you sure? Y/N') == 'Y'):
...        os.unlink(filename)
...

This is unnecessarily complicated code. The _deleteWithConfirmation() function is now marked as private with the _ underscore prefix to indicate that it should never be called directly, only indirectly through a call to handleErrorForDeleteWithConfirmation(). The new function's name is awkward, because we call it intending to delete a file, not to handle an error that occurs while deleting a file.

Your functions should be small and simple, but that doesn't mean they should always be limited to doing "one thing" (however you define it). It's okay if your function has more than one try-except statement and those statements don't wrap all of the function's code.

Myth: Flag parameters are bad

Boolean arguments to a function or method call are sometimes called flag parameters. In programming, a flag is a value representing a binary setting, such as "enabled" or "disabled", and it is usually represented by a Boolean value. We can describe these settings as set (that is, True) or cleared (that is, False).

The erroneous idea that flag parameters to function calls are bad is based on the assertion that a function does two completely different things depending on the value of the flag, as in the following example:

def someFunction(flagArgument):
    if flagArgument:
        # Run some code...
        pass
    else:
        # Run some completely different code...
        pass

Indeed, if your function looks like this, you should create two separate functions instead of having a single parameter decide which half of the function's code to run. But most functions that take flag parameters don't do this. For example, you can pass a Boolean value for the sorted() function's reverse keyword argument to determine the sort order. Splitting the function into two functions named sorted() and reverseSorted() wouldn't improve the code (and would also increase the amount of documentation required). So the idea that flag parameters are always bad is a code smell myth.
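A brief sketch of a flag that only adjusts one behavior, rather than selecting between two unrelated code paths, follows. The scores list and the rankScores() function are hypothetical; sorted() and its reverse keyword argument are the real built-in.

```python
scores = [72, 95, 88]

# sorted() takes a Boolean flag through its reverse keyword argument.
print(sorted(scores))                # Ascending order: [72, 88, 95]
print(sorted(scores, reverse=True))  # Descending order: [95, 88, 72]

# A hypothetical function of our own that takes a flag parameter.
def rankScores(scores, highestFirst=False):
    """Return a new sorted list of scores; the flag only adjusts the order."""
    return sorted(scores, reverse=highestFirst)

print(rankScores(scores, highestFirst=True))  # [95, 88, 72]
```

In both cases the function does one job, sorting, and the flag merely tunes how that job is done.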

Myth: Global variables are bad

Functions and methods are like mini-programs within a program: they contain code, including local variables that are forgotten when the function returns. This is similar to the case where variables are forgotten after program termination. Functions are independent: their code either executes correctly or with errors, depending on the arguments passed when calling them.

But functions and methods using global variables lose some useful isolation. Every global variable you use in a function actually becomes another input to the function, like a parameter. More parameters means more complexity, which in turn means higher likelihood of errors. If a bug occurs in a function due to a wrong value in a global variable, that wrong value could be set anywhere in the program. To search for a possible cause of this error value, you can't just analyze the code in the function or the line of code that calls the function; you have to look at the code of the entire program. Therefore, you should limit the use of global variables.

For example, let's look at the calculateSlicesPerGuest() function in an imaginary partyPlanner.py program that is several thousand lines long. I've included line numbers to give you an idea of the size of the program:

1504. def calculateSlicesPerGuest(numberOfCakeSlices):
1505.     global numberOfPartyGuests
1506.     return numberOfCakeSlices / numberOfPartyGuests

Suppose when we run this program, we encounter the following exception:

Traceback (most recent call last):
  File "partyPlanner.py", line 1898, in <module>
    print(calculateSlicesPerGuest(42))
  File "partyPlanner.py", line 1506, in calculateSlicesPerGuest
    return numberOfCakeSlices / numberOfPartyGuests
ZeroDivisionError: division by zero

The program has a division by zero error, caused by the line return numberOfCakeSlices / numberOfPartyGuests. The numberOfPartyGuests variable must have been set to 0 to cause this, but where did numberOfPartyGuests get this value? Because it's a global variable, it could have happened anywhere in the thousands of lines of this program! From the traceback information, we know that calculateSlicesPerGuest() was called on line 1898 of our fictional program. If we look at line 1898, we can find what argument was passed for the numberOfCakeSlices parameter. But the numberOfPartyGuests global variable could have been set at any time before the function call.

Note that global constants are not considered bad programming practice. Because their value never changes, they don't introduce complexity to your code like other global variables. When programmers say "global variables are bad", they don't mean constant variables.

Global variables increase the amount of debugging needed to find where an exception-causing value could have been set. This makes heavy use of global variables a bad idea. But the idea that all global variables are bad is a code smell myth. Global variables are useful in smaller programs or for keeping track of settings that apply to an entire program. If you can avoid using a global variable, that's a sign you probably should. But "global variables are bad" is an oversimplified view.

Myth: Comments are unnecessary

Bad comments are indeed worse than no comments at all. A comment with outdated or misleading information creates more work for the programmer instead of a better understanding. But this potential problem is sometimes used to declare that all comments are bad. This view holds that every comment should be replaced with more readable code, to the point that programs should have no comments at all.

Comments are written in English (or whatever language the programmer speaks), which allows them to convey information in a way that variable, function, and class names cannot. But writing concise and effective comments is hard. Comments, like code, require rewriting and multiple iterations to get right. We understand our code right after we write it, so writing comments can seem like pointless extra work. As a result, programmers are easily drawn to the idea that comments are unnecessary.

In practice, programs with too few or no comments are far more common than programs with too many or misleading comments. Rejecting comments is like saying, "It's only 99.999991% safe to fly across the Atlantic in a passenger plane, so I'm going to swim it."
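When a global variable can be avoided, one common alternative is to pass the value in as an argument. The sketch below reworks calculateSlicesPerGuest() that way; the extra parameter and the ValueError check are assumptions added for illustration and are not part of the imaginary partyPlanner.py program.

```python
def calculateSlicesPerGuest(numberOfCakeSlices, numberOfPartyGuests):
    """Return how many cake slices each guest gets."""
    # Assumption for this sketch: reject a guest count of zero up front
    # with a message that names the bad value.
    if numberOfPartyGuests == 0:
        raise ValueError('numberOfPartyGuests must be greater than 0.')
    return numberOfCakeSlices / numberOfPartyGuests

print(calculateSlicesPerGuest(42, 6))  # 7.0
```

Now both inputs appear on the line that calls the function, so a traceback pointing at that line shows where every value came from.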

Chapter 10 has more information on how to write effective comments.

Summary

A code smell indicates that there may be a better way to write your code. Code smells don't necessarily require a change, but they should prompt you to take another look. The most common code smell is duplicate code, which can signal an opportunity to put code inside a function or loop. This ensures that future code changes only need to be made in one place. Other code smells include magic numbers, which are unexplained values in code that can be replaced by constants with descriptive names. Similarly, commented-out code and zombie code are never run by the computer and may mislead programmers who later read the program's code. It's best to remove them and rely on a source control system like Git if you need to add them back to your program later.

Print debugging uses print() calls to display debugging information. Although this approach is easy, relying on a debugger and log files to diagnose bugs is often faster in the long run.

Variables with numeric suffixes, such as x1, x2, x3, and so on, are usually best replaced with a single variable containing a list. Unlike languages such as Java, in Python we use modules rather than classes to group functions together. A class that contains a single method or only static methods is a code smell suggesting that you should put that code in a module rather than a class. And although list comprehensions are a concise way to create list values, nested list comprehensions are usually unreadable.

Additionally, any exception handled with an empty except block is a code smell: you're just silencing the error, not handling it. A short, cryptic error message is just as useless to the user as no error message at all.

Along with these code smells are the code smell myths: programming advice that is no longer valid or that has, over time, proven counterproductive. These include putting only a single return statement or try-except block in each function, never using flag parameters or global variables, and believing that comments are unnecessary.

Of course, like all programming advice, the code smells described in this chapter may or may not apply to your project or to your personal preferences. Best practice is not an objective measure. As you gain more experience, you'll come to different conclusions about what code is readable or reliable, but the advice in this chapter outlines what to consider.