7 Tricks for Python

We all know that writing Python code is easy at first, but as you add more libraries to your toolkit, your scripts can become verbose and cluttered with unnecessary lines of code. That may get the job done in the short term, but in the long run it causes no small amount of trouble.

In this post, I will share 7 tips to make your data science code in Python more concise. They cover things we do every day, such as modifying values in Pandas data frames, concatenating strings, reading files, and more!

1. Use lambda to modify values in a Pandas data frame

Suppose we have the following data frame, df:

import pandas as pd

data = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
df = pd.DataFrame(data, columns=[0, 1, 2])
IN[1]: print(df)
OUT[1]:    0  1  2
        0  1  2  3
        1  4  5  6
        2  7  8  9

Now suppose that, for some reason, you need to append "01" to each number in column 0. A common approach is to define a function for the task and then use apply() to modify the values of the column.

def add_numbers(x):
    return f'{x}01'
df[0] = df[0].apply(add_numbers)
IN[1]: print(df)
OUT[1]:      0  1  2
        0  101  2  3
        1  401  5  6
        2  701  8  9

It's not complicated, but it's impractical to create a function for each change in the data frame. That's where lambdas come in handy.

A lambda function is similar to a normal Python function, but it can be defined without a name, which makes it a nifty one-liner. The code used before can be reduced in the following way.

df[0] = df[0].apply(lambda x: f'{x}01')

Lambdas are also useful when you are not sure whether a Series offers an accessor (such as .str) for the change you want to make.

For example, suppose column 0 contains lowercase words and we want to capitalize them.

# If you know that .str exists, you can do this
df[0] = df[0].str.title()
# If you don't know about .str, you can still capitalize with a lambda
df[0] = df[0].apply(lambda x: x.title())
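
As a quick check, here is a minimal sketch showing that both approaches give the same values. The data frame `cities` and its contents are made-up sample data, not part of the example above.

```python
import pandas as pd

# Hypothetical sample data: a column of lowercase words
cities = pd.DataFrame({'city': ['new york', 'los angeles', 'san francisco']})

# Option 1: the .str accessor applies the string method to every element
with_str = cities['city'].str.title()

# Option 2: apply() with a lambda that calls the plain str method
with_lambda = cities['city'].apply(lambda x: x.title())

print(with_str.tolist())
print(with_str.tolist() == with_lambda.tolist())
```

One difference worth knowing: the .str methods pass over missing values, while the lambda would fail on them, because a missing value has no .title() method.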

2. Use f-strings to concatenate strings

String concatenation is a very common operation in Python, and it can be done in different ways. The most common way is to use the + operator; however, this operator does not insert any separator between the strings.

Of course, if you want to concatenate "Hello" and "World", a typical workaround is to add a whitespace separator (" ").

print("Hello" + " " + "World")

This does the job, but to write more readable code, we can replace it with an f-string.

hello = 'Hello'
world = 'World'
IN[2]: print(f'{hello} {world}')
OUT[2]: Hello World

In a basic example this may seem unnecessary, but when it comes to concatenating multiple values (as you'll see in tip #3), f-strings save you from writing + " " + over and over. I don't know how many times I had to write the + operator in the past, but not anymore!

Other ways to concatenate strings are the join() and format() string methods; however, f-strings are usually the most readable option for this kind of concatenation.
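
To compare the options side by side, the sketch below builds the same sentence four ways. The variable names and values are made up for illustration.

```python
# Hypothetical values to combine
first = 'Ada'
last = 'Lovelace'
city = 'London'

with_plus = first + ' ' + last + ' lives in ' + city
with_join = ' '.join([first, last, 'lives in', city])
with_format = '{} {} lives in {}'.format(first, last, city)
with_fstring = f'{first} {last} lives in {city}'

print(with_fstring)  # Ada Lovelace lives in London
```

All four produce the same string; the f-string is the only one where the final sentence can be read directly off the code.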

3. Iterate over multiple lists with the zip() function

Have you ever wanted to loop over more than one list in Python? With two lists, you can use enumerate() and index into the second list.

teams = ['Barcelona', 'Bayern Munich', 'Chelsea']
leagues = ['La Liga', 'Bundesliga', 'Premier League']
for i, team in enumerate(teams):
    league = leagues[i]
    print(f'{team} plays in {league}')

However, this becomes impractical when you have more than two lists. A better way is to use the zip() function. It takes several iterables and returns an iterator of tuples, where each tuple gathers the elements that share the same position.

Let's add one more list and see the power of zip()!

teams = ['Barcelona', 'Bayern Munich', 'Chelsea']
leagues = ['La Liga', 'Bundesliga', 'Premier League']
countries = ['Spain', 'Germany', 'UK']
for team, league, country in zip(teams, leagues, countries):
    print(f'{team} plays in {league}. Country: {country}')

The output of the above code is:

Barcelona plays in La Liga. Country: Spain
Bayern Munich plays in Bundesliga. Country: Germany
Chelsea plays in Premier League. Country: UK

Did you notice that we used f-strings in this example? The code becomes more readable, doesn't it?
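
One behavior to keep in mind is what zip() does when the lists have different lengths. The sketch below shortens one list on purpose to show it; the shortened list is an assumption made for illustration.

```python
teams = ['Barcelona', 'Bayern Munich', 'Chelsea']
leagues = ['La Liga', 'Bundesliga']  # one entry short on purpose

# zip() returns an iterator, so wrap it in list() to see the tuples
pairs = list(zip(teams, leagues))
print(pairs)
# [('Barcelona', 'La Liga'), ('Bayern Munich', 'Bundesliga')]

# The same pairs can be turned straight into a dictionary
team_to_league = dict(zip(teams, leagues))
print(team_to_league)
```

zip() stops at the shortest input without raising an error, so 'Chelsea' is silently dropped here. On Python 3.10 and later, passing strict=True makes zip() raise a ValueError instead.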

4. Use list comprehensions

A common step in cleaning and processing data is to modify existing lists. For example, we have the following list that needs to be capitalized:

words = ['california', 'florida', 'texas']

The typical way to capitalize each element of the words list is to create a new capitalized list, do a for loop, use .title(), and append each modified value to the new list.

capitalized = []
for word in words:
    capitalized.append(word.title())

However, the Pythonic way to do this is to use a list comprehension. List comprehensions are an elegant way to build lists.

You can rewrite the above for loop with one line of code:

capitalized = [word.title() for word in words]

This skips the empty-list setup and the explicit append() call from the first example, and the result is the same.
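
A list comprehension can also filter while it transforms. The sketch below keeps only the longer words; the length threshold of 5 is an arbitrary choice for illustration.

```python
words = ['california', 'florida', 'texas']

# Loop version: capitalize only the words longer than 5 letters
long_loop = []
for word in words:
    if len(word) > 5:
        long_loop.append(word.title())

# Comprehension version: the condition goes at the end
long_comp = [word.title() for word in words if len(word) > 5]

print(long_comp)  # ['California', 'Florida']
```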

5. Use the with statement on the file object

When working on a project, we often read from and write to files. The most common way is to use the open() function to open a file, which creates a file object that we can manipulate; then, as good practice, we should call close() to close the file object.

f = open('dataset.txt', 'w')
f.write('new_data')
f.close()

It's easy to remember, but sometimes, after hours of writing code, we might forget to close the file f with f.close(). This is where the with statement comes in handy. The with statement automatically closes the file object f, in the following form:

with open('dataset.txt', 'w') as f:
    f.write('new_data')

With this, we can keep the code short.
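
To see the automatic closing in action, the sketch below checks the file object's closed attribute inside and after the block. It writes to a temporary directory so nothing is left in your project; that path is an assumption made for illustration.

```python
import os
import tempfile

# Throwaway location for the example file
path = os.path.join(tempfile.mkdtemp(), 'dataset.txt')

with open(path, 'w') as f:
    f.write('new_data')
    inside = f.closed   # still open inside the block

after = f.closed        # closed automatically on exit

# The file is also closed when the block raises an error
try:
    with open(path) as g:
        raise ValueError('something went wrong')
except ValueError:
    closed_after_error = g.closed

print(inside, after, closed_after_error)  # False True True
```

The last part is the main advantage over calling close() by hand: the file is closed even if an exception interrupts the block.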

You don't need it to read CSV files, since pandas reads them easily with pd.read_csv(), but it's still useful when reading other types of files. For example, it is often used when reading data from pickle files.

import pickle
# Read the dataset from a pickle file
with open('test', 'rb') as input_file:
    data = pickle.load(input_file)

6. Stop using square brackets to get dictionary items, use .get() instead

For example, consider the following dictionary:

person = {'name': 'John', 'age': 20}

We can get the name and age with person['name'] and person['age'], respectively. However, if for some reason we ask for a key that doesn't exist, such as 'salary', running person['salary'] raises a KeyError.

This is where the get() method comes in handy. The get() method returns the value of the specified key if the key is in the dictionary; if the key is not found, it returns None instead of raising an error. Thanks to this, your code won't break.

person = {'name': 'John', 'age': 20}
print('Name: ', person.get('name'))
print('Age: ', person.get('age'))
print('Salary: ', person.get('salary'))

The output is as follows:

Name:  John
Age:  20
Salary:  None
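
get() also accepts a second argument that is returned in place of None when the key is missing. A minimal sketch, where the fallback value 0 is an arbitrary choice for illustration:

```python
person = {'name': 'John', 'age': 20}

# Fall back to 0 instead of None when the key is missing
salary = person.get('salary', 0)
print(salary)               # 0

# get() only reads; it does not add the key to the dictionary
print('salary' in person)   # False
```

Keep in mind that a None result from get() cannot tell a missing key apart from a key whose stored value is None. When that distinction matters, check with the in operator first.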

7. Multiple assignment

Have you ever wanted to reduce the number of lines of code used to create multiple variables, lists or dictionaries? Well, you can easily do this with multiple assignment.

# Original approach
a = 1
b = 2
c = 3
# Alternative
a, b, c = 1, 2, 3
# Instead of creating several lists on separate lines
data_1 = []
data_2 = []
data_3 = []
data_4 = []
# you can create them in one line with multiple assignment
data_1, data_2, data_3, data_4 = [], [], [], []
# or use a list comprehension
data_1, data_2, data_3, data_4 = [[] for _ in range(4)]
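
A related shortcut, chained assignment, looks similar but behaves differently with lists. The sketch below contrasts the two; the variable names are made up for illustration.

```python
# Chained assignment binds both names to the same list object
shared_1 = shared_2 = []
shared_1.append('x')
print(shared_2)        # ['x']

# Multiple assignment with separate [] literals gives independent lists
data_1, data_2 = [], []
data_1.append('x')
print(data_2)          # []

# Unpacking also swaps values without a temporary variable
a, b = 1, 2
a, b = b, a
print(a, b)            # 2 1
```

This is why the examples above write out a separate [] for each name, or build the lists in a comprehension, when each variable needs its own list.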


Origin blog.csdn.net/BYGFJ/article/details/124096587