Python Tutorial | List Comprehension & Dictionary Comprehension

This Python teaching column aims to give beginners a systematic and comprehensive introduction to Python programming. Through step-by-step explanations of the language basics and programming logic, combined with practical cases, it helps novices get started with Python easily.

Table of contents

Part1 Foreword

Part2 List comprehension

1. Create a list with one line of code

2. Conditional statements in list comprehensions

Part3 Dictionary comprehension

1. Create a dictionary with one line of code

2. Generate a new dictionary based on the old dictionary

3. Conditional statements in dictionary comprehensions

Part4 Summary


Part1 Foreword

Lists and dictionaries are two commonly used data structures in Python. A list is an ordered sequence: its elements are written inside [] and separated by commas, and they are accessed by index. A dictionary is a collection of key-value pairs written inside {}: each element is stored in the form 'key':'value', and a value is accessed through its key rather than through a position (since Python 3.7 a dictionary does preserve insertion order, but it still cannot be indexed by position).

In previous tutorials, we introduced the common methods for creating lists and dictionaries.

Those methods suit lists or dictionaries whose contents are simple enough to type in by hand. If the elements are more complex to produce (for example, they have to be generated with loops and conditional statements), the traditional approach makes the code verbose. The advantage of a comprehension is that it offers a concise way to create a list (or dictionary): looping and filtering can be completed in one line, which greatly reduces the amount of code. Another important reason to master comprehensions is that we will inevitably read other people's code, where comprehensions are common.

Below is a simple usage scenario for a list comprehension. When segmenting text in text analysis, the segmentation results are saved in a list named AllWords and the stop words are saved in a list named Stopwords. Now we want to remove the stop words and the words of length 1 from the segmentation results. To see the advantage of comprehensions more intuitively, we achieve this goal in two ways: without and with a comprehension.

# Define the variables first; they are the same with or without a comprehension
# Word segmentation results
AllWords = ['加入','企研','·','社科','大','数据','平台','会员',',','用','最','独家','的','数据',',','学','最','实用','的','Python',',','画','最酷','的','图','!']
# Stop words
Stopwords = ['·',',', '用','最']

# 1. Using a for loop
Words = []
for word in AllWords:
    if word not in Stopwords and len(word) > 1:
        Words.append(word)
# ['加入', '企研', '社科', '数据', '平台', '会员', '独家', '数据', '实用', 'Python', '最酷']

# 2. Using a list comprehension
Words = [word for word in AllWords if word not in Stopwords and len(word) > 1]
# Same result as above

From the results above, we can see that a comprehension is more concise than an explicit loop. On large data sets a comprehension also tends to run somewhat faster than the equivalent for loop with append(), although the gain depends on how much work is done per element. In this article, we introduce in detail the two most common kinds of comprehension: list comprehensions and dictionary comprehensions.
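The speed difference can be checked with the standard timeit module. The following is only a rough sketch: the test data (100,000 integers) and the two function names are made up for illustration, and the measured times will vary from machine to machine.

```python
import timeit

# Hypothetical test data, not taken from the article
numbers = list(range(100_000))

def with_loop():
    # Explicit loop: create an empty list, then append one element at a time
    result = []
    for n in numbers:
        if n % 2 == 0:
            result.append(n * 2)
    return result

def with_comprehension():
    # The same logic written as a list comprehension
    return [n * 2 for n in numbers if n % 2 == 0]

# Both versions must build exactly the same list
assert with_loop() == with_comprehension()

# Run each version 5 times and report the total time
loop_time = timeit.timeit(with_loop, number=5)
comp_time = timeit.timeit(with_comprehension, number=5)
print(f"for loop:      {loop_time:.3f} s")
print(f"comprehension: {comp_time:.3f} s")
```

If the two times turn out close to each other, that is expected for cheap per-element work; the main benefit of a comprehension remains conciseness.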

This tutorial is based on pandas version 1.5.3.

All Python code in this article was written in Jupyter Notebook, an interactive development environment, running inside the integrated development environment Visual Studio Code (VS Code). Please open the code shared in this article with Jupyter Notebook.


Part2 List comprehension

Comprehensions have many uses, and this article does not cover them all. We mainly introduce how to use comprehensions to generate simple lists/dictionaries and lists/dictionaries with conditional statements. Before formally introducing their use, we need to master the basic syntax of a list comprehension:

list = [expression for member in iterable (if conditional)]

The syntax above consists of four parts, expression, member, iterable and conditional, where:

  • expression can be any legal expression that returns a value. This value becomes an element of the created list.

  • iterable is an iterable object. Each pass of the loop takes one value from it, and that value is member.

  • conditional is an optional condition that lets the list comprehension keep only the values that meet the requirement. A condition can also be written in the expression position, in the form of an if...else expression, to compute the list elements differently depending on the condition.

In one sentence: a list comprehension loops over the iterable object, processes each value into an element according to the expression, and stores it in a list. The final result of a list comprehension is a list containing all the processed elements.

In Python, iterable refers to an iterable object, that is, an object whose elements can be obtained using loop traversal. Common iterable objects include strings, lists, tuples, dictionaries, sets, etc.
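As a small illustration, the same comprehension syntax works with each kind of iterable object listed above. The sample values are invented for this sketch.

```python
# Each comprehension below loops over a different kind of iterable object.
from_string = [ch.upper() for ch in "abc"]        # string: one character per pass
from_tuple = [n * 2 for n in (1, 2, 3)]           # tuple
from_dict = [key for key in {"x": 1, "y": 2}]     # dictionary: iterating yields the keys
from_set = sorted([n + 1 for n in {10, 20}])      # set: order is not guaranteed, so sort
from_range = [n ** 2 for n in range(4)]           # range object

print(from_string)  # ['A', 'B', 'C']
print(from_tuple)   # [2, 4, 6]
print(from_dict)    # ['x', 'y']
print(from_set)     # [11, 21]
print(from_range)   # [0, 1, 4, 9]
```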

1. Create a list with one line of code

Now let's look at how list comprehensions are used. From the example in the foreword, the process of creating a list with a for loop can be summarized in three steps:

  1. Create an empty list object

  2. Loop through an iterable object iterable, taking out one value each time

  3. Process the value taken out by the loop according to the expression, and append the result to the empty list as a new element

How do we rewrite such a for loop as a list comprehension? It is not difficult: put the expression from the loop body at the front of the comprehension, then write the remaining parts of the loop after it in order. Note also that a list comprehension creates the list elements directly, so there is no need to create an empty list first or to call append(). Below is an example of rewriting a for loop as a list comprehension. Suppose a patent has several patent classification numbers, separated by ;, and we need the main group of each classification number (the part before the slash). The code is as follows:

# Suppose the patent's classification numbers are: A01B02/00;A01B02/10;A01B20/20
FLH = 'A01B02/00;A01B02/10;A01B20/20'

# 1. Using a for loop
FLH_LIST = []
for one_flh in FLH.split(';'):
    flh_zu = one_flh.split('/')[0]
    FLH_LIST.append(flh_zu)
FLH_LIST
# Output: ['A01B02', 'A01B02', 'A01B20']

# 2. Rewritten as a list comprehension
FLH_LIST = [one_flh.split('/')[0] for one_flh in FLH.split(';')]
FLH_LIST
# Same result as above

In the code above, the iterable object (the object the for loop iterates over) is FLH.split(';'), a list, and one_flh takes each value of this list in turn. The expression in the loop body is one_flh.split('/')[0], which takes the part of each patent classification number to the left of /. When the loop is rewritten as a comprehension, this expression moves to the front, so the value obtained in each pass is processed by the expression and the result is saved in the list.

In this example we processed the classification numbers of only one patent, but in practice the same processing may be needed for tens of millions of patent records. In that case we can wrap the comprehension in a custom function and then apply it to the data in batch with apply() or map().
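A minimal sketch of this idea, using only the built-in map() on a plain list. The function name main_groups and the sample records are hypothetical. If the records were stored in a pandas Series instead, the same function could be passed to its apply() or map() method.

```python
def main_groups(flh):
    """Return the part before '/' of every classification number in flh."""
    return [one_flh.split('/')[0] for one_flh in flh.split(';')]

# Hypothetical sample records, one string of classification numbers per patent
patents = [
    'A01B02/00;A01B02/10;A01B20/20',
    'G06F17/30',
    'H04L29/06;H04L12/24',
]

# map() calls main_groups once per record; list() collects the results
results = list(map(main_groups, patents))
print(results)
# [['A01B02', 'A01B02', 'A01B20'], ['G06F17'], ['H04L29', 'H04L12']]
```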

2. Conditional statements in list comprehensions

Replacing a for loop to create a simple list, as in the previous section, is one of the most common uses of list comprehensions. It suits basic data types such as numbers and strings and is especially practical when a set of initial values has to be generated quickly. List comprehensions have another common use: keeping only the elements that meet a condition, according to specific rules, which makes them more flexible. To do this, simply add a conditional statement to the list comprehension.

Now let's review the example from the preface:

# Word segmentation results
AllWords = ['加入','企研','·','社科','大','数据','平台','会员',',','用','最','独家','的','数据',',','学','最','实用','的','Python',',','画','最酷','的','图','!']
Stopwords = ['·',',', '用','最']  # Stop words

# 1. Using a for loop
Words = []
for word in AllWords:
    if word not in Stopwords and len(word) > 1:
        Words.append(word)
Words
# ['加入', '企研', '社科', '数据', '平台', '会员', '独家', '数据', '实用', 'Python', '最酷']

# 2. Using a list comprehension
Words = [word for word in AllWords if word not in Stopwords and len(word) > 1]
Words  # Same result as above

In this example, the conditional statement is if word not in Stopwords and len(word) > 1. It filters the values produced by the loop and keeps only those that are not stop words and are longer than 1 character. The condition can be any legal logical expression. When it is long or complex, it can first be wrapped in a function, and the function call then serves as the condition of the comprehension. Now suppose the filter in the example above becomes: not a stop word, longer than 1 character, and containing Chinese characters. The modified code is as follows:

import re

AllWords = ['加入','企研','·','社科','大','数据','平台','会员',',','用','最','独家','的','数据',',','学','最','实用','的','Python',',','画','最酷','的','图','!']
Stopwords = ['·',',', '用','最']

# Define the condition function used for filtering
def condition(String):
    return String not in Stopwords and len(String) > 1 and bool(re.search("[\u4e00-\u9fa5]", String))
Words = [word for word in AllWords if condition(word)]
Words
# ['加入', '企研', '社科', '数据', '平台', '会员', '独家', '数据', '实用', '最酷']

In the code above, the function that wraps the condition is written at the far right of the comprehension, where it filters the values produced by the loop according to the conditions we specified. When the condition is short, we can write it directly in the comprehension; when it is more complex, we can call a function that wraps it instead. This shows how flexible list comprehensions are.

So far, the only conditional statement we have used in a list comprehension is if. Conditions can also take the if...else... form. When they do, the elements of the list are computed differently depending on the condition, and the comprehension is written slightly differently. To make this easier to understand, let's look at a simple example. Suppose we have a set of numeric data and want the square root of each positive number and the absolute value of each negative number. The code is as follows:

# Initial numeric data
OriginList = [2.3, -4, 6.6, 7.2, -3.7, -1.1]

List = [round(i**0.5, 2) if i > 0 else abs(i) for i in OriginList]
List
# [1.52, 4, 2.57, 2.68, 3.7, 1.1]

The conditional statement in this example, if i > 0 else abs(i), differs from the if-only case: it is placed in front of the loop statement, because a different expression has to be chosen in each pass of the loop. It means that when the value produced by the loop is positive, the element is computed by the expression before if (here the square root, rounded to two decimal places); otherwise it is computed by the expression after else (here the absolute value).
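The two positions of the condition can be compared side by side. This is a small sketch with invented sample data: a trailing if decides whether an element is kept, while an if...else in front of the loop decides what each element becomes. The two can also be combined.

```python
values = [3, -1, 0, 8, -5]   # hypothetical sample data

# Filtering: the if after the loop decides WHETHER a value is kept
positives = [v for v in values if v > 0]

# Conditional expression: if...else before the loop decides WHAT each element becomes
labels = ["pos" if v > 0 else "non-pos" for v in values]

# Both together: keep non-negative values, and cap them at 5
capped = [v if v < 5 else 5 for v in values if v >= 0]

print(positives)  # [3, 8]
print(labels)     # ['pos', 'non-pos', 'non-pos', 'pos', 'non-pos']
print(capped)     # [3, 0, 5]
```

Note that the filtering form can shorten the list, whereas the if...else form always produces one element per input value.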

Part3 Dictionary comprehension

Dictionary comprehensions and list comprehensions are logically similar, and they can both be used in combination with loops and conditional statements. The difference is that dictionary comprehension returns a dictionary, and its basic syntax is slightly different from list comprehension:

dict = {key_expression:value_expression for key, value in iterable (if conditional)}

In this syntax, key_expression and value_expression are the expressions for the key and the value of the dictionary, respectively. As in a list comprehension, iterable is an iterable object, and key, value receive the pair of values taken out in each pass of the loop. Usually items() or zip() is used so that each pass returns a pair of values, which serve as the key and the value. (if conditional) is the conditional statement of the dictionary comprehension. It can be placed after the loop statement, and it can also be placed after the expressions key_expression and value_expression to modify the values returned by the loop. The syntax of the latter is as follows:

dict = {key_expression:value_expression1 (if conditional else value_expression2) for key, value in iterable}

This conditional statement means: when the if condition is met, the key-value pair key_expression:value_expression1, computed from the key, value returned by the loop, becomes an element of the dictionary; otherwise the key-value pair key_expression:value_expression2 becomes the element.
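The two syntax forms can be tried on a small example. The names and prices below are invented for illustration; zip() and items() are the built-in ways of producing a pair of values in each pass, as mentioned above.

```python
names = ["apple", "banana", "cherry"]   # hypothetical keys
prices = [3.5, 1.2, 8.0]                # hypothetical values

# zip() pairs the two lists element by element
price_of = {name: price for name, price in zip(names, prices)}

# items() yields the (key, value) pairs of an existing dictionary;
# the trailing if keeps only the pairs that meet the condition
expensive = {name: price for name, price in price_of.items() if price > 2}

# if...else after the value expression chooses between two values
level = {name: "high" if price > 2 else "low" for name, price in price_of.items()}

print(price_of)   # {'apple': 3.5, 'banana': 1.2, 'cherry': 8.0}
print(expensive)  # {'apple': 3.5, 'cherry': 8.0}
print(level)      # {'apple': 'high', 'banana': 'low', 'cherry': 'high'}
```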

1. Create a dictionary with one line of code

Let's illustrate the use of dictionary comprehensions with an example. Suppose we are doing text analysis and need to segment a sentence and build a dictionary that maps each word to its part of speech. The code using a for loop is as follows:

from jieba import posseg

Text = "开启全面建设社会主义现代化国家新征程"
# Segment the text and get the part of speech of each word
res = posseg.lcut(Text)

Word_flag = {}   # Create an empty dictionary to hold the elements
for word, flag in res:
    Word_flag[word] = flag
Word_flag
# {'开启': 'v','全面': 'n','建设': 'vn','社会主义': 'n','现代化': 'vn','国家': 'n','新': 'a','征程': 'n'}

In the code above, posseg.lcut() returns a list whose elements are of type pair (each one holds a word and its corresponding part of speech and can be unpacked like a tuple). We now meet the same requirement with a dictionary comprehension. The code is as follows:

Text = "开启全面建设社会主义现代化国家新征程"
res = posseg.lcut(Text)
Word_flag = {word: flag for word, flag in res}
Word_flag  # Same result as above

2. Generate a new dictionary based on the old dictionary

We can also use a dictionary comprehension to build a new dictionary from an existing one. For example, we can replace the part-of-speech tags in the previous example with their Chinese names. The code is as follows:

dict_old = Word_flag.copy()  # Make a copy of the data under a new name
# Build the new values: replace each English part-of-speech tag with its Chinese name
En_Ch = {'v':'动词', 'n':'名词', 'vn':'名动词', 'a':'形容词'}
dict_new = {key:En_Ch[value] for key, value in dict_old.items()}
dict_new

In the code above, we use items() to loop over the key-value pairs of the original dictionary. By defining key_expression and value_expression flexibly in the dictionary comprehension, we can change the keys and values of the original dictionary as needed to form new key-value pairs.
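As a further sketch, the key expression can be changed in the same way as the value expression. The data below is invented and uses English words instead of the segmentation results. The example also uses dict.get() with a default, one possible way to avoid a KeyError when a tag is missing from the lookup table.

```python
tags = {"run": "v", "dog": "n", "quickly": "d"}   # hypothetical word -> tag dictionary
tag_names = {"v": "verb", "n": "noun"}            # lookup table without an entry for "d"

# Change the keys only: upper-case every word
upper_keys = {word.upper(): tag for word, tag in tags.items()}

# Change the values only: look up the tag name, falling back to the tag itself
readable = {word: tag_names.get(tag, tag) for word, tag in tags.items()}

print(upper_keys)  # {'RUN': 'v', 'DOG': 'n', 'QUICKLY': 'd'}
print(readable)    # {'run': 'verb', 'dog': 'noun', 'quickly': 'd'}
```

With tag_names[tag] instead of tag_names.get(tag, tag), the comprehension would stop with a KeyError at the first unknown tag.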

3. Conditional statements in dictionary comprehensions

When introducing the dictionary comprehension syntax, we mentioned that a conditional statement can be placed at the end to filter the values returned by the loop, or placed after the expressions key_expression and value_expression to choose between different expressions and so modify the values returned by the loop. Below we introduce the two usages in turn.

We again use the word & part-of-speech dictionary above as the initial dictionary. Suppose we now want only the entries whose part of speech is noun. The code is as follows:

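# Make a copy of the original dictionary under a new name
dict_old = Word_flag.copy()
# Filter the values of the loop: keep the words whose part of speech is noun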
dict_new = {key:value for key, value in dict_old.items() if value == 'n'}
dict_new
# {'全面': 'n', '社会主义': 'n', '国家': 'n', '征程': 'n'}

In this example, the if conditional statement filters the values of the loop: a key-value pair is stored in the dictionary only when value == 'n'. Now suppose we only want to know whether each word is a noun. In the new dictionary the keys are still the words of the initial dictionary, but the value is "名词" (noun) when the part of speech in the initial dictionary is 'n', and "非名词" (non-noun) otherwise. The code is as follows:

# Make a copy of the original dictionary under a new name
dict_old = Word_flag.copy()
# Set a different value for each key depending on the condition
dict_new = {key:"名词" if value == "n" else "非名词" for key, value in dict_old.items()}
dict_new
# {'开启': '非名词','全面': '名词','建设': '非名词','社会主义': '名词','现代化': '非名词','国家': '名词','新': '非名词','征程': '名词'}

Part4 Summary

The list comprehensions and dictionary comprehensions introduced in this article provide a concise, flexible and Pythonic way to build lists and dictionaries. Compared with traditional for loops and conditional statements, a comprehension can make simple construction code easier to read and can also save time. However, powerful as they are, comprehensions are not always the best way to create lists and dictionaries; there is no one-size-fits-all method. When the logic is relatively complex (for example, with nested comprehensions), a comprehension may sacrifice readability, so in practice we should choose among the methods according to our needs.
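To make the readability trade-off concrete, here is a sketch of a nested comprehension next to the equivalent loops. The matrix is invented sample data. Which version reads better is a matter of judgment, and with more conditions or more levels of nesting the loop version usually becomes the clearer one.

```python
matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]   # hypothetical sample data

# Nested comprehension: flatten the matrix and keep the even numbers.
# The for clauses are read left to right, like nested for loops.
evens = [n for row in matrix for n in row if n % 2 == 0]

# The equivalent explicit loops
evens_loop = []
for row in matrix:
    for n in row:
        if n % 2 == 0:
            evens_loop.append(n)

print(evens)       # [2, 4, 6, 8]
print(evens_loop)  # [2, 4, 6, 8]
```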


Origin blog.csdn.net/weixin_55633225/article/details/132166624