Chapter 7: Strings and common data structures

See previous articles:
Chapter 2: Language Elements
Chapter 3: Branch Structure
Chapter 4: Loop Structure
Chapter 5: Constructing Program Logic
Chapter 6: Use of Functions and Modules
Or go to the "Python Tutorial" column to view them.

Resource directory: Code (7)
Article resource download: (1-15 chapters)
Link: https://pan.baidu.com/s/1Mh1knjT4Wwt__7A9eqHmng?pwd=t2j3
Extraction code: t2j3

Using strings

World War II spurred the birth of the modern electronic computer. Computers were initially used to calculate missile trajectories, and for many years after their invention the information they processed was mostly numerical. ENIAC (Electronic Numerical Integrator and Computer), generally regarded as the world's first general-purpose electronic computer, was built at the University of Pennsylvania in the United States and could perform about 5,000 additions per second. Although numerical calculation is still one of the most important everyday tasks of computers, much of the data processed by today's computers exists as text. If we want to manipulate this text with Python programs, we must first understand the string type and its related knowledge.

A string is a finite sequence of zero or more characters. In a Python program, a string is written by enclosing its characters in single quotes or double quotes.

s1 = 'hello, world!'
s2 = "hello, world!"
# Strings that start with three double quotes or three single quotes can span multiple lines
s3 = """
hello, 
world!
"""
print(s1, s2, s3, end='')

You can use \ (backslash) in a string to represent an escape, which means that the character following \ no longer has its original meaning. For example, \n does not represent a backslash followed by the character n but a newline, and \t does not represent a backslash followed by the character t but a tab. So if you want to represent ' inside a single-quoted string, write \'; likewise, to represent \ itself, write \\. You can run the code below to see what the output will be.

s1 = '\'hello, world!\''
s2 = '\n\\hello, world!\\\n'
print(s1, s2, end='')

A \ can also be followed by an octal or hexadecimal number to represent a character. For example, \141 and \x61 both represent the lowercase letter a; the former is octal notation and the latter is hexadecimal notation. A \ can also be followed by a Unicode code point to represent a character; for example, \u9a86\u660a represents the Chinese name 骆昊 (Luo Hao). Run the code below to see what the output is.

s1 = '\141\142\143\x61\x62\x63'
s2 = '\u9a86\u660a'
print(s1, s2)

If you don't want \ in a string to be treated as an escape, you can prefix the string with the letter r. See what the following code outputs.

s1 = r'\'hello, world!\''
s2 = r'\n\\hello, world!\\\n'
print(s1, s2, end='')
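
As a small illustration of the difference, the sketch below writes the same characters once as an ordinary string and once as a raw string and compares their lengths. The variable names are chosen only for this example.

```python
# The same characters typed with and without the r prefix
escaped = 'a\tb\n'    # \t and \n are each a single character (tab, newline)
raw = r'a\tb\n'       # the backslashes are kept as ordinary characters

print(len(escaped))        # 4
print(len(raw))            # 6
print(raw == 'a\\tb\\n')   # True
```

Raw strings tend to be convenient wherever backslashes are meant literally, such as Windows paths or regular expressions.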

Python provides a rich set of operators for strings. We can use the + operator to concatenate strings, the * operator to repeat the contents of a string, and in and not in to determine whether a string contains another string (membership operation). We can also use the [] and [:] operators to extract one character or several characters from a string (indexing and slicing), as shown in the code below.

s1 = 'hello ' * 3
print(s1) # hello hello hello 
s2 = 'world'
s1 += s2
print(s1) # hello hello hello world
print('ll' in s1) # True
print('good' in s1) # False
str2 = 'abc123456'
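# Take the character at the specified position out of the string (subscript operation)
print(str2[2]) # c
# String slicing (from the specified start index to the specified end index)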
print(str2[2:5]) # c12
print(str2[2:]) # c123456
print(str2[2::2]) # c246
print(str2[::2]) # ac246
print(str2[::-1]) # 654321cba
print(str2[-3:-1]) # 45
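
One common use of the reverse slice [::-1] is a palindrome check. The following is a minimal sketch; the function name is_palindrome is made up for this example.

```python
def is_palindrome(text):
    """Return True if text reads the same forwards and backwards."""
    # text[::-1] is a reversed copy of the string
    return text == text[::-1]


print(is_palindrome('level'))   # True
print(is_palindrome('hello'))   # False
```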

In Python, we can also process strings through a series of methods, as shown in the code below.

str1 = 'hello, world!'
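# Use the built-in function len to get the length of the string
print(len(str1)) # 13
# Get a copy of the string with the first letter capitalized
print(str1.capitalize()) # Hello, world!
# Get a copy of the string with the first letter of each word capitalized
print(str1.title()) # Hello, World!
# Get an uppercase copy of the string
print(str1.upper()) # HELLO, WORLD!
# Find the position of a substring in the string
print(str1.find('or')) # 8
print(str1.find('shit')) # -1
# index is similar to find but raises an exception when the substring is not found
# print(str1.index('or'))
# print(str1.index('shit'))  # ValueError: substring not found
# Check whether the string starts with the specified string
print(str1.startswith('He')) # False
print(str1.startswith('hel')) # True
# Check whether the string ends with the specified string
print(str1.endswith('!')) # True
# Center the string in the specified width and pad both sides with the specified character
print(str1.center(50, '*'))
# Right-align the string in the specified width and pad the left side with the specified character
print(str1.rjust(50, ' '))
str2 = 'abc123456'
# Check whether the string consists only of digits
print(str2.isdigit())  # False
# Check whether the string consists only of letters
print(str2.isalpha())  # False
# Check whether the string consists only of digits and letters
print(str2.isalnum())  # True
str3 = '  [email protected] '
print(str3)
# Get a copy of the string with the spaces on both sides trimmed
print(str3.strip())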

As mentioned before, output strings can be formatted in the following way.

a, b = 5, 10
print('%d * %d = %d' % (a, b, a * b))

Of course, we can also use the format method provided by strings to format a string, as shown below.

a, b = 5, 10
print('{0} * {1} = {2}'.format(a, b, a * b))

Starting with Python 3.6, there is a more concise way to write formatted strings: prefix the string with the letter f. We can use the following syntactic sugar to simplify the code above.

a, b = 5, 10
print(f'{a} * {b} = {a * b}')
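
The placeholders in f-strings and in the format method also accept a format specification after a colon. The sketch below shows a few common cases; the variable names and values are invented for illustration.

```python
price, count = 3.14159, 7

print(f'{price:.2f}')           # 3.14   (two decimal places)
print(f'{count:03d}')           # 007    (zero-padded to width 3)
print(f'{count:>5}|')           #     7| (right-aligned in width 5)
print('{:.1%}'.format(0.256))   # 25.6%  (percentage with one decimal place)
```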

In addition to strings, Python also has a variety of built-in data structures. If you want to save and manipulate data in a program, most of the time the existing data structures are enough. The most commonly used ones are lists, tuples, sets, and dictionaries.

Using lists

You may have noticed that the string type (str) we just covered differs from the numeric types (int and float) we discussed earlier. A numeric type is a scalar type, meaning that objects of this type have no accessible internal structure; the string type is a structured, non-scalar type, which is why it has a series of properties and methods. The list (list) we introduce next is also a structured, non-scalar type. It is an ordered sequence of values, each of which can be identified by an index. To define a list, put its elements inside [] and separate them with commas. You can use a for loop to traverse the elements of a list, or use the [] or [:] operator to extract one or more elements.

The following code demonstrates how to define a list, how to traverse the list and the subscript operation of the list.

list1 = [1, 3, 5, 7, 100]
print(list1) # [1, 3, 5, 7, 100]
# The multiplication sign repeats the list elements
list2 = ['hello'] * 3
print(list2) # ['hello', 'hello', 'hello']
# Get the length of the list (number of elements)
print(len(list1)) # 5
# Subscript (index) operations
print(list1[0]) # 1
print(list1[4]) # 100
# print(list1[5])  # IndexError: list index out of range
print(list1[-1]) # 100
print(list1[-3]) # 5
list1[2] = 300
print(list1) # [1, 3, 300, 7, 100]
# Traverse the list elements by subscript in a loop
for index in range(len(list1)):
    print(list1[index])
# Traverse the list elements with a for loop
for elem in list1:
    print(elem)
# Process the list with the enumerate function to get both the index and the value while traversing
for index, elem in enumerate(list1):
    print(index, elem)

The following code demonstrates how to add and remove elements from a list.

list1 = [1, 3, 5, 7, 100]
# Add elements
list1.append(200)
list1.insert(1, 400)
# Merge two lists
# list1.extend([1000, 2000])
list1 += [1000, 2000]
print(list1) # [1, 400, 3, 5, 7, 100, 200, 1000, 2000]
print(len(list1)) # 9
# Use the membership operation to check whether an element is in the list and remove it if it is
if 3 in list1:
    list1.remove(3)
if 1234 in list1:
    list1.remove(1234)
print(list1) # [1, 400, 5, 7, 100, 200, 1000, 2000]
# Remove elements at the specified positions
list1.pop(0)
list1.pop(len(list1) - 1)
print(list1) # [400, 5, 7, 100, 200, 1000]
# Clear the list
list1.clear()
print(list1) # []
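
The membership check before remove in the code above matters because remove raises an exception when the element is absent. The sketch below illustrates this, and also that remove deletes only the first matching element; the lists are invented for the example.

```python
nums = [1, 3, 5]
try:
    nums.remove(1234)          # 1234 is not in the list
except ValueError as err:
    print('remove failed:', err)
print(nums)  # [1, 3, 5]

# remove deletes only the first matching element
dup = [3, 1, 3]
dup.remove(3)
print(dup)   # [1, 3]
```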

Like strings, lists can also be sliced. By slicing, we can copy the list or take out a part of the list to create a new list. The code is as follows.

fruits = ['grape', 'apple', 'strawberry', 'waxberry']
fruits += ['pitaya', 'pear', 'mango']
# List slicing
fruits2 = fruits[1:4]
print(fruits2) # ['apple', 'strawberry', 'waxberry']
# A full slice can be used to copy the list
fruits3 = fruits[:]
print(fruits3) # ['grape', 'apple', 'strawberry', 'waxberry', 'pitaya', 'pear', 'mango']
fruits4 = fruits[-3:-1]
print(fruits4) # ['pitaya', 'pear']
# A reverse slice can be used to get a reversed copy of the list
fruits5 = fruits[::-1]
print(fruits5) # ['mango', 'pear', 'pitaya', 'waxberry', 'strawberry', 'apple', 'grape']
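
To make the word "copy" concrete, the sketch below contrasts a plain assignment, which only adds a second name for the same list, with a full slice, which builds a new list. The variable names are chosen for this example.

```python
original = ['grape', 'apple', 'mango']
alias = original         # a second name for the same list object
copied = original[:]     # a new list containing the same elements

original.append('pear')
print(alias)    # ['grape', 'apple', 'mango', 'pear']
print(copied)   # ['grape', 'apple', 'mango']
print(alias is original, copied is original)  # True False
```

Note that a slice is a shallow copy: the new list is independent, but the elements themselves are shared with the original.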

The following code implements the sorting operation on the list.

list1 = ['orange', 'apple', 'zoo', 'internationalization', 'blueberry']
list2 = sorted(list1)
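# The sorted function returns a sorted copy of the list and does not modify the list passed in
# Functions should be designed like sorted, with as few side effects as possible
list3 = sorted(list1, reverse=True)
# Use the key keyword argument to sort by string length instead of the default alphabetical order
list4 = sorted(list1, key=len)
print(list1)
print(list2)
print(list3)
print(list4)
# Send a sort message to the list object to sort it in place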
list1.sort(reverse=True)
print(list1)
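
The key argument can also return a tuple, which gives a sort with a tie-breaker. The sketch below sorts by length first and alphabetically second, and shows that the in-place sort method returns None. The word list is invented for illustration.

```python
words = ['orange', 'apple', 'zoo', 'kiwi', 'fig']

# Sort by length first, then alphabetically among words of equal length
by_len_then_alpha = sorted(words, key=lambda w: (len(w), w))
print(by_len_then_alpha)  # ['fig', 'zoo', 'kiwi', 'apple', 'orange']

# sort() reorders the list in place and returns None
result = words.sort()
print(result)  # None
print(words)   # ['apple', 'fig', 'kiwi', 'orange', 'zoo']
```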

Comprehensions and generators

We can also use list comprehension syntax to create lists, as shown in the code below.

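import sys

f = [x for x in range(1, 10)]
print(f)
f = [x + y for x in 'ABCDE' for y in '1234567']
print(f)
# Create a list with the list comprehension syntax
# All elements are ready once the list is created, so it takes more memory
f = [x ** 2 for x in range(1, 1000)]
print(sys.getsizeof(f))  # Check the number of bytes the object occupies
print(f)
# Note that the code below creates a generator object rather than a list
# Data can be obtained from a generator, but it takes no extra space to store the data
# Each time data is needed, it is computed internally (which costs extra time)
f = (x ** 2 for x in range(1, 1000))
print(sys.getsizeof(f))  # Unlike the comprehension, the generator takes no space to store the data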
print(f)
for val in f:
    print(val)
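
One consequence of computing values on demand is that a generator can only be consumed once, while a list can be traversed repeatedly. The sketch below illustrates this; the exact byte counts reported by sys.getsizeof depend on the Python version and platform, so only the comparison is shown.

```python
import sys

squares_list = [x ** 2 for x in range(1, 1000)]
squares_gen = (x ** 2 for x in range(1, 1000))

# The list stores all its elements, so it is larger than the generator object
print(sys.getsizeof(squares_list) > sys.getsizeof(squares_gen))  # True

print(sum(squares_gen))    # 332833500
print(sum(squares_gen))    # 0, the generator is already exhausted
print(sum(squares_list))   # 332833500, a list can be traversed again
```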

In addition to the generator expression syntax shown above, there is another way to define a generator in Python: the yield keyword turns an ordinary function into a generator function. The following code demonstrates how to implement a generator that produces Fibonacci numbers. The Fibonacci sequence can be defined recursively as follows:

F(0) = 0, F(1) = 1, F(n) = F(n - 1) + F(n - 2) for n >= 2

[Figure: fibonacci-blocks.png]

def fib(n):
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
        yield a


def main():
    for val in fib(20):
        print(val)


if __name__ == '__main__':
    main()
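
Because a generator function produces values only when asked, it can even describe a sequence with no upper limit. The sketch below is a variation on the code above, not part of the original answer: fib_forever is a hypothetical name, and itertools.islice is used to take a bounded number of values.

```python
from itertools import islice


def fib_forever():
    """Yield Fibonacci numbers without an upper limit."""
    a, b = 0, 1
    while True:
        a, b = b, a + b
        yield a


gen = fib_forever()
print(next(gen))   # 1
print(next(gen))   # 1
print(next(gen))   # 2
# islice takes the next five values without running the infinite loop to the end
print(list(islice(gen, 5)))  # [3, 5, 8, 13, 21]
```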

Using tuples

A tuple in Python is a container data type similar to a list: a single variable (object) can store multiple pieces of data. The difference is that the elements of a tuple cannot be modified. We have already used tuples more than once in earlier code. As the name implies, a tuple combines multiple elements into one unit, so it can hold multiple pieces of data just like a list. The code below demonstrates how to define and use tuples.

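# Define a tuple
t = ('骆昊', 38, True, '四川成都')
print(t)
# Get elements of the tuple
print(t[0])
print(t[3])
# Traverse the values in the tuple
for member in t:
    print(member)
# Try to reassign an element of the tuple
# t[0] = '王大锤'  # TypeError
# The variable t now refers to a new tuple; the original tuple will be garbage collected
t = ('王大锤', 20, True, '云南昆明')
print(t)
# Convert the tuple to a list
person = list(t)
print(person)
# The elements of a list can be modified
person[0] = '李小龙'
person[1] = 25
print(person)
# Convert a list to a tuple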
fruits_list = ['apple', 'banana', 'orange']
fruits_tuple = tuple(fruits_list)
print(fruits_tuple)

Here is a question worth exploring: since we already have a data structure like the list, why do we need a type like the tuple?

The elements of a tuple cannot be modified. In practice we often prefer immutable objects in a project, especially in a multi-threaded environment (described later). On the one hand, because the state of an immutable object cannot be changed, the errors that unintended modification would cause are avoided; put simply, an immutable object is easier to maintain than a mutable one. On the other hand, because no thread can modify the internal state of an immutable object, an immutable object is automatically thread-safe, which saves the overhead of synchronization, and it can be shared freely. So the conclusion is: if you don't need to add, delete, or modify elements, consider using a tuple. A tuple is also a good choice when a function needs to return multiple values.
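
As an illustration of returning multiple values, the sketch below returns a tuple from a function and unpacks it at the call site. The function min_max is a made-up example.

```python
def min_max(values):
    """Return the smallest and largest value as a tuple."""
    return min(values), max(values)   # the comma builds a tuple


result = min_max([7, 2, 9, 4])
print(result)        # (2, 9)
low, high = result   # tuple unpacking assigns each element to a name
print(low, high)     # 2 9
```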
Tuples are superior to lists in creation time and memory usage. We can use the getsizeof function of the sys module to check how much memory a tuple and a list holding the same elements occupy. We can also use the magic command %timeit in ipython to measure the time it takes to create a tuple and a list with the same content. The figure below shows the test result on my macOS system.

[Figure: ipython %timeit results for creating a tuple and a list (ipython-timeit.png)]
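
Outside ipython, a similar comparison can be sketched with the standard sys and timeit modules. The absolute numbers vary by machine and Python version, so no expected output is given; the element values and the repetition count are arbitrary choices for this example.

```python
import sys
import timeit

items_list = [1, 2, 3, 4, 5]
items_tuple = (1, 2, 3, 4, 5)

# Memory occupied by the container objects themselves, in bytes
print(sys.getsizeof(items_list), sys.getsizeof(items_tuple))

# Time to create each container 100,000 times
list_time = timeit.timeit('[1, 2, 3, 4, 5]', number=100_000)
tuple_time = timeit.timeit('(1, 2, 3, 4, 5)', number=100_000)
print(f'list: {list_time:.4f}s, tuple: {tuple_time:.4f}s')
```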

Using sets

Sets in Python are consistent with sets in mathematics: duplicate elements are not allowed, and operations such as intersection, union, and difference can be performed.

[Figure: set operations (python-set.png)]

Sets can be created and used as shown in the code below.

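# Literal syntax for creating a set
set1 = {1, 2, 3, 3, 3, 2}
print(set1)
print('Length =', len(set1))
# Constructor syntax for creating a set (explained in detail in the object-oriented part)
set2 = set(range(1, 10))
set3 = set((1, 2, 3, 3, 2, 1))
print(set2, set3)
# Comprehension syntax for creating a set (comprehensions can also be used to build sets)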
set4 = {num for num in range(1, 100) if num % 3 == 0 or num % 5 == 0}
print(set4)

Add elements to and remove elements from a set.

set1.add(4)
set1.add(5)
set2.update([11, 12])
set2.discard(5)
if 4 in set2:
    set2.remove(4)
print(set1, set2)
print(set3.pop())
print(set3)
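
The code above guards remove with a membership check but calls discard directly. The sketch below shows why: discard silently ignores a missing element, whereas remove raises KeyError. The set contents are invented for this example.

```python
colors = {'red', 'green'}

colors.discard('blue')        # no error even though 'blue' is absent
try:
    colors.remove('blue')     # remove raises KeyError for a missing element
except KeyError:
    print('remove raised KeyError')

print(colors == {'red', 'green'})  # True, the set is unchanged
```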

Set membership, intersection, union, difference and other operations.

# Intersection, union, difference, and symmetric difference of sets
print(set1 & set2)
# print(set1.intersection(set2))
print(set1 | set2)
# print(set1.union(set2))
print(set1 - set2)
# print(set1.difference(set2))
print(set1 ^ set2)
# print(set1.symmetric_difference(set2))
# Check for subsets and supersets
print(set2 <= set1)
# print(set2.issubset(set1))
print(set3 <= set1)
# print(set3.issubset(set1))
print(set1 >= set2)
# print(set1.issuperset(set2))
print(set1 >= set3)
# print(set1.issuperset(set3))

Explanation: Python allows special methods to define operators for a type or data structure (described in later chapters). In the code above, we can operate on sets either by calling methods of the set object or by using the corresponding operators directly. For example, the & operator has the same effect as the intersection method, but using the operator makes the code more intuitive.
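
The equivalence between operators and methods can be checked directly, as in the sketch below. It also shows one practical difference: the methods accept any iterable, while the operators require both operands to be sets. The sets a and b are invented for this example.

```python
a = {1, 2, 3, 4}
b = {3, 4, 5}

# Each operator gives the same result as the corresponding method
print(a & b == a.intersection(b))           # True
print(a | b == a.union(b))                  # True
print(a - b == a.difference(b))             # True
print(a ^ b == a.symmetric_difference(b))   # True

# Methods accept any iterable; operators require both operands to be sets
print(sorted(a.union([5, 6])))              # [1, 2, 3, 4, 5, 6]
try:
    a | [5, 6]
except TypeError as err:
    print('TypeError:', err)
```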

Using dictionaries

A dictionary is another mutable container. Dictionaries in Python resemble the dictionaries we use in everyday life, and they can store objects of any type. Unlike lists and sets, each element of a dictionary is a "key-value pair" consisting of a key and a value separated by a colon. The following code demonstrates how to define and use a dictionary.

# Literal syntax for creating a dictionary
scores = {'骆昊': 95, '白元芳': 78, '狄仁杰': 82}
print(scores)
# Constructor syntax for creating a dictionary
items1 = dict(one=1, two=2, three=3, four=4)
# Use the zip function to combine two sequences into a dictionary
items2 = dict(zip(['a', 'b', 'c'], '123'))
# Comprehension syntax for creating a dictionary
items3 = {num: num ** 2 for num in range(1, 10)}
print(items1, items2, items3)
# Get the value that corresponds to a key in the dictionary
print(scores['骆昊'])
print(scores['狄仁杰'])
# Traverse all key-value pairs in the dictionary
for key in scores:
    print(f'{key}: {scores[key]}')
# Update elements in the dictionary
scores['白元芳'] = 65
scores['诸葛王朗'] = 71
scores.update(冷面=67, 方启鹤=85)
print(scores)
if '武则天' in scores:
    print(scores['武则天'])
print(scores.get('武则天'))
# The get method also fetches a value by key, but a default value can be supplied
print(scores.get('武则天', 60))
# Delete elements from the dictionary
print(scores.popitem())
print(scores.popitem())
print(scores.pop('骆昊', 100))
# Clear the dictionary
scores.clear()
print(scores)
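
A typical use of get with a default value is counting. The sketch below counts how often each letter appears in a string and then traverses the result with the items method, which yields key-value pairs directly. The text and variable names are invented for this example.

```python
text = 'hello, world'
counts = {}
for ch in text:
    if ch.isalpha():
        # get returns 0 the first time a letter is seen
        counts[ch] = counts.get(ch, 0) + 1

# items() yields (key, value) pairs, so no second lookup is needed
for ch, n in counts.items():
    print(ch, n)

print(counts['l'], counts['o'])  # 3 2
```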

Exercises

Exercise 1: Display scrolling marquee text on the screen.

Reference answer:

import os
import time


def main():
    content = '北京欢迎你为你开天辟地…………'
    while True:
        # Clear the output on the screen
        os.system('cls')  # 'cls' works on Windows; use os.system('clear') on macOS/Linux
        print(content)
        # Sleep for 200 milliseconds
        time.sleep(0.2)
        content = content[1:] + content[0]


if __name__ == '__main__':
    main()

Exercise 2: Design a function to generate a verification code of a specified length. The verification code is composed of uppercase and lowercase letters and numbers.

Reference answer:

import random


def generate_code(code_len=4):
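    """
    Generate a verification code of the specified length

    :param code_len: length of the verification code (4 characters by default)

    :return: a random verification code made up of uppercase and lowercase letters and digits
    """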
    all_chars = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'
    last_pos = len(all_chars) - 1
    code = ''
    for _ in range(code_len):
        index = random.randint(0, last_pos)
        code += all_chars[index]
    return code
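
The same idea can be written more compactly with the standard library. The sketch below is an alternative, not the reference answer: generate_code_v2 and ALL_CHARS are names made up here, and random.choices (available since Python 3.6) picks characters with replacement.

```python
import random
import string

# Digits plus lowercase and uppercase ASCII letters, 62 characters in total
ALL_CHARS = string.digits + string.ascii_letters


def generate_code_v2(code_len=4):
    """Return a random code of code_len characters drawn from ALL_CHARS."""
    return ''.join(random.choices(ALL_CHARS, k=code_len))


print(generate_code_v2())
print(generate_code_v2(6))
```

Like the reference answer, this uses the random module, which is fine for display purposes but is not intended for security-sensitive tokens.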

Exercise 3: Design a function to return the suffix of a given file name.

Reference answer:

def get_suffix(filename, has_dot=False):
    """
    Get the suffix of a file name

    :param filename: file name
    :param has_dot: whether the returned suffix should include the dot
    :return: the suffix of the file
    """
    pos = filename.rfind('.')
    if 0 < pos < len(filename) - 1:
        index = pos if has_dot else pos + 1
        return filename[index:]
    else:
        return ''
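
For comparison, the standard library offers os.path.splitext for the same task. The sketch below wraps it in a hypothetical function suffix_via_splitext; edge cases such as a file name ending in a dot may be handled differently from the reference answer.

```python
import os.path


def suffix_via_splitext(filename, has_dot=False):
    """Return the suffix of filename using os.path.splitext."""
    ext = os.path.splitext(filename)[1]   # e.g. '.txt', or '' if there is none
    return ext if has_dot else ext[1:]


print(suffix_via_splitext('hello.txt'))                # txt
print(suffix_via_splitext('hello.txt', has_dot=True))  # .txt
print(suffix_via_splitext('.bashrc'))                  # (empty string)
print(suffix_via_splitext('README'))                   # (empty string)
```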

Exercise 4: Design a function that returns the value of the largest and second largest element in the passed list.

Reference answer:

def max2(x):
    m1, m2 = (x[0], x[1]) if x[0] > x[1] else (x[1], x[0])
    for index in range(2, len(x)):
        if x[index] > m1:
            m2 = m1
            m1 = x[index]
        elif x[index] > m2:
            m2 = x[index]
    return m1, m2
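
As a cross-check, the same result can be obtained with heapq.nlargest from the standard library. The function max2_alt below is a hypothetical alternative, not the reference answer; like max2, it assumes the list has at least two elements.

```python
import heapq


def max2_alt(values):
    """Return the largest and second largest values of a list with 2+ elements."""
    # nlargest returns the two biggest values in descending order
    largest, second = heapq.nlargest(2, values)
    return largest, second


print(max2_alt([3, 9, 4, 9, 1]))  # (9, 9)
print(max2_alt([5, 2]))           # (5, 2)
```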

Exercise 5: Calculate which day of the year a given year, month, and day is.

Reference answer:

def is_leap_year(year):
    """
    Determine whether the specified year is a leap year

    :param year: year
    :return: True for a leap year, False for a common year
    """
    return year % 4 == 0 and year % 100 != 0 or year % 400 == 0


def which_day(year, month, date):
    """
    Calculate which day of the year the given date is

    :param year: year
    :param month: month
    :param date: day
    :return: the day of the year
    """
    days_of_month = [
        [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31],
        [31, 29, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]
    ][is_leap_year(year)]
    total = 0
    for index in range(month - 1):
        total += days_of_month[index]
    return total + date


def main():
    print(which_day(1980, 11, 28))
    print(which_day(1981, 12, 31))
    print(which_day(2018, 1, 1))
    print(which_day(2016, 3, 1))


if __name__ == '__main__':
    main()
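
The results can be verified against the standard datetime module, which already knows the leap year rules. The sketch below uses a hypothetical helper day_of_year with the same four dates as the reference answer.

```python
from datetime import date


def day_of_year(year, month, day):
    """Return the day of the year using the standard library."""
    return date(year, month, day).timetuple().tm_yday


print(day_of_year(1980, 11, 28))  # 333
print(day_of_year(1981, 12, 31))  # 365
print(day_of_year(2018, 1, 1))    # 1
print(day_of_year(2016, 3, 1))    # 61
```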

Exercise 6: Print Yang Hui's triangle (Pascal's triangle).

Reference answer:

def main():
    num = int(input('Number of rows: '))
    yh = [[]] * num
    for row in range(len(yh)):
        yh[row] = [None] * (row + 1)
        for col in range(len(yh[row])):
            if col == 0 or col == row:
                yh[row][col] = 1
            else:
                yh[row][col] = yh[row - 1][col] + yh[row - 1][col - 1]
            print(yh[row][col], end='\t')
        print()


if __name__ == '__main__':
    main()
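
A note on the expression [[]] * num used above: it creates num references to one and the same inner list. The reference answer still works because every row is replaced by a fresh list before it is used, but modifying the rows in place would affect all of them. The sketch below illustrates the difference; the variable names are chosen for this example.

```python
# Three references to the same inner list
shared = [[]] * 3
shared[0].append(1)
print(shared)        # [[1], [1], [1]]

# Three separate inner lists built with a comprehension
independent = [[] for _ in range(3)]
independent[0].append(1)
print(independent)   # [[1], [], []]
```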

Comprehensive cases

Comprehensive Case 1: Double-color ball number selection.

from random import randint, sample


def display(balls):
    """
    Print the double-color ball numbers in the list
    """
    for index, ball in enumerate(balls):
        if index == len(balls) - 1:
            print('|', end=' ')
        print('%02d' % ball, end=' ')
    print()


def random_select():
    """
    Randomly select a group of numbers
    """
    red_balls = [x for x in range(1, 34)]
    selected_balls = sample(red_balls, 6)
    selected_balls.sort()
    selected_balls.append(randint(1, 16))
    return selected_balls


def main():
    n = int(input('Number of picks: '))
    for _ in range(n):
        display(random_select())


if __name__ == '__main__':
    main()

Explanation: The sample function of the random module is used above to select a given number of distinct elements from the list (here, 6 of the 33 red balls).

Comprehensive Case 2: The Josephus problem.

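"""
The Lucky Christians
15 Christians and 15 non-Christians are in distress at sea. So that some of them can survive, 15 people have to be thrown into the sea. Someone suggests that everyone stand in a circle and count off from 1, starting with a certain person; whoever counts 9 is thrown into the sea, and the person after them starts counting from 1 again. This continues until 15 people have been thrown overboard. By the grace of God, all 15 Christians survive. How were these people standing at the start: which positions were Christians and which were non-Christians?
"""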


def main():
    persons = [True] * 30
    counter, index, number = 0, 0, 0
    while counter < 15:
        if persons[index]:
            number += 1
            if number == 9:
                persons[index] = False
                counter += 1
                number = 0
        index += 1
        index %= 30
    for person in persons:
        print('C' if person else 'N', end='')  # C = Christian, N = non-Christian


if __name__ == '__main__':
    main()
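
The same simulation can also be sketched by removing people from a list of positions instead of flagging them. This is an alternative formulation, not the reference answer; josephus_order and its parameters are names invented for this example.

```python
def josephus_order(total=30, step=9, removals=15):
    """Return the 0-based positions removed, in order of removal."""
    positions = list(range(total))
    removed = []
    index = 0
    for _ in range(removals):
        # Counting starts at 1 on the current person, so advance step - 1 places
        index = (index + step - 1) % len(positions)
        removed.append(positions.pop(index))
        # After pop, index already refers to the next person in the circle
    return removed


removed = josephus_order()
# C marks a position that survives, N a position that is removed
layout = ''.join('N' if pos in removed else 'C' for pos in range(30))
print(removed[:4])  # [8, 17, 26, 5]
print(layout)
```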

Comprehensive Case 3: Tic Tac Toe Game.

import os


def print_board(board):
    print(board['TL'] + '|' + board['TM'] + '|' + board['TR'])
    print('-+-+-')
    print(board['ML'] + '|' + board['MM'] + '|' + board['MR'])
    print('-+-+-')
    print(board['BL'] + '|' + board['BM'] + '|' + board['BR'])


def main():
    init_board = {
        'TL': ' ', 'TM': ' ', 'TR': ' ',
        'ML': ' ', 'MM': ' ', 'MR': ' ',
        'BL': ' ', 'BM': ' ', 'BR': ' '
    }
    begin = True
    while begin:
        curr_board = init_board.copy()
        begin = False
        turn = 'x'
        counter = 0
        os.system('clear')
        print_board(curr_board)
        while counter < 9:
            move = input("It is %s's turn, enter a position: " % turn)
            if curr_board.get(move) == ' ':
                counter += 1
                curr_board[move] = turn
                if turn == 'x':
                    turn = 'o'
                else:
                    turn = 'x'
            os.system('clear')
            print_board(curr_board)
        choice = input('Play again? (yes|no) ')
        begin = choice == 'yes'


if __name__ == '__main__':
    main()
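
The game above keeps going until all nine squares are filled and does not detect a winner. One possible extension, sketched below under the assumption that the board uses the same nine keys, is a helper that checks the eight winning lines. WIN_LINES and find_winner are hypothetical names and are not part of the original code.

```python
# The eight lines that win the game, written with the board's keys
WIN_LINES = [
    ('TL', 'TM', 'TR'), ('ML', 'MM', 'MR'), ('BL', 'BM', 'BR'),  # rows
    ('TL', 'ML', 'BL'), ('TM', 'MM', 'BM'), ('TR', 'MR', 'BR'),  # columns
    ('TL', 'MM', 'BR'), ('TR', 'MM', 'BL'),                      # diagonals
]


def find_winner(board):
    """Return 'x' or 'o' if a line is complete, otherwise None."""
    for a, b, c in WIN_LINES:
        if board[a] != ' ' and board[a] == board[b] == board[c]:
            return board[a]
    return None


# Build an empty demo board and fill in one diagonal for x
demo = {key: ' ' for line in WIN_LINES for key in line}
demo.update(TL='x', MM='x', BR='x', TM='o')
print(find_winner(demo))  # x
```

Such a helper could be called after each move in the inner loop to end the round early.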

Explanation: The last case comes from the book "Automate the Boring Stuff with Python" (a good choice for readers who already have a programming foundation and want to quickly use Python to automate daily work); the code has been slightly adjusted.