Chapter 14 Python learning notes (1)


Foreword

This chapter records problems encountered and knowledge gathered while learning Python.

14.1 Using numpy

The numpy package underlies many artificial intelligence algorithms. This section briefly introduces how to use it.

14.1.1 Numerical operations

code:

import numpy as np
## Create an array
x = np.array([1.0, 2.0, 3.0])
print(x)
type(x)

## Arithmetic operations (element-wise)
x = np.array([1.0, 2.0, 3.0])
y = np.array([5.0, 2.0, 3.0])
plus = x+y
print("plus", plus, "\n")

minus= x-y
print("minus", minus, "\n")

multiply = x*y
print("multiply", multiply, "\n")

divide = x/y
print("divide", divide, "\n")

output :

[1. 2. 3.]
plus [6. 4. 6.]

minus [-4.  0.  0.]

multiply [5. 4. 9.]

divide [0.2 1.  1. ]

14.1.2 N-dimensional arrays

code:

import numpy as np
A = np.array([[1, 2], [3, 4]])
print(A)
print(A.shape)
print(A.dtype)

output :

[[1 2]
 [3 4]]
(2, 2)
int32
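The dtype printed above depends on the platform and the NumPy version: int32 is typical of older NumPy builds on Windows, while most other setups report int64. A minimal sketch, assuming the same element type is wanted everywhere, passes dtype explicitly. The names A and B are illustrative only.

```python
import numpy as np

# Fix the element type explicitly instead of relying on the platform default.
A = np.array([[1, 2], [3, 4]], dtype=np.int64)
B = A.astype(np.float32)  # astype returns a converted copy

print(A.ndim)   # number of dimensions
print(A.shape)  # size along each dimension
print(A.size)   # total number of elements
print(A.dtype, B.dtype)
```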

14.1.3 Matrix operation and broadcasting

code:

import numpy as np
## Matrix operations (element-wise)
A = np.array([[1, 2], [3, 4]])
B = np.array([[3, 0],[0, 6]])
print("A+B", A+B, "\n")
print("A-B", A-B, "\n")
print("A*B", A*B, "\n")
print("A*10", A*10, "\n")
## Broadcasting
# Multiply arrays with different numbers of dimensions
A = np.array([[1, 2], [3, 4]])
B = np.array([10, 20])
print("A*B, broadcast", A*B, "\n")

output :

A+B [[ 4  2]
 [ 3 10]]

A-B [[-2  2]
 [ 3 -2]]

A*B [[ 3  0]
 [ 0 24]]

A*10 [[10 20]
 [30 40]]

A*B, broadcast [[10 40]
 [30 80]]
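To make the broadcasting step visible, the sketch below stretches the one-dimensional array to the shape of the two-dimensional one with np.broadcast_to and checks that the result matches. It also contrasts the element-wise * used in this section with the matrix product operator @. The names B_stretched and C are illustrative.

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([10, 20])

# Broadcasting conceptually stretches B from shape (2,) to shape (2, 2).
B_stretched = np.broadcast_to(B, A.shape)
print(B_stretched)

# The broadcast product equals the product with the stretched array.
print(np.array_equal(A * B, A * B_stretched))

# A * C multiplies element by element; A @ C is the matrix product.
C = np.array([[3, 0], [0, 6]])
elementwise = A * C
matrix_product = A @ C
print(elementwise)
print(matrix_product)
```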

14.1.4 Element access

code:

import numpy as np
# Element indices start at 0
X = np.array([[51, 55], [14, 19], [0, 4]])
print(X)
print("X[0]", X[0]) # row 0
print("X[0][1]", X[0][1]) # element at (0, 1)

# Access elements with a for loop
X = np.array([[51, 55], [14, 19], [0, 4]])
for row in X:
    print(row)

# Access elements with an index array
X = np.array([[51, 55], [14, 19], [0, 4]])
X = X.flatten()
print("X[np.array([0, 2, 4])]", X[np.array([0, 2, 4])]) # elements at indices 0, 2, and 4
print(X)

# Select the elements that satisfy a condition
X = np.array([[51, 55], [14, 19], [0, 4]])
X = X.flatten()
print("X > 15", X > 15)
print("X[X>15]", X[X>15])

output :

[[51 55]
 [14 19]
 [ 0  4]]
X[0] [51 55]
X[0][1] 55
[51 55]
[14 19]
[0 4]
X[np.array([0, 2, 4])] [51 14  0]
[51 55 14 19  0  4]
X > 15 [ True  True False  True False False]
X[X>15] [51 55 19]
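The same kinds of access also work on the two-dimensional array directly, without calling flatten() first. The following is a small sketch of that variant; the variable names are illustrative.

```python
import numpy as np

X = np.array([[51, 55], [14, 19], [0, 4]])

element = X[0, 1]        # same element as X[0][1]
first_column = X[:, 0]   # every row, column 0
mask = X > 15            # boolean array with the same shape as X
selected = X[mask]       # 1-D array of the elements where mask is True

print(element)
print(first_column)
print(selected)
```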

14.2 Using Matplotlib

Matplotlib is a Python package for drawing graphics. Specific examples follow.

14.2.1 Drawing simple graphics

code:

import numpy as np
import matplotlib.pyplot as plt

# Generate data
x = np.arange(0, 6, 0.1) # values from 0 up to (but not including) 6, in steps of 0.1
y = np.sin(x)

# Draw the plot
plt.plot(x, y)
plt.show()

output :

[Figure: sine curve for x from 0 to 6]

14.2.2 Drawing Complex Graphics

code:

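import numpy as np
import matplotlib.pyplot as plt
# Generate data
x = np.arange(0, 6, 0.1) # values from 0 up to (but not including) 6, in steps of 0.1
y1 = np.sin(x)
y2 = np.cos(x)

# Draw the plot
plt.plot(x, y1, label="sin") # legend label for the sin series
plt.plot(x, y2, linestyle = "--", label="cos") # dashed line, legend label for the cos series
plt.xlabel("x") # x-axis label
plt.ylabel("y") # y-axis label
plt.title('sin & cos') # title
plt.legend() # show the legend
plt.show()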

output :

[Figure: sin and cos curves with legend, axis labels, and the title "sin & cos"]
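When no display window is available, for example on a server, the same plot can be written to a file instead of shown. This is a minimal sketch, assuming the non-interactive "Agg" backend and Matplotlib's object-oriented interface. The output path is a hypothetical temporary location.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend: render to files, no window
import matplotlib.pyplot as plt
import numpy as np

x = np.arange(0, 6, 0.1)

fig, ax = plt.subplots()
ax.plot(x, np.sin(x), label="sin")
ax.plot(x, np.cos(x), linestyle="--", label="cos")
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.set_title("sin & cos")
ax.legend()

# Hypothetical output location; any writable path works.
output_path = os.path.join(tempfile.mkdtemp(), "sin_cos.png")
fig.savefig(output_path)
plt.close(fig)
print(output_path)
```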

14.2.3 Display pictures

code:

import matplotlib.pyplot as plt
from matplotlib.image import imread
img = imread('字符串驻留.PNG') # read the image (set an appropriate path!)
plt.imshow(img)
plt.show()

output:

[Figure: the loaded image displayed with imshow]
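The example above depends on an image file that exists only on the author's machine. The sketch below avoids that by first writing a small synthetic image and then reading it back. The gradient data, the file name, and the "Agg" backend are assumptions made for illustration.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no window is needed
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.image import imread, imsave

# Build a small synthetic grayscale gradient, 32 rows by 64 columns.
gradient = np.tile(np.linspace(0.0, 1.0, 64), (32, 1))

# Hypothetical file location for the generated image.
image_path = os.path.join(tempfile.mkdtemp(), "gradient.png")
imsave(image_path, gradient, cmap="gray")

img = imread(image_path)  # PNG files are read back as a float array
print(img.shape, img.dtype)

fig, ax = plt.subplots()
ax.imshow(img)
ax.axis("off")
plt.close(fig)
```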

14.3 os functions

The os module is used to work with files and directories. This section briefly introduces it.

14.3.1 Get file path

The following operations use the os.path module, which is imported first.
File layout:

[Figure: layout of the example files]

code:

import os.path
print(os.path.abspath('document_try/neu.jpg')) # get the absolute path of the file
print(os.path.exists('document_try/neu.jpg')) # check whether the path exists: True if it does, False otherwise
print(os.path.abspath('document_try/兰州交通大学.PNG')) # get the absolute path of the file
print(os.path.exists('document_try/兰州交通大学.PNG')) # check whether the path exists: True if it does, False otherwise

output:

D:\document\document\major_study\Natural Language Processing\Natural Language Processing: Methods Based on Pre-trained Models (Practice and Notes)\chp4\document_try\neu.jpg
False
D:\document\document\major_study\Natural Language Processing\Natural Language Processing: Methods Based on Pre-trained Models (Practice and Notes)\chp4\document_try\兰州交通大学.PNG
True
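As the output shows, abspath() returns a path even for a file that does not exist; only exists() consults the file system. The sketch below reproduces this in a temporary directory so that it runs anywhere. The file names present.txt and absent.txt are hypothetical.

```python
import os
import os.path
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    present = os.path.join(tmp_dir, "present.txt")
    absent = os.path.join(tmp_dir, "absent.txt")

    # Create only one of the two files.
    with open(present, "w", encoding="utf-8") as handle:
        handle.write("placeholder")

    present_exists = os.path.exists(present)
    absent_exists = os.path.exists(absent)

    # abspath only builds a path string; it does not check existence.
    absolute = os.path.abspath(present)
    absent_absolute = os.path.abspath(absent)

print(present_exists, absent_exists)
print(os.path.isabs(absent_absolute))
```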

14.3.2 Basic operation of paths

The following operations also use os.path. Apart from checks such as isdir(), they do not touch the actual files or directories; they only manipulate the given path string.

import os.path
print(os.path.join('document_try','document_try2')) # join a directory with another directory or a file name
print(os.path.split(r'\document_try\兰州交通大学.PNG')) # split into directory and file name
print(os.path.splitext('document_try/兰州交通大学.PNG')) # split into root and extension
print(os.path.basename(r'D:\document\document\major_study\python 学习/13文件练习/bneu.jpg')) # extract the file name from a path
print(os.path.dirname(r'D:\document\document\major_study\python 学习/13文件练习/bneu.jpg')) # extract the directory part of a path, without the file name
print(os.path.isdir(r'D:\document\document\major_study\python 学习/13文件练习/bneu.jpg')) # check whether the path is an existing directory

output:

document_try\document_try2
('\\document_try', '兰州交通大学.PNG')
('document_try/兰州交通大学', '.PNG')
bneu.jpg
D:\document\document\major_study\python 学习/13文件练习
False
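The output above uses Windows separators because os.path follows the rules of the operating system it runs on. To try the string-only operations with Windows rules on any platform, one option is the standard-library ntpath module, which is what os.path refers to on Windows. A minimal sketch follows, using a made-up path.

```python
import ntpath  # Windows flavour of os.path, importable on any platform

# Hypothetical path used only for illustration; it need not exist.
path = r"D:\projects\demo\data\report.final.csv"

directory, filename = ntpath.split(path)      # directory part and file name
stem, extension = ntpath.splitext(filename)   # only the last extension is split off
rebuilt = ntpath.join(directory, stem + extension)

print(directory)
print(filename)
print(stem, extension)
print(rebuilt == path)
```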


14.4 Use of tqdm

References: tqdm usage and examples

Function: When a Python program runs a time-consuming operation, a progress bar makes it easy to see how far the work has progressed.

Advantages: tqdm handles this well. It reports progress in real time with very little CPU overhead, and it runs on Windows, Linux, macOS, and other systems. It supports ① loops, ② multiprocessing, and ③ recursive processing, and it can also be combined with Linux commands to show the progress of a pipeline.

14.4.1 Import and use of tqdm

The tqdm package can be imported in the two ways shown below. Method 1 is recommended: it imports the whole package, whereas method 2 imports only the tqdm.tqdm class, so other helpers such as tqdm.trange() are not reachable through that name.
Any iterable object can be wrapped in tqdm.tqdm() to display its progress, as the following code shows.
code:

# Import the packages
import tqdm # method 1
# from tqdm import tqdm # method 2
import time

# Define an iterable object
a = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# Plain iteration (no progress bar)
for idx, element in enumerate(a):
    print(f"No.{idx}: {element}")

# Wrap the iterable in tqdm to display a progress bar
for idx, element in enumerate(tqdm.tqdm(a)):
    time.sleep(0.5)
    print(f"No.{idx}: {element}")

output

No.0: 1
No.1: 2
No.2: 3
No.3: 4
No.4: 5
No.5: 6
No.6: 7
No.7: 8
No.8: 9
No.9: 10
10%|█ | 1/10 [00:01<00:09, 1.01s/it]No.0: 1
No.1: 2
30%|███ | 3/10 [00:03<00:07, 1.01s/it]No.2: 3
No.3: 4
50%|█████ | 5/10 [00:05<00:05, 1.01s/it]No.4: 5
60%|██████ | 6/10 [00:06<00:04, 1.01s/it]No.5: 6
No.6: 7
80%|████████ | 8/10 [00:08<00:02, 1.01s/it]No.7: 8
No.8: 9
100%|██████████| 10/10 [00:10<00:00, 1.01s/it]
No.9: 10

As the output shows, the progress bar is printed on a new line again and again until the loop finishes. This happens because each print() call writes new text to the console, so the bar has to be redrawn below it. The program below prints nothing inside the loop, so only a single progress bar appears.
code :

# Define an iterable object
a = [1, 4, 7, 2, 5, 8, 3, 6, 9, 10]
# Wrap the iterable in tqdm to display a progress bar
for idx, element in enumerate(tqdm.tqdm(a)):
    time.sleep(0.5)

output:

[Figure: a single progress bar]
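If messages are still needed inside the loop, one option is to print them through tqdm.tqdm.write() instead of print(), which is intended to keep the message from overlapping the bar. The following is a minimal sketch; the list a and the doubled results are illustrative.

```python
import time

import tqdm

a = [1, 2, 3, 4, 5]
doubled = []

for idx, element in enumerate(tqdm.tqdm(a)):
    time.sleep(0.01)
    doubled.append(element * 2)
    # tqdm.tqdm.write clears the bar, prints the message, then redraws the bar
    tqdm.tqdm.write(f"No.{idx}: {element}")

print(doubled)
```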

14.4.2 tqdm.tqdm() with parameters

Extra information can be added to the progress bar through parameters. The bar text itself is rendered by the static method format_meter, whose signature is:

@staticmethod
def format_meter(n, total, elapsed, ncols=None, prefix='', ascii=False, unit='it',
                 unit_scale=False, rate=None, bar_format=None, postfix=None,
                 unit_divisor=1000, initial=0, colour=None, **extra_kwargs):

Commonly used parameters of tqdm.tqdm():

  • iterable: the iterable object to wrap; not needed when updating manually
  • desc: string, the description shown to the left of the progress bar
  • total: the expected total number of iterations
  • leave: bool, whether to keep the progress bar on screen after the iteration completes
  • file: where the output is written; defaults to the terminal and usually does not need to be set
  • ncols: width of the whole output; by default it adapts to the environment, and if set to 0 no bar is drawn, only the statistics
  • unit: text naming the unit of work, default "it" (for example 100 it/s); when processing photos, setting it to "img" gives 100 img/s
  • unit_scale: automatically scale counts and rates with standard prefixes, for example 100000 it/s becomes 100k it/s
  • colour: color of the progress bar

Example 1

code:

import tqdm
import time
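d = {'test set ratio': 0.1, 'training set ratio': 0.8}

"""desc sets the label
    ncols sets the progress bar width -> keeping it within 100 is recommended
    postfix passes extra details as a dictionary
"""
for i in tqdm.tqdm(range(50), desc='Training progress', ncols=100, postfix=d):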
    time.sleep(0.1)

output:

[Figure: progress bar with the description on the left and the postfix details on the right]
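Because the screenshot is not available, the sketch below captures the bar text in a string buffer through the file parameter, so the effect of desc, unit, and ncols can be inspected as plain text. The buffer and the label "loading" are assumptions for illustration.

```python
import io
import time

import tqdm

buffer = io.StringIO()  # capture the bar instead of drawing it on the terminal

for _ in tqdm.tqdm(range(50), desc="loading", unit="img", ncols=80,
                   leave=True, file=buffer):
    time.sleep(0.001)

# Carriage returns separate successive redraws of the same bar.
last_frame = buffer.getvalue().replace("\n", "").split("\r")[-1]
print(last_frame)
```

The last frame should contain the description, the finished count 50/50, and a rate expressed in the chosen unit.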



14.4.3 Custom progress bar display information

To update the information shown in the progress bar dynamically, use the set_description() and set_postfix() methods: set_description() sets the text shown before the bar, and set_postfix() sets the text shown after it.
code:

import tqdm
import time
import random

# set_description() is usually called on a tqdm.tqdm() object created beforehand
pbar = tqdm.tqdm(["a", "b", "c", "d", "e", "f", "g"])
for idx, element in enumerate(pbar):
    time.sleep(0.2)
    pbar.set_description(f"No.{idx}: {element}")


# set_description() and set_postfix() are usually called on a tqdm.tqdm() object created beforehand
epochs = 150
pbar = tqdm.tqdm(range(epochs), ncols=100)  # ncols sets the display width in characters; if too small, the text is cut off
for idx, element in enumerate(pbar):
    time.sleep(0.01)
    pbar.set_description(f"Epoch {idx}/{epochs}")
    pbar.set_postfix({"class": element}, loss=random.random(), cost_time=random.randrange(0, 100))

output:

[Figure: progress bars with a changing description in front and postfix values behind]


Note:
set_postfix() accepts both a dictionary and keyword arguments (set_description() takes only the description string), so the extra information can be passed in three ways:

  1. As a dictionary -> pbar.set_postfix({"key_1": "value_1", ...})
  2. As keyword arguments -> pbar.set_postfix(key_1 = value_1, key_2 = value_2, ...)
  3. Mixed -> pbar.set_postfix({"key_1": value_1, "key_2": value_2, ...}, key_3 = value_3, ...)
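The three calling styles can be tried side by side. The sketch below writes the bar to a string buffer (an assumption made so the result can be inspected as text) and uses made-up keys such as lr, loss, and acc.

```python
import io

import tqdm

buffer = io.StringIO()
pbar = tqdm.tqdm(total=3, file=buffer, ncols=100)

pbar.set_postfix({"lr": 0.1})              # 1. dictionary
pbar.update(1)
pbar.set_postfix(loss=0.5, acc=0.9)        # 2. keyword arguments
pbar.update(1)
pbar.set_postfix({"lr": 0.1}, loss=0.25)   # 3. mixed; each call replaces the previous postfix
pbar.update(1)
pbar.close()

# Carriage returns separate successive redraws of the same bar.
last_frame = buffer.getvalue().replace("\n", "").split("\r")[-1]
print(last_frame)
```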

14.4.4 tqdm built-in methods for generating iterable objects

Besides writing tqdm.tqdm(range(xxx)), tqdm provides an equivalent shortcut, tqdm.trange(xxx). The code is as follows:

import tqdm
import time
pbar = tqdm.trange(200, 400, 2)
for idx, element in enumerate(pbar):
    time.sleep(0.1)
    pbar.set_description(f"No.{idx} -> {element}")

output :

[Figure: progress bar whose description shows the index and the current value]
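A quick way to confirm that trange() is a shortcut is to compare the values it yields with those of tqdm.tqdm(range(...)). The sketch below also shows the explicit import that is needed when the names are imported individually rather than through the package. The range bounds are arbitrary.

```python
import tqdm               # package import: helpers are reached as tqdm.<name>
from tqdm import trange   # explicit import of the shortcut by name

via_tqdm = list(tqdm.tqdm(range(200, 210, 2)))
via_trange = list(tqdm.trange(200, 210, 2))
via_direct = list(trange(200, 210, 2))

print(via_trange)
```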

14.4.5 Custom method update progress

Sometimes the progress bar should not advance once per iteration of a for loop, but after particular operations have finished. In that case it can be updated manually, as follows:

import tqdm
import time
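with tqdm.tqdm(total=10) as bar:  # total is the total number of steps for the progress bar
    # operation 1
    time.sleep(1)
    # update the progress bar
    bar.update(3)  # the argument to bar.update() is the number of steps to advance, similar in spirit to optimizer.step
    # operation 2
    time.sleep(3)
    # update the progress bar
    bar.update(1)
    # operation 3
    time.sleep(1)
    # update the progress bar
    bar.update(6)  # the accumulated steps should not exceed total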

output :

[Figure: progress bar advancing in steps of 3, 1, and 6]
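A common use of manual updates is chunked processing, where the bar advances by the size of each chunk rather than by one. The following sketch assumes a made-up list of 25 work items handled 10 at a time, and reads the accumulated progress from the bar's n attribute.

```python
import tqdm

items = list(range(25))   # hypothetical work items
chunk_size = 10
processed = 0

with tqdm.tqdm(total=len(items), unit="item") as bar:
    for start in range(0, len(items), chunk_size):
        chunk = items[start:start + chunk_size]
        # ... process the chunk here ...
        processed += len(chunk)
        bar.update(len(chunk))   # advance by the number of items just handled
    final_count = bar.n          # bar.n holds the accumulated progress

print(processed, final_count)
```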

14.5 logging module

The logging module is an alternative to printing messages directly. Its basic use is shown below.

14.5.1 Use of logs

code :

import sys
from time import strftime, localtime
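# import the logging module
import logging

logger = logging.getLogger()  # get the root logger
logger.setLevel(logging.INFO) # log INFO and above: confirmation that things work as expected
logger.addHandler(logging.StreamHandler(sys.stdout)) # send log records to the console

logger.info('This is the data to be written to the log: {}'.format("ZZ-7845124")) # write data to the log
logger.info('This is the time to be written to the log: {}'.format(strftime("%y%m%d-%H%M", localtime()))) # write the time to the log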

output :

This is the data to be written to the log: ZZ-7845124
This is the time to be written to the log: 230203-1519
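The setLevel() call decides which messages get through: records below the chosen level are dropped. A minimal sketch of that filtering follows, using a named logger and an in-memory stream so the result can be inspected. The logger name level_demo is an assumption for illustration.

```python
import io
import logging

stream = io.StringIO()  # in-memory target instead of the console

logger = logging.getLogger("level_demo")  # hypothetical logger name
logger.setLevel(logging.INFO)
logger.propagate = False                  # keep records away from the root logger

handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s:%(message)s"))
logger.addHandler(handler)

logger.debug("not shown")      # DEBUG is below INFO, so it is dropped
logger.info("shown")
logger.warning("also shown")

captured = stream.getvalue().splitlines()
print(captured)
```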

14.5.2 Log saving

This section outputs the log content to the console and also saves it to a file with the ".log" suffix.
The implementation is as follows:

import sys
from time import strftime, localtime
# import the logging module
import logging

logger = logging.getLogger()  # get the root logger
logger.setLevel(logging.INFO) # log INFO and above: confirmation that things work as expected
logger.addHandler(logging.StreamHandler(sys.stdout)) # send log records to the console
log_file = '{}-{}.log'.format("log", strftime("%y%m%d-%H%M", localtime()))  # build the log file name
logger.addHandler(logging.FileHandler(log_file))  # also write log records to log_file
logger.info('This is the data to be written to the log: {}'.format("ZZ-7845124")) # write data to the log
logger.info('This is the time to be written to the log: {}'.format(strftime("%y%m%d-%H%M", localtime()))) # write the time to the log

output:

This is the data to be written to the log: ZZ-7845124
This is the time to be written to the log: 230203-1530

generated file:

A file named "log-230203-1530.log" is generated, with the following contents:
[Figure: contents of the generated log file]
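The messages above carry no timestamp or level of their own. One possible refinement, sketched here under assumptions, attaches a Formatter to both handlers and uses a named logger so that repeated runs in the same interpreter do not pile up duplicate handlers. The helper build_logger, the logger name, and the temporary file location are all hypothetical.

```python
import logging
import os
import sys
import tempfile


def build_logger(name, log_path):
    """Return a logger that writes formatted records to stdout and to log_path."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    formatter = logging.Formatter("%(asctime)s %(levelname)s %(message)s")
    if not logger.handlers:  # avoid adding the same handlers twice
        for handler in (logging.StreamHandler(sys.stdout),
                        logging.FileHandler(log_path, encoding="utf-8")):
            handler.setFormatter(formatter)
            logger.addHandler(handler)
    return logger


log_path = os.path.join(tempfile.mkdtemp(), "demo.log")
logger = build_logger("chapter14_demo", log_path)
logger.debug("hidden: below the INFO threshold")
logger.info("record id: %s", "ZZ-7845124")

# Close the handlers so the file is flushed before it is read back.
for handler in list(logger.handlers):
    handler.close()
    logger.removeHandler(handler)

with open(log_path, encoding="utf-8") as handle:
    log_text = handle.read()
print(log_text)
```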


Origin blog.csdn.net/qq_40940944/article/details/128845183