Basic learning - read txt data, convert string to list or array, draw PR curve, draw Loss curve

Series Article Directory

Basic learning - record about convolutional layer
Basic learning - nn.Unfold batch slicing, F.conv2d specified convolution kernel two-dimensional convolution operation, nn.Conv2d convolutional layer
Basic learning - read txt data, convert string to list or array, draw PR curve, draw Loss curve



string to array

.split() is a Python string method that splits a string into substrings at the specified delimiter and returns them as a list. For example, "hello world".split() returns ["hello", "world"]. If no delimiter is given, the string is split on runs of whitespace (spaces, tabs, newlines) and leading and trailing whitespace is ignored. This differs from .split(' '), which splits at every single space and can therefore produce empty strings.

# Convert a comma-separated string to a list
data = 'huang,chao,yang'
data = data.split(',')
print(data)  # ['huang', 'chao', 'yang']

# ---------------------------------------------

# Convert a space-separated string to a list
data = 'huang chao yang'
data = data.split(' ')
print(data)  # ['huang', 'chao', 'yang']

# ---------------------------------------------

# Convert a string to a list of characters
data = 'Chaoy'
data = list(data)
print(data)  # ['C', 'h', 'a', 'o', 'y']

# ---------------------------------------------

# Convert a string to a list of integers
data = '1,2,3'
data = [int(item) for item in data.split(',')]
print(data)  # [1, 2, 3]

# ---------------------------------------------

# Wrap a string in a single-element list
data = 'Chaoy'
data = [data]
print(data)  # ['Chaoy']
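
The difference between .split() and .split(' ') matters when a line read from a txt file contains irregular whitespace. The following is a minimal sketch; the sample line is made up for illustration and does not come from the data files used later.

```python
# Hypothetical line with leading spaces, repeated spaces, a tab and a newline.
line = "  0.91   0.87\t0.80\n"

by_whitespace = line.split()        # splits on runs of any whitespace
by_single_space = line.split(' ')   # splits at every single space character

# Only the whitespace-based split yields clean tokens that float() can use directly.
values = [float(token) for token in by_whitespace]

print(by_whitespace)
print(by_single_space)
print(values)
```

With a fixed ' ' delimiter the result contains empty strings and a token that still holds the tab and the newline, so .split() without arguments is usually the safer choice for whitespace-separated numbers.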

String containing a list to list

literal_eval()

from ast import literal_eval

data = '[1,2,3,4]'
data = literal_eval(data)
print(data)  # [1, 2, 3, 4]
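
literal_eval() only accepts Python literals (numbers, strings, lists, tuples, dicts and so on), so unlike eval() it does not execute function calls found in the text. Malformed input raises an exception instead. The sketch below wraps it in a small helper; parse_list is a hypothetical name introduced here for illustration, not part of the ast module.

```python
from ast import literal_eval


def parse_list(text):
    """Hypothetical helper: parse a string such as '[1, 2, 3]' into a list.

    Returns None when the text is not a valid Python literal or is not a list.
    """
    try:
        value = literal_eval(text.strip())
    except (ValueError, SyntaxError):
        return None
    return value if isinstance(value, list) else None


print(parse_list('[1,2,3,4]'))          # a valid list literal
print(parse_list('[0.5, 0.25]\n'))      # surrounding whitespace is stripped first
print(parse_list('not a list'))         # invalid syntax
print(parse_list('__import__("os")'))   # a function call is rejected, not executed
```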

Convert to a list of integers

data = '1,2,3'
data = [int(digit) for digit in data.split(',')]
print(data)  # [1, 2, 3]

Read the data, then draw the PR curve

read txt data

The goal is to read a txt file of detection results, convert its contents into lists, and then plot them.
[Figure: screenshot of the results txt file]

First read the txt content

with open('file.txt', 'r') as f:
    content = f.read()   # read the whole file as one string
    print(content)

data_lines = "results-yolo7.txt"  # data path
with open(data_lines, "r") as f:  # read the txt file
    data = f.readlines()          # list with one string per line
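
Each string returned by readlines() keeps its trailing newline, and blank lines come back as '\n'. The sketch below shows one way to get clean lines. It writes a small temporary file first so that it can run on its own; the file contents are made up for illustration.

```python
import os
import tempfile

# Write a small hypothetical file so that the example is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False, encoding="utf-8") as tmp:
    tmp.write("0.9\n0.7\n\n0.5\n")
    path = tmp.name

# readlines() keeps the newline characters and the blank line.
with open(path, "r", encoding="utf-8") as f:
    raw_lines = f.readlines()

# Iterating over the file and stripping each line drops both.
with open(path, "r", encoding="utf-8") as f:
    clean_lines = [line.strip() for line in f if line.strip()]

os.remove(path)

print(raw_lines)
print(clean_lines)
```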

About the PR curve

The confusion matrix is shown in Table 3. All samples fall into four types: a positive sample detected as positive is a TP, a positive sample detected as negative is an FN, a negative sample detected as positive is an FP, and a negative sample detected as negative is a TN. The total number of correctly predicted samples is TP + TN, the total number of samples predicted as positive is TP + FP, and TP + FN is the number of actual positive samples.

Table 3. Confusion matrix
                    Predicted positive    Predicted negative
Actual positive     TP                    FN
Actual negative     FP                    TN
Precision is the percentage of correctly predicted positive samples among all samples predicted as positive.
$$Precision = \frac{TP}{TP + FP}$$
Recall is the percentage of correctly predicted positive samples among all actual positive samples.
$$Recall = \frac{TP}{TP + FN}$$
The PR curve is the precision-recall curve obtained by plotting precision on the vertical axis against recall on the horizontal axis.
AP is the average precision of one class, and mAP is the sum of the AP values of all classes divided by the number of classes.
The area enclosed by the PR curve and the coordinate axes is the AP value of each class.
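
The definitions above can be checked with a few lines of arithmetic. This is a minimal sketch: the counts, the PR points and the per-class AP values are made-up numbers, and the area is computed with a plain trapezoidal rule. Detection benchmarks usually apply an interpolated variant, so values from a real evaluation tool may differ.

```python
def precision_recall(tp, fp, fn):
    """Precision = TP / (TP + FP), Recall = TP / (TP + FN)."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def area_under_pr(recall, precision):
    """Trapezoidal area under PR points that are sorted by increasing recall."""
    area = 0.0
    for k in range(1, len(recall)):
        width = recall[k] - recall[k - 1]
        area += width * (precision[k] + precision[k - 1]) / 2.0
    return area


# Hypothetical counts: 8 true positives, 2 false positives, 2 false negatives.
p, r = precision_recall(tp=8, fp=2, fn=2)

# Hypothetical PR points for one class.
recall_pts = [0.0, 0.5, 1.0]
precision_pts = [1.0, 1.0, 0.5]
ap = area_under_pr(recall_pts, precision_pts)

# mAP is the mean of the per-class AP values (hypothetical values).
class_ap = {"Dog": 0.9, "Cat": 0.7}
m_ap = sum(class_ap.values()) / len(class_ap)

print(p, r, ap, m_ap)
```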

The PR curve shows the relationship between precision and recall, and is a common way to evaluate information retrieval systems, text classification systems, etc. The horizontal axis is recall and the vertical axis is precision, and each point on the curve is the (recall, precision) pair obtained at one confidence threshold. Usually precision tends to fall as recall rises: lowering the threshold finds more of the true positives but also admits more false positives. The closer the curve stays to the top-right corner, the better the system. In the field of information retrieval, the PR curve is one of the important indicators for evaluating system performance.
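
How the points of a PR curve arise can be sketched by sorting predictions by confidence and lowering the threshold one prediction at a time. The function name pr_points, the scores and the labels below are assumptions made for illustration. In object detection the labels would first have to come from matching predicted boxes to ground-truth boxes, which is not shown here.

```python
def pr_points(scores, labels):
    """Return (recall, precision) lists, one point per prediction.

    scores: confidence of each prediction.
    labels: 1 if the prediction is a true positive, 0 if it is a false positive.
    Assumes every actual positive appears among the predictions.
    """
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("at least one positive label is required")

    # Visit predictions from the highest to the lowest confidence.
    order = sorted(range(len(scores)), key=lambda k: scores[k], reverse=True)
    tp = fp = 0
    recall, precision = [], []
    for k in order:
        if labels[k] == 1:
            tp += 1
        else:
            fp += 1
        recall.append(tp / total_pos)
        precision.append(tp / (tp + fp))
    return recall, precision


# Hypothetical predictions.
scores = [0.95, 0.90, 0.80, 0.60, 0.40]
labels = [1, 1, 0, 1, 0]
recall, precision = pr_points(scores, labels)

for rec, prec in zip(recall, precision):
    print(round(rec, 3), round(prec, 3))
```

In this example recall never decreases as the threshold is lowered, while precision ends lower than it started.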

the code

The code comments are very detailed and will not be explained here.

from ast import literal_eval
import matplotlib.pyplot as plt
import os
# ----------------------------------------------------------- #
data_lines1 = "input_data/results-yolo5.txt"  # data path 1
data_lines2 = "input_data/results-yolo7.txt"  # data path 2
data_lines3 = "input_data/results-yolo8.txt"  # data path 3
image_path = 'out_data'                       # folder where the PR curve is saved
# Dog, Cat, Fish, Bird, Pig
name = 'Pig AP '                 # class whose PR curve is drawn: change only the class name and keep the ' AP ' suffix
# ----------------------------------------------------------- #

# Read the precision and recall lists of one class from a results file
def read_data(data_lines, name):

    Precision_all = []
    Recall_all = []

    with open(data_lines, "r") as f:  # read the txt file
        data = f.readlines()

    for i in range(len(data) - 2):      # the two lines after the class line must exist
        line = data[i]                  # line i: class line
        if line[9:-1] == name:          # match the class by name
            Precision = literal_eval(data[i + 1][12:].strip())  # line i+1: string -> list
            Recall = literal_eval(data[i + 2][9:].strip())      # line i+2: string -> list

            Precision_all.extend(float(p) for p in Precision)   # convert to float
            Recall_all.extend(float(r) for r in Recall)         # convert to float

    return Precision_all, Recall_all

Precision_yolo5, Recall_yolo5 = read_data(data_lines1, name)  # read data 1
Precision_yolo7, Recall_yolo7 = read_data(data_lines2, name)  # read data 2
Precision_yolo8, Recall_yolo8 = read_data(data_lines3, name)  # read data 3

plt.xlabel("Recall")        # x-axis label
plt.ylabel("Precision")     # y-axis label
plt.title(name)             # figure title
plt.xlim(0.0, 1.0)          # x-axis range
plt.ylim(0.0, 1.05)         # y-axis range

plt.plot(Recall_yolo5, Precision_yolo5, linewidth=2.0, linestyle='-')  # curve 1
plt.plot(Recall_yolo7, Precision_yolo7, linewidth=2.0, linestyle='-')  # curve 2
plt.plot(Recall_yolo8, Precision_yolo8, linewidth=2.0, linestyle='-')  # curve 3

plt.legend(labels=["yolo5", "yolo7", "yolo8"], ncol=3)   # curve labels, in plotting order

os.makedirs(image_path, exist_ok=True)  # create the output folder if it does not exist

# Save the PR curve; strip() removes the trailing space of name from the file name
plt.savefig("{image_path}/{name}.png".format(image_path=image_path, name=name.strip()), dpi=300)
plt.show()  # display

Result:
[Figure: PR curves of yolo5, yolo7 and yolo8 for the selected class]
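
The fixed slice offsets used in read_data (9, 12 and 9) only work when every line has exactly the expected prefix length. As a sketch of a less brittle alternative, the lines can be split at the first colon instead. The layout assumed below ("Class: ...", "Precision: [...]", "Recall: [...]") is hypothetical and is not the actual format of the results files, so the key names would need to be adapted.

```python
from ast import literal_eval


def parse_record(lines, class_name):
    """Hypothetical parser: find 'Class: <name>' and read the two lines after it.

    Each of the two following lines is expected to look like 'key: [v1, v2, ...]'.
    Returns two empty lists when the class is not found.
    """
    for i in range(len(lines) - 2):
        key, _, value = lines[i].partition(":")
        if key.strip() == "Class" and value.strip() == class_name:
            precision = literal_eval(lines[i + 1].partition(":")[2].strip())
            recall = literal_eval(lines[i + 2].partition(":")[2].strip())
            return [float(p) for p in precision], [float(r) for r in recall]
    return [], []


# Made-up sample in the assumed layout.
sample = [
    "Class: Pig\n",
    "Precision: [1.0, 0.9, 0.8]\n",
    "Recall: [0.1, 0.5, 0.9]\n",
]
precision, recall = parse_record(sample, "Pig")
print(precision, recall)
```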

Read data and draw Loss curve

read txt data

My Loss data looks like this: one loss value per line, 250 lines (250 values) in total.
[Figure: screenshot of the epoch_loss txt file]

the code

import numpy as np
import matplotlib.pyplot as plt
import os

data_lines1 = "input_data/epoch_loss-yolo5.txt"  # data path 1
data_lines2 = "input_data/epoch_loss-yolo7.txt"  # data path 2
image_path = 'out_data'                          # folder where the Loss curve is saved
name = 'epoch_loss'                              # name of the saved image

def read_data(data_lines):              # read one loss value per line
    data_all = []
    with open(data_lines, "r") as f:    # read the txt file
        data = f.readlines()
    for line in data:
        if line.strip():                # skip blank lines
            data_all.append(float(line))  # convert to float
    return data_all

yolo5_loss = read_data(data_lines1)         # read data 1
yolo7_loss = read_data(data_lines2)         # read data 2

yolo5_x = np.arange(1, len(yolo5_loss) + 1)    # epoch numbers 1..N for the x-axis
yolo7_x = np.arange(1, len(yolo7_loss) + 1)    # epoch numbers 1..N for the x-axis

plt.xlabel("epoch")         # x-axis label
plt.ylabel("Loss")          # y-axis label
plt.title("Loss curve")     # figure title
plt.xlim(0.0, 300.0)        # x-axis range
plt.ylim(0.0, 1.05)         # y-axis range; adjust if the loss exceeds it

plt.plot(yolo5_x, yolo5_loss, linewidth=2.0, linestyle='-')  # curve 1
plt.plot(yolo7_x, yolo7_loss, linewidth=2.0, linestyle='-')  # curve 2

plt.legend(labels=["yolo5", "yolo7"], ncol=2)   # curve labels, in plotting order

os.makedirs(image_path, exist_ok=True)  # create the output folder if it does not exist

plt.savefig("{image_path}/{name}.png".format(image_path=image_path, name=name), dpi=300)  # save the Loss curve
plt.show()  # display

The result is as follows:
[Figure: Loss curves of yolo5 and yolo7]
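
For a file with one number per line, NumPy can also read the values directly into an array, which may replace the hand-written read_data function. The sketch below assumes that layout and writes a small temporary file with made-up values so that it can run on its own.

```python
import os
import tempfile

import numpy as np

# Write a small hypothetical loss file so that the example is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("0.9\n0.6\n0.45\n")
    path = tmp.name

# ndmin=1 keeps the result one-dimensional even if the file holds a single value.
loss = np.loadtxt(path, ndmin=1)
epochs = np.arange(1, len(loss) + 1)   # epoch numbers 1..N for the x-axis

os.remove(path)

print(loss)
print(epochs)
```

The resulting arrays can be passed to plt.plot(epochs, loss) in the same way as the lists above.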

Engineering code link

https://download.csdn.net/download/weixin_45464524/87835052?spm=1001.2014.3001.5503

Origin blog.csdn.net/weixin_45464524/article/details/130905181