How to use seaborn's relplot function to reproduce the figures in top reinforcement learning conference papers

An example figure from someone else's paper


How to draw

relplot draws the confidence interval automatically, so all we have to do is save the results of several runs so that each x value (episode) has several y values (rewards). Suppose I run a DDPG model and get reward_list=[0,1,2,4,4,…], then run it a second time and get reward_list=[2,3,4,4…]. I then run a SAC model and get reward_list=[0,1,2,4,5,...] the first time and reward_list=[2,3,4,6...] the second time. Create a pandas dataframe whose columns are the model name, the iteration number (index), and the reward value. For convenience I wrote a Painter class:

#!/usr/bin/python
# -*- coding: utf-8 -*-
# Time: 2021-3-19
# Author: ZYunfei
# File func: draw func

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.font_manager import FontProperties
# Load a font with Chinese glyphs (Windows path) so the Chinese plot labels render correctly.
myfont=FontProperties(fname=r'C:\Windows\Fonts\simsun.ttc')
sns.set(font=myfont.get_name())

class Painter:
    def __init__(self, load_csv, load_dir):
        if not load_csv:
            self.data = pd.DataFrame(columns=['episode reward','episode', 'Method'])
        else:
            self.load_dir = load_dir
            self.data = pd.read_csv(self.load_dir).iloc[:,1:] # The first csv column is the saved index, so skip it.

    def addData(self, dataSeries, method, smooth = True):
        if smooth:
            dataSeries = self.smooth(dataSeries)
        # Build all rows at once; DataFrame.append was removed in pandas 2.0.
        newRows = pd.DataFrame({
            'episode reward': list(dataSeries),
            'episode': range(1, len(dataSeries) + 1),
            'Method': method})
        self.data = pd.concat([self.data, newRows], ignore_index=True)

    def drawFigure(self):
        sns.set_theme(style="darkgrid")
        sns.set_style(rc={"linewidth": 1})
        sns.relplot(data = self.data, kind = "line", x = "episode", y = "episode reward",
                    hue= "Method")
        # Title: "Reward as a function of the number of training episodes"
        plt.title(u'奖励随迭代回合数变化曲线',fontproperties = myfont,fontsize = 12)
        plt.xlabel(u"回合数",fontproperties = myfont)  # "Number of episodes"
        plt.ylabel(u"平均回合奖励值",fontproperties = myfont)  # "Average episode reward"
        plt.show()

    def saveData(self, save_dir):
        self.data.to_csv(save_dir)

    def addCsv(self, add_load_dir):
        """Merge another csv file into the data loaded from load_dir."""
        add_csv = pd.read_csv(add_load_dir).iloc[:,1:]
        self.data = pd.concat([self.data, add_csv],axis=0,ignore_index=True)

    def deleteData(self,delete_data_name):
        """Delete the data of one method. Nothing is saved automatically; call saveData afterwards."""
        self.data = self.data[~self.data['Method'].isin([delete_data_name])]

    def smoothData(self, smooth_method_name,N):
        """Apply a moving-average (MA) filter of order N to the rewards of one method."""
        begin_index = -1
        mode = -1  # -1: start index of a run not found yet; 1: searching for the end index of the run.
        for i in range(len(self.data)):
            if self.data.iloc[i]['Method'] == smooth_method_name and mode == -1:
                begin_index = i
                mode = 1
                continue
            if mode == 1 and self.data.iloc[i]['episode'] == 1:
                self.data.iloc[begin_index:i,0] = self.smooth(
                    self.data.iloc[begin_index:i,0],N = N
                )
                begin_index = -1
                mode = -1
                if self.data.iloc[i]['Method'] == smooth_method_name:
                    begin_index = i
                    mode = 1
            if mode == 1 and i == len(self.data) - 1:
                self.data.iloc[begin_index:,0]= self.smooth(
                    self.data.iloc[begin_index:,0], N=N
                )



    @staticmethod
    def smooth(data,N=7):
        """Centered moving average of odd order N; the window shrinks symmetrically near both ends."""
        data = np.asarray(data, dtype=float)  # also accepts lists and pandas Series
        n = (N - 1) // 2
        res = np.zeros(len(data))
        for i in range(len(data)):
            # The half-width is capped by the distance to either end, which also
            # keeps the average correct when the series is shorter than N.
            k = min(n, i, len(data) - 1 - i)
            res[i] = data[i - k:i + k + 1].mean()
        return res



if __name__ == "__main__":
    painter = Painter(load_csv=True,load_dir='F:/MasterDegree/PytorchLearning/test.csv')
    painter.smoothData('Fully Decentralized DDPG',33)
    painter.drawFigure()

API introduction:

  1. Initialization: load_csv indicates whether to load an existing csv file. If it is True, pass the file path as load_dir, and painter performs all subsequent operations on the data loaded from that file.
  2. painter.addData: Appends a reward array to the current data, numbering its entries episode=1, 2, ... in order. method is the name of the model that produced the data (for example 'DDPG').
  3. painter.drawFigure: Draws the currently loaded data. It takes no parameters.
  4. painter.addCsv: Appends another csv file to the end of the current data. (This is not commonly used.)
  5. painter.deleteData: Deletes all data belonging to one model name. The parameter is the model name (for example 'DDPG'). The change is not saved automatically; call painter.saveData to write it to disk.
  6. painter.smoothData: Applies a moving-average (MA) filter to the data of one model. The first parameter is the model name (for example 'DDPG'), and the second is the order of the MA filter (an odd number, for example 7).
  7. painter.smooth: A static method that implements the MA filter.
  8. painter.saveData: Writes the current data to the csv file given by save_dir.
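
To make the per-run smoothing concrete, here is a hedged sketch of the same idea written with pandas groupby instead of the index scan in smoothData. It is not the Painter code: the function name smooth_method is hypothetical, and it swaps in pandas' centered rolling mean, which handles the first and last few episodes differently from Painter.smooth (the window is simply truncated at the ends rather than shrunk symmetrically). Like smoothData, it assumes that every run starts at episode 1.

```python
import pandas as pd

def smooth_method(data, method, order):
    """Return a copy of `data` with the rewards of `method` smoothed run by run."""
    out = data.copy()
    out["episode reward"] = out["episode reward"].astype(float)
    # A new run starts wherever the episode counter resets to 1.
    run_id = (out["episode"] == 1).cumsum()
    mask = out["Method"] == method
    smoothed = (
        out.loc[mask, "episode reward"]
        .groupby(run_id[mask])
        .transform(lambda s: s.rolling(order, center=True, min_periods=1).mean())
    )
    out.loc[mask, "episode reward"] = smoothed
    return out

# Toy data (invented): two DDPG runs of five episodes and one SAC run of three.
data = pd.DataFrame({
    "episode reward": [0, 1, 2, 4, 4, 2, 3, 4, 4, 6, 10, 20, 30],
    "episode": [1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 1, 2, 3],
    "Method": ["DDPG"] * 10 + ["SAC"] * 3,
})
result = smooth_method(data, "DDPG", order=3)
```

Each run is filtered separately, so the average never mixes the end of one run with the start of the next, and the rows of other methods are left untouched.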

Call example:

if __name__ == "__main__":
    painter = Painter(load_csv=True,load_dir='F:/MasterDegree/PytorchLearning/test.csv')
    painter.smoothData('Fully Decentralized DDPG',11)
    painter.smoothData('Fully Centralized DDPG', 11)
    painter.smoothData('MADDPG', 11)
    painter.drawFigure()

The more data you have, the better the figure looks, but the confidence interval takes correspondingly longer to compute (drawing can take a few minutes), so just wait patiently.

Reference article

https://zhuanlan.zhihu.com/p/75477750

Origin blog.csdn.net/weixin_43145941/article/details/115141565