GMOEA code walkthrough 2 -- Building and running the environment

Introduction

GMOEA is a paper published in IEEE Transactions on Cybernetics in 2021 by Professor Ran Cheng's group at Southern University of Science and Technology. Its main contribution is applying GANs to multi-objective optimization problems (MOPs).
Paper link
Code link
However, the authors only released the algorithm itself and did not provide a script for solving the test suite, so I spent a day and a night brushing up on Python object-oriented programming. This post works through everything systematically: the paper's motivation, the algorithm, the experimental results, environment setup, running the code, and code interpretation. I wrote it with tears in my eyes.

Generative Adversarial Networks (GAN)

GAN is arguably the most representative work in deep learning of the past five years. I watched Li Mu's paper-reading videos carefully several times before I really understood it.
Learn AI with Li Mu -- paper reading series -- GAN (Bilibili video link)
The original GAN paper published at NIPS 2014 (PDF)
Goodfellow's source code for GAN

Motivation behind GAN

When Goodfellow was working on GAN in 2014, he felt that deep learning was doing well on discriminative models but poorly on generative models. The reason is that generating data requires fitting the data distribution of the samples, and maximizing the likelihood function directly is hard to compute, with the amount of computation exploding as the dimension increases. Goodfellow circumvented this difficulty: the model simply learns the data distribution implicitly, and the effect is almost the same.

What is GAN

GAN is a generative model. In short, it draws a noise vector z from a simple noise distribution p_z(z) and maps it onto the data distribution p_data(x) of the real samples x. Two artificial neural networks compete against each other to drive learning. One is the generator G(z, θ_g), which takes noise as input and produces fake samples x̄; it has learnable parameters θ_g. The other is the discriminator D(x, θ_d): the real samples x and the fake samples x̄ are labeled 1 and 0 in advance, and a binary classifier is trained to decide which samples come from the real data distribution and which come from the generator G.

GAN therefore trains the two networks G and D at the same time: when training D, G is fixed, and when training G, D is fixed (the first term does not involve G at all). The value function below defines a two-player minimax game and represents the adversarial process. It is the sum of two expectations, written from D's point of view. The first term takes samples x drawn from the real data distribution and feeds them into the discriminator D; the discriminator output should be as close to D(x) = 1 as possible, in which case the log term equals 0. The second term feeds noise into the generator G(z) to produce fake samples x̄, which are then passed to the discriminator D; if the discriminator classifies them as 0 the term equals 0, and if the discriminator is imperfect and makes mistakes the term becomes negative. So optimizing the parameters θ_d of D means maximizing the whole expression.
[Figure: the minimax value function of GAN]
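For reference, the two-player minimax value function from the original GAN paper is

$$\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{data}(x)}[\log D(x)] + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z)))].$$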
To optimize the generator G, you want to push the second term as far toward negative infinity as possible, i.e. G minimizes the value function.

What GANs can do

Starting from a simple data distribution, a GAN can map to any distribution you want to generate: translating between images, swapping faces in photos and videos, converting voices and music, generating faces that do not exist, generating flowers that do not exist, and so on.

GAN training process

First, two minibatches of size m are drawn, one from the noise distribution and one from the real samples, and used to optimize the parameters of D. This step is executed k times.

Then a noise minibatch of size m is sampled to train the generator G. The two steps alternate until neither network can make further progress, i.e. a Nash equilibrium is reached.
[Figure: Algorithm 1, the minibatch training procedure of GAN, from the original paper]
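Written out, the two alternating updates from the original paper are: the discriminator ascends its stochastic gradient

$$\nabla_{\theta_d} \frac{1}{m} \sum_{i=1}^{m} \left[ \log D\big(x^{(i)}\big) + \log\Big(1 - D\big(G(z^{(i)})\big)\Big) \right]$$

for k steps, and the generator then descends

$$\nabla_{\theta_g} \frac{1}{m} \sum_{i=1}^{m} \log\Big(1 - D\big(G(z^{(i)})\big)\Big).$$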

A vivid metaphor

G is like a counterfeiter and D is like the police. At first G's counterfeiting skill is poor and the police spot the fakes immediately. G then improves his counterfeiting technique; the police notice the fakes have become more convincing and improve their detection ability in turn. Through this confrontation both sides make progress, and in the end the author hopes that the counterfeiter G wins, which achieves the goal.

GMOEA -- Application of GAN to MOPs

GMOEA’s motivation

The optimization framework of a traditional MOEA looks like the figure below, and generating new solutions relies mainly on heuristic operators. But heuristic operators sometimes have difficulty generating the desired solutions.
[Figure: the general framework of a traditional MOEA]
[Figure: offspring produced by crossing parents p1 and p2, far from the Pareto set PS]
For example, in the situation above, no matter how the two parents p1 and p2 are crossed over, it is difficult to generate a solution near the Pareto set PS. So the idea is to use a machine learning model to replace the heuristic generation operator.

The decision space of an MOP can be regarded as a data distribution, and the PS is obtained by sampling from that distribution. The goal is to use a random noise distribution to fit the data distribution of the non-dominated solutions, so a GAN is used to learn it.

GMOEA framework

First comes initialization: determine the parameters, initialize the whole network, and set the training mode. The SPEA2 selection is then used to identify the non-dominated solutions: the non-dominated solutions Pr are taken as real samples and the dominated solutions Pf as fake samples. The discriminator has to classify the real samples Pr, the fake samples Pf, and the fake samples G(z) produced by the generator from noise. The trained network is then used to generate the offspring population, followed by environmental selection.
[Figure: the overall framework of GMOEA]

GMOEA’s network model

[Figure: the generator and discriminator architecture used in GMOEA]
The generator has two fully connected hidden layers and the discriminator has one fully connected hidden layer; the output layer uses the sigmoid activation function. The mean and covariance matrix of the non-dominated solutions are used to position the noise distribution.

GMOEA model training

The first step is to compute the mean and covariance matrix of Pr. The second step is to select a batch T of length m, remove it from the pool, and use the noise distribution to generate a sample Z; these are used to train the discriminator D, and this step is performed k times with the value function shown alongside. Another batch of noise is then sampled to train G.
[Figure: the training procedure of the GAN model in GMOEA]
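As a small, self-contained illustration of the noise-sampling step (a toy example of my own, not the authors' code; the real implementation is in the train() method of GAN_model.py shown later), computing the statistics of Pr and drawing clipped noise looks like this:

import numpy as np

# toy stand-in for the non-dominated set Pr: 20 solutions with d = 5 decision variables
Pr = np.random.rand(20, 5)

center = np.mean(Pr, axis=0)   # mean vector of Pr
cov = np.cov(Pr.T)             # d x d covariance matrix of Pr

# draw a batch of m noise vectors from N(center, cov) and clip them to [0, 1]
m = 8
Z = np.random.multivariate_normal(center, cov, m)
Z = np.clip(Z, 0.0, 1.0)
print(Z.shape)                 # (8, 5)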

GMOEA generates offspring

To prevent the GAN from collapsing during training and degrading the quality of the generated offspring, the GAN operator is applied with the same selection probability as the traditional evolutionary operator (GA), and the offspring of both operators are used as training samples for the GAN.

To generate offspring with the GAN, a D-dimensional vector x is first drawn from the uniform distribution U(0,1). It is then transformed with the multivariate normal distribution into a vector y, and finally the upper and lower bounds are used to map it into the decision space.
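A minimal numpy sketch of the final mapping step, assuming (my own assumption, not a formula quoted from the paper) that an output in [0, 1] is rescaled linearly to the box defined by the lower and upper bounds:

import numpy as np

d = 5                                    # hypothetical number of decision variables
lower = np.zeros(d)                      # hypothetical lower bounds
upper = 2.0 * np.ones(d)                 # hypothetical upper bounds

# the generator ends with a sigmoid, so its output lies in [0, 1]^d;
# here random numbers stand in for that output
g_out = np.random.rand(10, d)

offspring = lower + g_out * (upper - lower)   # assumed linear mapping into the decision space
print(offspring.min(), offspring.max())       # values now lie within [lower, upper]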

Code running environment setup

First you need an NVIDIA GPU. The paper used a 1080 Ti; our laboratory machine has a 3080 Ti.

1. Install Anaconda

It can be downloaded from the Tsinghua University mirror; I downloaded the Anaconda3-2018-12-31-windows-x86_64 version. There is no need to install Python first, because you can specify which Python version to install when creating a virtual environment.

2. Install CUDA

There are plenty of tutorials online; CUDA 11.3 is installed here, together with the rest of the NVIDIA configuration.

3. Install cuDNN

cuDNN is NVIDIA's neural network library for CUDA. Its files need to be copied into the CUDA installation directory.

4. Install PyCharm

Debugging is needed when running the code. Spyder, which ships with Anaconda, is a MATLAB-like editor, but I personally find PyCharm easier to use; its code completion is very friendly. Any PyCharm install will do; I installed the 2021.3.3 x64 community edition.

5. Create a virtual environment

[Screenshot: creating the virtual environment]
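From the Anaconda Prompt, a typical way to create and enter such an environment (the environment name gmoea and the Python version are my own choices, not taken from the screenshot) is:

conda create -n gmoea python=3.7
conda activate gmoea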

6. Replace the project mirror source and install packages

Replace the mirror source.
[Screenshots: replacing the mirror source]
Install packages such as numpy and scipy.
[Screenshot: installing numpy, scipy and the other packages]
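If you prefer the command line, one common way to switch to the Tsinghua mirror and install the basic packages (these exact commands are my suggestion, not taken from the screenshots) is:

pip config set global.index-url https://pypi.tuna.tsinghua.edu.cn/simple
pip install numpy scipy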

7. Activate the virtual environment

Open the terminal window in PyCharm.
[Screenshot: activating the environment in the PyCharm terminal]
If activating the environment gives no response, open PowerShell as an administrator and run Set-ExecutionPolicy RemoteSigned, then try again. If that still does not work, restart PyCharm and activate with:

.\activate

Please refer to this blog for details

8. Download the GPU version of PyTorch

First of all, the PyTorch package is installed and imported as torch, not pytorch. Go directly to the PyTorch official website and pick the pip install command that matches your configuration; the page below recommends the command automatically based on the options you select.

https://pytorch.org/get-started/locally/

I chose the pip command that matches my own configuration.
[Screenshot: the install command recommended on the PyTorch website]
In the virtual environment just activated in the PyCharm terminal, run the pip command to install PyTorch.

pip3 install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu113

Then install the remaining packages such as scipy in the same way, directly with pip (torch.nn already ships with torch, so nothing extra is needed for it).
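After the installation finishes, a quick sanity check in the same environment confirms that the GPU build of torch is working:

import torch

print(torch.__version__)          # the installed torch version
print(torch.cuda.is_available())  # should print True if CUDA and the driver are set up correctly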

9. Run the GMOEA code

Unzip the GMOEA code and copy it into the newly created PyCharm project. Since the GMOEA release only contains the algorithm itself and not a run script, we need to write one ourselves.

First, here is the runnable script for solving IMF1 with GMOEA that I arrived at after two days of exploration.

from GMOEA import GMOEA
from IMMOEA_pro import *
from global_parameter import *
from EAreal import *

gp = GlobalParameter(d=30, operator=ea_real, pro=IMMOEA_F1)  # pro is the problem class to be solved
G = GMOEA(gp=gp)
population, score = G.run()
print('finish')

First, import the GMOEA package, the IMMOEA test problem package, the global parameter package, and the evolutionary algorithm package. Then create a GlobalParameter object, passing in the dimension of the decision variables, the operator used to generate offspring (here the ea_real class), and the test problem (the IMMOEA_F1 class).

It is worth mentioning that the global parameter class instantiates the problem object with self.pro = pro(d=d, m=m). The problem classes defined in IMMOEA_pro all inherit from the parent class test_problem. The parent constructor provides a default value for ref_num, but because the subclass overrides the constructor, ref_num has no default there and must be supplied explicitly.

Although the subclass constructor simply calls the parent constructor for initialization, running the program without supplying ref_num raises an error.

class IMMOEA_F1(test_problem):
    def __init__(self, m, d, ref_num):
        test_problem.__init__(self, m, d, ref_num)

So I modified the GlobalParameter class and set ref_num to 10000, following the settings in the paper.

class GlobalParameter:
    """
    This class includes the general parameters for running the algorithm.
    We can also define the class of population, which includes all the operations
    """
    def __init__(self, m=2, n=100, d=3, eva=10000, decs=None, operator=None, pro=None, run=None):
        self.m = m
        self.n = n
        self.pro = pro(d=d, m=m, ref_num=10000)     # Initialize the class of problem
        self.d = self.pro.d                         # number of decision variables (taken from the problem)
        self.lower = self.pro.lower
        self.upper = self.pro.upper
        self.boundary = (self.lower, self.upper)
        self.eva = eva
        self.operator = operator
        self.result = decs
        self.decs = []
        self.run = run

Then right-click and run main. I chose to run it with the Python console, which lets you inspect variable values after the run or while debugging. After the run, the IGD value and the number of evaluations consumed are printed automatically.
[Screenshot: console output showing the IGD value and the number of evaluations used]

Code interpretation

First of all, this article explains the various torch operations such as detach, Variable, backward, and forward; this other article is also very good.

Below is GAN_model.py with my understanding and comments added.

import torch
import torch.nn as nn
import torch.optim as optim
from torch.autograd import Variable
import random
import numpy as np


class Generator(nn.Module):
    # initializers
    def __init__(self, d, n_noise):  # 1-d vector
        super(Generator, self).__init__()
        self.linear1 = nn.Linear(n_noise, d, bias=True)
        self.bn1 = nn.BatchNorm1d(d)
        self.linear2 = nn.Linear(d, d, bias=True)
        self.bn2 = nn.BatchNorm1d(d)
        self.linear3 = nn.Linear(d, d, bias=True)
        self.bn3 = nn.BatchNorm1d(d)

    # forward method
    def forward(self, noise):
        x = torch.tanh(self.bn1(self.linear1(noise)))
        x = torch.tanh(self.bn2(self.linear2(x)))
        x = torch.sigmoid(self.bn3(self.linear3(x)))
        return x


class Discriminator(nn.Module):  # building the network with the high-level nn.Module API
    # initializers
    def __init__(self, d):
        super(Discriminator, self).__init__()  # as with any custom model, first call the parent class constructor
        self.linear1 = nn.Linear(d, d, bias=True)
        self.linear2 = nn.Linear(d, 1, bias=True)

    # forward method
    def forward(self, dec):  # forward pass of the defined network
        x = torch.tanh(self.linear1(dec))  # the decision variables go through the linear layer, then the tanh activation into the hidden layer
        x = torch.sigmoid(self.linear2(x))  # the previous layer's output goes through the linear layer, then the sigmoid activation as the output
        return x


class GAN(object):
    def __init__(self, d, batchsize, lr, epoches, n_noise):
        self.d = d
        self.n_noise = n_noise
        self.BCE_loss = nn.BCELoss()
        self.G = Generator(self.d, self.n_noise)
        self.D = Discriminator(self.d)
        self.G.cuda()  # CUDA acceleration
        self.D.cuda()
        self.G_optimizer = optim.Adam(self.G.parameters(), 4*lr)
        self.D_optimizer = optim.Adam(self.D.parameters(), lr)
        self.epoches = epoches
        self.batchsize = batchsize

    def train(self, pop_dec, labels, samples_pool):
        self.D.train()  # set the module to training mode
        self.G.train()
        n, d = np.shape(pop_dec)
        indices = np.arange(n)  # indices of the population

        center = np.mean(samples_pool, axis=0)  # mean of the sample pool
        cov = np.cov(samples_pool[:10, :].reshape((d, samples_pool[:10, :].size // d)))  # covariance matrix of the sample pool (using the first 10 samples)
        iter_no = (n + self.batchsize - 1) // self.batchsize  # number of batches = ceil(n / batchsize)

        for epoch in range(self.epoches):  # iterate for at most 200 epochs
            g_train_losses = 0

            for iteration in range(iter_no):  # train on each batch

                # train the D with real dataset
                self.D.zero_grad()  # zero the gradients
                given_x = pop_dec[iteration * self.batchsize: (1 + iteration) * self.batchsize, :]
                given_y = labels[iteration * self.batchsize: (1 + iteration) * self.batchsize]
                batch_size = np.shape(given_x)[0]

                # (Tensor, cuda, Variable)
                given_x_ = Variable(torch.from_numpy(given_x).cuda()).float()
                # convert the decision variables into a torch tensor on the GPU, then wrap it in a Variable
                given_y = Variable(torch.from_numpy(given_y).cuda()).float()
                d_results_real = self.D(given_x_.detach())  # detach the Variable from the graph so no gradient flows back into the inputs

            # train the D with fake data
                # sample noise from the multivariate Gaussian
                fake_x = np.random.multivariate_normal(center, cov, batch_size)
                # clip the generated noise to the range [0, 1]
                fake_x = torch.from_numpy(np.maximum(np.minimum(fake_x, np.ones((batch_size, self.d))),
                                                         np.zeros((batch_size, self.d))))
                # label the fake samples as 0; .cuda() allocates the tensor on the GPU
                fake_y = Variable(torch.zeros((batch_size, 1)).cuda())
                fake_x_ = Variable(fake_x.cuda()).float()  # tensor -> Variable
                g_results = self.G(fake_x_.detach())  # detach so that no gradient is propagated back through the noise here
                d_results_fake = self.D(g_results)  # then feed G's output into D for training
                # the backslash continues the statement on the next line; BCE is the binary cross-entropy loss
                d_train_loss = self.BCE_loss(d_results_real, given_y) + \
                               self.BCE_loss(d_results_fake, fake_y)  # vanilla  GAN
                d_train_loss.backward()  # backpropagate the gradients to update the parameters
                self.D_optimizer.step()  # the optimizer updates all of D's parameters

                # train the G with fake data
                self.G.zero_grad()  # zero the gradients
                fake_x = np.random.multivariate_normal(center, cov, batch_size)  # sample a new batch of noise
                fake_x = torch.from_numpy(np.maximum(np.minimum(fake_x, np.ones((batch_size, self.d))),
                                                     np.zeros((batch_size, self.d))))  # clip to the [0, 1] bounds
                fake_x_ = Variable(fake_x.cuda()).float()  # convert to a GPU tensor / Variable
                fake_y = Variable(torch.ones((batch_size, 1)).cuda())  # label the generated samples as 1 so G is trained to fool D
                g_results = self.G(fake_x_)  # generate fake samples with G
                d_results = self.D(g_results)  # let the discriminator classify G's output
                g_train_loss = self.BCE_loss(d_results, fake_y)   # vanilla GAN loss for G
                g_train_loss.backward()  # backpropagate to train G
                self.G_optimizer.step()  # update G's parameters
                g_train_losses += g_train_loss.cpu()  # accumulate G's loss
            # after each epoch, shuffle the dataset
            random.shuffle(indices)
            pop_dec = pop_dec[indices, :]

    def generate(self, sample_noises, population_size):

        self.G.eval()  # set to eval mode

        center = np.mean(sample_noises, axis=0).T  # mean value
        cov = np.cov(sample_noises.T)   # covariance
        batch_size = population_size

        noises = np.random.multivariate_normal(center, cov, batch_size)
        noises = torch.from_numpy(np.maximum(np.minimum(noises, np.ones((batch_size, self.d))),
                                                      np.zeros((batch_size, self.d))))
        decs = self.G(Variable(noises.cuda()).float()).cpu().data.numpy()
        return decs

    def discrimate(self, off):

        self.D.eval()  # set to eval mode
        batch_size = off.shape[0]
        off = off.reshape(batch_size, 1, off.shape[1])

        x = Variable(torch.from_numpy(off).cuda(), volatile=True).float()
        d_results = self.D(x).cpu().data.numpy()
        return d_results.reshape(batch_size)
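
To check the interface end to end, here is a small usage example I wrote myself; the random toy data and the hyper-parameter values are my own assumptions, not the paper's settings, and a CUDA-capable GPU is required because the class calls .cuda():

import numpy as np
from GAN_model import GAN  # assumes the file above is saved as GAN_model.py

d = 30                                          # number of decision variables (my choice)
gan = GAN(d=d, batchsize=32, lr=1e-4, epoches=10, n_noise=d)

pop_dec = np.random.rand(100, d)                # toy population in [0, 1]^d
labels = np.ones((100, 1))                      # toy labels marking every sample as "real"
samples_pool = np.random.rand(20, d)            # toy stand-in for the non-dominated set

gan.train(pop_dec, labels, samples_pool)        # train D and G on the toy data
offspring = gan.generate(samples_pool, population_size=50)
print(offspring.shape)                          # (50, 30)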


