Running the GMOEA code
Introduction
GMOEA is a paper published in IEEE Transactions on Cybernetics in 2021 by Professor Ran Cheng's group at the Southern University of Science and Technology. Its main contribution is applying GANs to the evolutionary optimization of multi-objective problems (MOPs).
Paper link
Code link
However, the EMI group released only the algorithm itself and did not provide a script for solving the test suite. I spent a day and a night picking up Python object-oriented programming to get it running. This post walks through everything systematically: the motivation of the paper, the algorithm, the experimental results, the environment setup, running the code, and reading the code. I wrote this blog with tears in my eyes.
Generative Adversarial Networks (GAN)
GAN is arguably the most representative work in deep learning of the past five years. I watched Li Mu's video on it carefully several times before I finally understood it.
Learn AI with Li Mu, paper close-reading series: GAN (Bilibili video link)
The original GAN paper, published at NIPS 2014 (PDF)
GAN source code provided by Goodfellow
Motivation behind GAN
When Goodfellow was working on GAN in 2014, he felt that deep learning did well on discriminative models but poorly on generative models. The reason is that generating data requires fitting the data distribution of the samples; maximizing the likelihood function is hard to compute, and the cost explodes as the dimension grows. Goodfellow sidesteps this difficulty by learning a model that approximates the data distribution instead, and the results are almost as good.
What is GAN
GAN is a generative model. In short, it draws a noise vector $z$ from a simple noise distribution $p_z(z)$ and maps it onto the distribution $p_{data}(x)$ of the real data $x$. Two artificial neural networks compete against each other to drive learning. The generator $G(z,\theta_g)$ takes noise as input and produces fake samples $\bar{x}$; its learnable parameters are $\theta_g$. The discriminator $D(x,\theta_d)$ takes $x$ and $\bar{x}$ as input, labeled 1 and 0 in advance, and is trained as a binary classifier to decide which samples come from the real data distribution and which come from the generator G.
GAN therefore trains the two networks G and D at the same time. When D is trained, G is held fixed, and when G is trained, D is held fixed. The value function is a two-player minimax game that represents the adversarial process:
$$\min_G \max_D V(D,G) = \mathbb{E}_{x \sim p_{data}(x)}[\log D(x)] + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z)))]$$
V is the sum of two expectations, both taken over outputs of D. The first term samples $x$ from the real data distribution and feeds it to the classifier D; D should output $D(x)=1$ as far as possible, in which case $\log D(x) = 0$. The second term feeds noise to the generator to produce fake samples $\bar{x} = G(z)$, which are then fed to D. If D correctly classifies them as 0, then $\log(1 - D(G(z))) = 0$ and the whole expression equals 0. If the classifier is imperfect and makes mistakes, the expression is negative. Optimizing the parameters $\theta_d$ of D therefore means maximizing the whole expression.
To optimize the generator G, the second term should be pushed as close to negative infinity as possible, which happens when $D(G(z))$ approaches 1. That is, V is minimized with respect to G.
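To make the sign argument concrete, here is a small sketch that evaluates the two terms of the value function on a few hand-picked discriminator outputs. The function name value_function and the sample numbers are my own illustrative assumptions, not something taken from the GAN paper's code.

```python
import math

def value_function(d_real, d_fake):
    """Empirical V(D, G): mean log D(x) plus mean log(1 - D(G(z))).

    d_real: discriminator outputs on real samples, each in (0, 1]
    d_fake: discriminator outputs on generated samples, each in [0, 1)
    """
    real_term = sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term

# A perfect discriminator: real -> 1, fake -> 0, so both logs are 0.
perfect = value_function([1.0, 1.0], [0.0, 0.0])

# An imperfect discriminator makes both terms negative.
imperfect = value_function([0.9, 0.8], [0.2, 0.3])

# A fully fooled discriminator outputs 0.5 everywhere: V = -2 * log(2).
fooled = value_function([0.5, 0.5], [0.5, 0.5])

print(perfect, imperfect, fooled)
```

The value never rises above 0, which is why D maximizes it, while G tries to drag the second term down.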
What GANs can do
Starting from a simple distribution, GAN can learn a mapping to almost any distribution you want to generate from. Examples include image-to-image translation, face swapping in pictures and videos, and voice conversion in music, as well as generating faces or flowers that do not exist.
GAN training process
First, two batches of size m are drawn, one from the noise distribution and one from the real samples, and used to update the parameters of D. This is repeated for k steps.
Second, a batch of noise of size m is sampled to train the generator G. The two phases alternate until neither network can improve any further, at which point a Nash equilibrium has been reached.
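The alternating schedule (k discriminator steps, then one generator step) can be sketched on a toy problem without any deep learning library. Everything here is an assumption made for illustration: the 1-D linear generator, the logistic discriminator, the target distribution N(3, 1), the hand-derived gradients, and the function name train_toy_gan. It only shows the order of the updates and makes no claim about convergence.

```python
import math
import random

def sigmoid(u):
    # Numerically stable logistic function.
    if u >= 0:
        return 1.0 / (1.0 + math.exp(-u))
    e = math.exp(u)
    return e / (1.0 + e)

def train_toy_gan(iterations=200, k=2, m=16, lr=0.05, seed=0):
    """Alternate k discriminator steps with one generator step.

    Toy 1-D models (illustrative assumptions):
      G(z) = a * z + b            generator, parameters a, b
      D(x) = sigmoid(w * x + c)   discriminator, parameters w, c
    Real data comes from N(3, 1), noise z from N(0, 1).
    """
    rng = random.Random(seed)
    a, b, w, c = 1.0, 0.0, 0.1, 0.0
    d_steps = g_steps = 0
    for _ in range(iterations):
        for _ in range(k):
            # One batch of m real samples and one batch of m generated samples.
            real = [rng.gauss(3.0, 1.0) for _ in range(m)]
            fake = [a * rng.gauss(0.0, 1.0) + b for _ in range(m)]
            # Gradient ascent on V = mean log D(x) + mean log(1 - D(G(z))).
            grad_w = grad_c = 0.0
            for x in real:
                p = sigmoid(w * x + c)
                grad_w += (1.0 - p) * x / m
                grad_c += (1.0 - p) / m
            for x in fake:
                p = sigmoid(w * x + c)
                grad_w -= p * x / m
                grad_c -= p / m
            w += lr * grad_w
            c += lr * grad_c
            d_steps += 1
        # One generator step: gradient descent on mean log(1 - D(G(z))).
        noise = [rng.gauss(0.0, 1.0) for _ in range(m)]
        grad_a = grad_b = 0.0
        for z in noise:
            p = sigmoid(w * (a * z + b) + c)
            # d/dx log(1 - D(x)) = -D(x) * w, chained through G(z) = a*z + b.
            grad_a += -p * w * z / m
            grad_b += -p * w / m
        a -= lr * grad_a
        b -= lr * grad_b
        g_steps += 1
    return {"a": a, "b": b, "w": w, "c": c, "d_steps": d_steps, "g_steps": g_steps}

result = train_toy_gan()
print(result)
```

With the defaults, D is updated twice as often as G, which is the role of k in the description above.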
A vivid metaphor
G plays the counterfeiter and D plays the police. At first G's counterfeiting skills are poor, and the police spot the fakes immediately. G then improves its technique; the police find the fakes harder to detect, so they improve their detection in turn. Both sides get better through this confrontation. In the end, the authors hope that the counterfeiter G wins, which is exactly the goal.
GMOEA: applying GAN to MOPs
GMOEA's motivation
The paper points out that the traditional MOP optimization framework (shown in the paper's figure) generates new solutions mainly with heuristic operators, and heuristics sometimes have difficulty producing the desired solutions.
In the situation shown in the figure, for example, no matter how the two parents p1 and p2 are crossed over, it is hard to produce an offspring near the Pareto set PS. The idea is therefore to replace the heuristic generation method with a machine learning model.
The decision space of an MOP can be regarded as a data distribution, and the PS can be obtained by sampling from this distribution. The goal is to map a random noise distribution onto the distribution of the non-dominated solutions, so a GAN is used to learn it.
GMOEA framework
First the algorithm is initialized: the parameters are set, and the networks are initialized and put into training mode. The SPEA2 algorithm is then used to identify the non-dominated solutions. The non-dominated solutions Pr serve as real samples, and the dominated solutions Pf serve as fake samples. The discriminator has to separate the real samples Pr from both the fake samples Pf and the fake samples G(z) that the generator produces from noise. The trained network is then used to generate the offspring population, followed by environmental selection.
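The split into real and fake samples can be illustrated with a plain Pareto-dominance filter. This is a simplified stand-in: SPEA2 ranks solutions by strength and density, which is not reproduced here, and the helper names dominates and split_real_fake are hypothetical, not from the GMOEA code.

```python
def dominates(f_a, f_b):
    """True if objective vector f_a Pareto-dominates f_b (minimization)."""
    no_worse = all(a <= b for a, b in zip(f_a, f_b))
    strictly_better = any(a < b for a, b in zip(f_a, f_b))
    return no_worse and strictly_better

def split_real_fake(objectives):
    """Return (real_idx, fake_idx): non-dominated vs dominated indices."""
    real_idx, fake_idx = [], []
    for i, f_i in enumerate(objectives):
        dominated = any(
            dominates(f_j, f_i) for j, f_j in enumerate(objectives) if j != i
        )
        (fake_idx if dominated else real_idx).append(i)
    return real_idx, fake_idx

# Five solutions of a two-objective minimization problem.
objs = [(1.0, 4.0), (2.0, 2.0), (4.0, 1.0), (3.0, 3.0), (5.0, 5.0)]
real_idx, fake_idx = split_real_fake(objs)

# Labels for the discriminator: 1 for Pr (real), 0 for Pf (fake).
labels = [1 if i in real_idx else 0 for i in range(len(objs))]
print(real_idx, fake_idx, labels)  # [0, 1, 2] [3, 4] [1, 1, 1, 0, 0]
```

The labels list plays the same role as the labels argument that the train method in GAN_model.py receives.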
GMOEA’s network model
The generator is a fully connected network with two hidden layers, and the discriminator is a fully connected network with one hidden layer. Both use a sigmoid activation at the output. The mean and covariance matrix of the non-dominated solutions are used to position the noise distribution.
GMOEA model training
The first step is to compute the mean and covariance matrix of Pr. The second step is to draw a batch T of size m from the training samples and then remove T from them. Next, a sample Z is drawn from the noise distribution, and T and Z are used to train the discriminator D. This step is performed k times, using the value function given alongside the algorithm in the paper. Finally, another batch of noise is sampled to train G.
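The first step, computing the mean vector and covariance matrix of Pr, looks like this in plain Python. It is a sketch for checking the numbers by hand; the function name mean_and_cov and the three sample points are made up for the example.

```python
def mean_and_cov(samples):
    """Column means and sample covariance matrix (divides by n - 1).

    samples: list of n rows, each a list of d decision variables.
    """
    n, d = len(samples), len(samples[0])
    mean = [sum(row[j] for row in samples) / n for j in range(d)]
    cov = [
        [
            sum((row[i] - mean[i]) * (row[j] - mean[j]) for row in samples) / (n - 1)
            for j in range(d)
        ]
        for i in range(d)
    ]
    return mean, cov

# Three hypothetical non-dominated solutions with two decision variables.
p_real = [[0.1, 0.2], [0.3, 0.6], [0.5, 1.0]]
mean, cov = mean_and_cov(p_real)
print(mean)
print(cov)
```

With NumPy, np.mean(samples, axis=0) and np.cov(samples, rowvar=False) compute the same quantities, since np.cov also divides by n - 1 by default.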
GMOEA generates offspring
To keep a collapsed GAN from degrading the quality of the offspring, offspring are generated either by the GAN or by the traditional genetic operator (GA), each chosen with the same probability. All of them are then used as training samples for the GAN.
When the GAN generates offspring, it first samples a D-dimensional vector x from the uniform distribution U(0,1), then uses the multivariate normal distribution to transform it into y, and finally uses the lower and upper bounds to map the result into the decision space.
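The last step, mapping a vector in the unit cube to the decision space, can be sketched as follows. I am assuming plain linear scaling with the bounds, which fits a generator whose sigmoid output lies between 0 and 1; the function name to_decision_space and the bounds are hypothetical.

```python
def to_decision_space(unit_vector, lower, upper):
    """Scale each component from [0, 1] to [lower_i, upper_i], clipping first."""
    scaled = []
    for u, lo, hi in zip(unit_vector, lower, upper):
        u = min(max(u, 0.0), 1.0)          # keep the value inside [0, 1]
        scaled.append(lo + u * (hi - lo))  # linear map onto [lo, hi]
    return scaled

lower = [0.0, -1.0, 10.0]
upper = [1.0, 1.0, 20.0]

# The third component is out of range on purpose and gets clipped to 1.0.
x = to_decision_space([0.25, 0.5, 1.2], lower, upper)
print(x)  # [0.25, 0.0, 20.0]
```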
Code running environment setup
First you need a GPU. The paper uses a 1080Ti; our lab has a 3080Ti.
1. Install Anaconda
It can be downloaded from the Tsinghua University mirror. I downloaded the Anaconda3-2018-12-31-windows-x86_64 version. There is no need to install Python first, because you can specify the Python version when creating a virtual environment.
2. Install CUDA
There are tutorials online; CUDA 11.3 is installed here. Once these NVIDIA components are configured, move on to the next step.
3. Install cuDNN
cuDNN is NVIDIA's CUDA library for deep neural networks. Its files need to be added to the CUDA installation directory.
4. Install PyCharm
Debugging is needed when running the code. Anaconda ships with Spyder, an editor similar to MATLAB, but I personally find PyCharm easier to use, and its code completion is very friendly. Any recent PyCharm will do; I installed Community Edition 2021.3.3 (x64).
5. Create a virtual environment
6. Switch the package mirror and install packages
Switch the mirror source
Install packages such as numpy and scipy
7. Activate the virtual environment
Open the terminal window in PyCharm.
If activation gives no response, open PowerShell as administrator and run Set-ExecutionPolicy RemoteSigned, then activate again. If that still does not work, restart PyCharm.
.\activate
See this blog for details
Download the GPU version of PyTorch
Note that PyTorch is installed as the torch package, not pytorch. Go to the official PyTorch website and pick the pip command that matches your configuration. The page below recommends a command based on the configuration it detects or the one you select.
https://pytorch.org/get-started/locally/
I chose the pip install for my own configuration.
In the PyCharm virtual environment activated above, run the pip command to install PyTorch.
pip3 install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu113
Then install scipy and any other missing packages the same way with pip. Note that torch.nn ships with torch and does not need a separate install.
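Before running anything, it helps to check from inside the activated environment that the packages are really there. The snippet below is a small sketch of such a check; the helper name missing_packages and the list of required packages are my own assumptions. It only touches torch if all three packages are found.

```python
import importlib.util

def missing_packages(names):
    """Return the names that cannot be imported in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]

required = ["numpy", "scipy", "torch"]
missing = missing_packages(required)

if missing:
    print("Missing packages:", ", ".join(missing))
else:
    import torch
    # torch.cuda.is_available() reports whether a usable CUDA device was found.
    print("torch", torch.__version__, "CUDA available:", torch.cuda.is_available())
```

If CUDA is reported as unavailable, the self.G.cuda() and self.D.cuda() calls in GAN_model.py will fail, so this is worth checking first.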
8. Run the GMOEA code
Unzip the GMOEA code and copy it into the newly created PyCharm project. Since GMOEA ships only the algorithm itself and no script to run it, we have to write one ourselves.
Here is my script for solving IMF1 with GMOEA. I wrote it after two days of exploration, and it runs.
from GMOEA import GMOEA
from IMMOEA_pro import *
from global_parameter import *
from EAreal import *
# pro is the class of the problem to solve
gp = GlobalParameter(d=30, operator=ea_real, pro=IMMOEA_F1)
G = GMOEA(gp=gp)
population, score = G.run()
print('finish')
The script first imports the GMOEA module, the IMMOEA test problem module, the global parameter module, and the evolutionary operator module. It then creates the global parameters, passing in the number of decision variables, the operator that generates offspring (the ea_real class here), and the test problem (the IMMOEA_F1 class).
Note that passing a problem class into the global parameters creates a problem object there. In the released code the call is self.pro = pro(d=d, m=m,). The problem classes defined in IMMOEA_pro all inherit from the parent class test_problem. The parent constructor gives ref_num a default value, but the subclasses override the constructor without that default, so ref_num is no longer initialized automatically and has to be passed in.
The author's subclass constructor does call the parent constructor for initialization, but running the program without ref_num still raises an error.
class IMMOEA_F1(test_problem):
    def __init__(self, m, d, ref_num):
        test_problem.__init__(self, m, d, ref_num)
So I modified the GlobalParameter class and set ref_num to 10000, following the settings in the paper.
class GlobalParameter:
    """
    This class includes the general parameters for running the algorithm.
    We can also define the class of population, which includes all the operations
    """
    def __init__(self, m=2, n=100, d=3, eva=10000, decs=None, operator=None, pro=None, run=None):
        self.m = m
        self.n = n
        self.pro = pro(d=d, m=m, ref_num=10000)  # Initialize the class of problem
        self.d = self.pro.d  # number of decision variables
        self.lower = self.pro.lower
        self.upper = self.pro.upper
        self.boundary = (self.lower, self.upper)
        self.eva = eva
        self.operator = operator
        self.result = decs
        self.decs = []
        self.run = run
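The ref_num error described above is ordinary Python behavior and can be reproduced in a few lines, independent of the GMOEA code. The class names BaseProblem, StrictProblem and DefaultedProblem are hypothetical stand-ins for test_problem and its subclasses.

```python
class BaseProblem:
    """Stand-in for test_problem: ref_num has a default value here."""
    def __init__(self, m=2, d=3, ref_num=10000):
        self.m, self.d, self.ref_num = m, d, ref_num

class StrictProblem(BaseProblem):
    """Same pattern as IMMOEA_F1: the override drops the default."""
    def __init__(self, m, d, ref_num):
        BaseProblem.__init__(self, m, d, ref_num)

class DefaultedProblem(BaseProblem):
    """Alternative fix: repeat the default in the subclass signature."""
    def __init__(self, m, d, ref_num=10000):
        super().__init__(m, d, ref_num)

# Calling the subclass the way the released GlobalParameter does fails,
# because only the subclass signature is checked at the call site.
try:
    StrictProblem(d=30, m=2)
    error = None
except TypeError as exc:
    error = str(exc)
print(error)

# Passing ref_num explicitly, as in the patched GlobalParameter, works.
fixed = StrictProblem(d=30, m=2, ref_num=10000)

# So does restoring the default in the subclass.
alt = DefaultedProblem(d=30, m=2)
print(fixed.ref_num, alt.ref_num)
```

I patched the caller because it is a one-line change; restoring the default in every problem class would work too, but it touches many more places.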
Then right-click and run main. I run it in the Python console, which lets you inspect variable values after the run or while debugging. When the run finishes, the IGD value and the number of evaluations used are printed automatically.
Code interpretation
First of all, this article explains the various torch operations such as detach, Variable, backward and forward. This article is also very good.
Below is GAN_model.py with my understanding added as comments.
import torch
import torch.nn as nn
import torch.optim as optim
from torch.autograd import Variable
import random
import numpy as np
class Generator(nn.Module):
    # initializers
    def __init__(self, d, n_noise):  # 1-d vector
        super(Generator, self).__init__()
        self.linear1 = nn.Linear(n_noise, d, bias=True)
        self.bn1 = nn.BatchNorm1d(d)
        self.linear2 = nn.Linear(d, d, bias=True)
        self.bn2 = nn.BatchNorm1d(d)
        self.linear3 = nn.Linear(d, d, bias=True)
        self.bn3 = nn.BatchNorm1d(d)
    # forward method
    def forward(self, noise):
        x = torch.tanh(self.bn1(self.linear1(noise)))
        x = torch.tanh(self.bn2(self.linear2(x)))
        x = torch.sigmoid(self.bn3(self.linear3(x)))
        return x
class Discriminator(nn.Module):  # high-level API way of building a network: subclass nn.Module
    # initializers
    def __init__(self, d):
        super(Discriminator, self).__init__()  # as with any custom model, first call the parent constructor
        self.linear1 = nn.Linear(d, d, bias=True)
        self.linear2 = nn.Linear(d, 1, bias=True)
    # forward method
    def forward(self, dec):  # forward pass of the network defined above
        x = torch.tanh(self.linear1(dec))  # decision variables -> linear layer -> activation (hidden layer)
        x = torch.sigmoid(self.linear2(x))  # hidden output -> linear layer -> sigmoid (network output)
        return x
class GAN(object):
    def __init__(self, d, batchsize, lr, epoches, n_noise):
        self.d = d
        self.n_noise = n_noise
        self.BCE_loss = nn.BCELoss()
        self.G = Generator(self.d, self.n_noise)
        self.D = Discriminator(self.d)
        self.G.cuda()  # move to the GPU (CUDA acceleration)
        self.D.cuda()
        self.G_optimizer = optim.Adam(self.G.parameters(), 4*lr)
        self.D_optimizer = optim.Adam(self.D.parameters(), lr)
        self.epoches = epoches
        self.batchsize = batchsize
    def train(self, pop_dec, labels, samples_pool):
        self.D.train()  # put the modules into training mode
        self.G.train()
        n, d = np.shape(pop_dec)
        indices = np.arange(n)  # index array
        center = np.mean(samples_pool, axis=0)  # mean of the sampling pool
        cov = np.cov(samples_pool[:10, :].reshape((d, samples_pool[:10, :].size // d)))  # covariance matrix from the first 10 rows of the sampling pool
        iter_no = (n + self.batchsize - 1) // self.batchsize  # number of batches: ceil(n / batchsize)
        for epoch in range(self.epoches):  # iterate for at most 200 epochs
            g_train_losses = 0
            for iteration in range(iter_no):  # train on each batch
                # train the D with real dataset
                self.D.zero_grad()  # reset gradients to zero
                given_x = pop_dec[iteration * self.batchsize: (1 + iteration) * self.batchsize, :]
                given_y = labels[iteration * self.batchsize: (1 + iteration) * self.batchsize]
                batch_size = np.shape(given_x)[0]
                # (Tensor, cuda, Variable)
                given_x_ = Variable(torch.from_numpy(given_x).cuda()).float()
                # convert the decision variables to a torch tensor on the GPU, then wrap it as a Variable
                given_y = Variable(torch.from_numpy(given_y).cuda()).float()
                d_results_real = self.D(given_x_.detach())  # detach(): cut the tensor off from the graph so no gradient flows back into x
                # train the D with fake data
                # sample noise from the multivariate Gaussian
                fake_x = np.random.multivariate_normal(center, cov, batch_size)
                # clip the noise to the range [0, 1]
                fake_x = torch.from_numpy(np.maximum(np.minimum(fake_x, np.ones((batch_size, self.d))),
                                                     np.zeros((batch_size, self.d))))
                # labels of 0 for the fake samples; .cuda() places the tensor on the GPU
                fake_y = Variable(torch.zeros((batch_size, 1)).cuda())
                fake_x_ = Variable(fake_x.cuda()).float()  # torch tensor to Variable
                g_results = self.G(fake_x_.detach())  # the noise needs no gradient, so nothing is back-propagated past it
                d_results_fake = self.D(g_results)  # feed G's output into D
                # the backslash continues the line; BCELoss is the binary cross-entropy loss
                d_train_loss = self.BCE_loss(d_results_real, given_y) + \
                               self.BCE_loss(d_results_fake, fake_y)  # vanilla GAN
                d_train_loss.backward()  # back-propagate to compute the gradients
                self.D_optimizer.step()  # the optimizer updates the parameters of D
                # train the G with fake data
                self.G.zero_grad()  # reset gradients
                fake_x = np.random.multivariate_normal(center, cov, batch_size)  # sample a new batch of noise
                fake_x = torch.from_numpy(np.maximum(np.minimum(fake_x, np.ones((batch_size, self.d))),
                                                     np.zeros((batch_size, self.d))))  # clip to the bounds
                fake_x_ = Variable(fake_x.cuda()).float()  # convert to a tensor
                fake_y = Variable(torch.ones((batch_size, 1)).cuda())  # label the generated samples as 1 (real)
                g_results = self.G(fake_x_)  # generate samples with G
                d_results = self.D(g_results)  # let the discriminator classify G's output
                g_train_loss = self.BCE_loss(d_results, fake_y)  # vanilla GAN loss: the generator's loss
                g_train_loss.backward()  # back-propagate
                self.G_optimizer.step()  # update the parameters of G
                g_train_losses += g_train_loss.cpu()  # accumulate G's loss
            # after each epoch, shuffle the dataset
            # (note: labels is not reordered with the same indices in this listing)
            random.shuffle(indices)
            pop_dec = pop_dec[indices, :]
    def generate(self, sample_noises, population_size):
        self.G.eval()  # set to eval mode
        center = np.mean(sample_noises, axis=0).T  # mean value
        cov = np.cov(sample_noises.T)  # covariance
        batch_size = population_size
        noises = np.random.multivariate_normal(center, cov, batch_size)
        noises = torch.from_numpy(np.maximum(np.minimum(noises, np.ones((batch_size, self.d))),
                                             np.zeros((batch_size, self.d))))
        decs = self.G(Variable(noises.cuda()).float()).cpu().data.numpy()
        return decs
    def discrimate(self, off):
        self.D.eval()  # set to eval mode
        batch_size = off.shape[0]
        off = off.reshape(batch_size, 1, off.shape[1])
        # volatile=True is a pre-0.4 PyTorch flag; newer versions ignore it with a warning
        # (torch.no_grad() is the current way to disable gradient tracking)
        x = Variable(torch.from_numpy(off).cuda(), volatile=True).float()
        d_results = self.D(x).cpu().data.numpy()
        return d_results.reshape(batch_size)
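The batching arithmetic in train() is easy to check in isolation. The sketch below reproduces only the index computation; the helper name batch_slices is mine and does not exist in GAN_model.py.

```python
def batch_slices(n, batchsize):
    """Yield (start, stop) index pairs the way train() slices pop_dec."""
    iter_no = (n + batchsize - 1) // batchsize      # ceiling of n / batchsize
    for iteration in range(iter_no):
        start = iteration * batchsize
        stop = min((1 + iteration) * batchsize, n)  # slicing past n is truncated
        yield start, stop

slices = list(batch_slices(n=100, batchsize=32))
print(slices)  # [(0, 32), (32, 64), (64, 96), (96, 100)]
```

The last batch is usually smaller than the others. As far as I know, BatchNorm1d in training mode rejects a batch that contains a single row, so a population size that leaves a remainder of 1 is worth avoiding.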