@article{das2011differential,
title={Differential Evolution: A Survey of the State-of-the-Art},
author={Das, Swagatam and Suganthan, P N},
journal={IEEE Transactions on Evolutionary Computation},
volume={15},
number={1},
pages={4--31},
year={2011}}
General
These are notes on a survey of Differential Evolution (DE). Since I am not familiar with this class of methods, I can only make a record.
Main content
Consider the following problem: minimize \(f(X)\), where \(X = (x_1, \ldots, x_D)\).
As far as I know, methods such as gradient descent and Bayesian optimization can handle such problems, but there are also evolutionary algorithms (EAs), evolutionary programming (EP), evolution strategies (ESs), genetic algorithms (GAs), and the DE introduced in this article (I do not understand the basics of the latter ones).
DE/rand/1/bin
First, the original form, which is called DE/rand/1/bin:
Input: scale factor \(F\), crossover rate \(Cr\), population size \(NP\).
1: Set \(G = 0\), and randomly initialize \(P_G = \{X_{1,G}, \ldots, X_{NP,G}\}\).
2: While the stopping criterion is not satisfied, do:
- For \(i=1,\ldots, NP\) do:
- Mutation step: generate the donor vector \(V_{i,G} = X_{r_1^i,G} + F \cdot (X_{r_2^i,G} - X_{r_3^i,G})\).
- Crossover step: generate \(U_{i,G} = (u_{1,i,G}, \ldots, u_{D,i,G})\) as follows: \(u_{j,i,G} = v_{j,i,G}\) if \(\mathrm{rand}[0,1] \le Cr\) or \(j = j_{rand}\), and \(u_{j,i,G} = x_{j,i,G}\) otherwise.
- Selection step: set \(X_{i,G+1} = U_{i,G}\) if \(f(U_{i,G}) \le f(X_{i,G})\), and \(X_{i,G+1} = X_{i,G}\) otherwise.
- End For.
- \(G=G+1\).
End While.
Here \(X_{i,G} = (x_{1,i,G}, \ldots, x_{D,i,G})\); \(j_{rand}\) is an integer drawn uniformly from \([1, D]\), which ensures that \(U_{i,G}\) differs from \(X_{i,G}\) in at least one component; and \(X_{r_1^i,G}, X_{r_2^i,G}, X_{r_3^i,G}\) are drawn at random from \(P_G\) and are mutually distinct.
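The scheme above can be condensed into a short self-contained NumPy sketch (the test function, bounds, seed, and parameter values here are my own choices for illustration, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

def de_rand_1_bin(f, D, F=0.5, Cr=0.9, NP=20, generations=100):
    # Initialize P_0 uniformly in [-5, 5]^D (bounds are my own choice).
    P = rng.uniform(-5, 5, size=(NP, D))
    fitness = np.array([f(x) for x in P])
    for _ in range(generations):
        for i in range(NP):
            # Mutation: three distinct indices r1, r2, r3, all != i.
            r1, r2, r3 = rng.choice([j for j in range(NP) if j != i],
                                    size=3, replace=False)
            V = P[r1] + F * (P[r2] - P[r3])
            # Binomial crossover with a guaranteed component j_rand.
            jrand = rng.integers(D)
            mask = rng.random(D) < Cr
            mask[jrand] = True
            U = np.where(mask, V, P[i])
            # Selection: keep the better of the trial U and the target X_i.
            fU = f(U)
            if fU <= fitness[i]:
                P[i], fitness[i] = U, fU
    return P[np.argmin(fitness)], fitness.min()

best, fbest = de_rand_1_bin(lambda x: float(np.sum(x ** 2)), D=5)
```

On the sphere function \(f(X) = \sum_j x_j^2\) this quickly drives the best fitness toward zero.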
Many variants appear below, and they are usually variants of the mutation step or the crossover step.
DE/?/?/?
In the naming DE/x/y/z, x indicates how the base vector is chosen (e.g. rand, best), y is the number of difference vectors, and z is the crossover scheme (bin = binomial, exp = exponential).
DE/rand/1/exp
This is a variant of the crossover step:
Randomly draw integers \(n\) and \(L\) from \([1, D]\); then \(u_{j,i,G} = v_{j,i,G}\) for \(j = \langle n \rangle_D, \langle n+1 \rangle_D, \ldots, \langle n+L-1 \rangle_D\), and \(u_{j,i,G} = x_{j,i,G}\) otherwise, where \(\langle \cdot \rangle_D\) denotes the index taken modulo \(D\).
\(L\) can be generated by the following steps:
- \(L=1\)
- while \(\mathrm{rand}[0,1] \le Cr\) and \(L < D\): \(L = L+1\)
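The exponential crossover above can be sketched as follows (a minimal illustration, assuming the usual DE convention that the copied block wraps around modulo \(D\); the function name and seed are my own):

```python
import numpy as np

rng = np.random.default_rng(0)

def exp_crossover(v, x, Cr):
    """Exponential crossover: copy a contiguous (modulo-D) block of
    components from the donor v into the target x."""
    D = len(v)
    n = rng.integers(D)              # random start index in [0, D)
    L = 1                            # at least one component comes from v
    while rng.random() <= Cr and L < D:
        L += 1
    u = x.copy()
    for t in range(L):
        u[(n + t) % D] = v[(n + t) % D]
    return u, n, L
```

Unlike binomial crossover, the inherited components form one contiguous run, so nearby coordinates tend to be exchanged together.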
DE/best/1
The mutation step becomes \(V_{i,G} = X_{best,G} + F \cdot (X_{r_1^i,G} - X_{r_2^i,G})\), where \(X_{best,G}\) is the best point in \(P_G\).
DE/best/2
Here the mutation step is \(V_{i,G} = X_{best,G} + F \cdot (X_{r_1^i,G} - X_{r_2^i,G}) + F \cdot (X_{r_3^i,G} - X_{r_4^i,G})\).
DE/rand/2
Here the mutation step is \(V_{i,G} = X_{r_1^i,G} + F \cdot (X_{r_2^i,G} - X_{r_3^i,G}) + F \cdot (X_{r_4^i,G} - X_{r_5^i,G})\).
Selection of hyperparameters
I did not look at this part closely. The paper briefly covers a few recommendations; much remains to check.
\(F\) selection
Some recommend \([0.4, 1]\) (with 0.5 best), some recommend \(0.6\), and some recommend \([0.4, 0.95]\) (with 0.9 best).
There are also some adaptive options, such as
What puzzles me: isn't \(\left|\frac{f_{\max}}{f_{\min}}\right|\) always greater than or equal to 1? (Presumably not when \(f\) takes values of both signs.)
Here \(F_l\) and \(F_u\) are the lower and upper bounds for \(F\), respectively.
\(NP\) selection
Some recommend \([5D, 10D]\), others \([3D, 8D]\).
\(Cr\) selection
Some recommend \([0.3, 0.9]\).
and also
Some further variants
A
If \(\mathrm{rand}[0,1] < \Gamma\) (\(\Gamma\) is given):
otherwise
B
Here \(k_i\) is given, and \(F' = k_i \cdot F\).
C
D
That is, when considering \(x \in [a, b]\) (with \([a, b]\) a fixed range), we also consider its opposite \(a + b - x\); the opposite of \(X\) is constructed component-wise in the same way.
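A minimal sketch of this opposition construction (the bounds and the test point are made up for illustration):

```python
import numpy as np

def opposite(X, a, b):
    # Component-wise opposite of X within the box [a, b]: a + b - X.
    return a + b - X

a = np.array([-10.0, 0.0])   # per-dimension lower bounds
b = np.array([10.0, 4.0])    # per-dimension upper bounds
X = np.array([3.0, 1.0])
Xo = opposite(X, a, b)       # -> [-3., 3.]
```

Note that the construction is an involution: taking the opposite twice returns the original point.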
E
Here \(X_{n_{best},G}\) denotes the best point among the \(n\) nearest neighbors of \(X_{i,G}\), with \(p, q \in [i-k, i+k]\); and \(X_{g_{best},G}\) is the best point in \(P_G\).
G
The remaining material on applications in complex environments is not recorded here (it just describes how to do it).
Some disadvantages
- High-dimensional problems are hard to handle;
- It is easily deceived by some problems into settling in a local optimum;
- It does not work well on functions that cannot be decomposed (non-separable functions);
- Its moves are often not large (that is, insufficient exploration);
- It lacks a theoretical guarantee of convergence.
Code
\(f(x,y)=x^2+50y^2\).
{
"dim": 2,
"F": 0.5,
"NP": 5,
"Cr": 0.35
}
"""
de.py
"""
import numpy as np
from scipy import stats
import random
class Parameter:
def __init__(self, dim, xmin, xmax):
self.dim = dim
self.xmin = xmin
self.xmax = xmax
self.initial()
    def initial(self):
        # Sample each coordinate uniformly from [xmin, xmax].
        self.para = stats.uniform.rvs(
            self.xmin, self.xmax - self.xmin, size=self.dim
        )
@property
def data(self):
return self.para
def __getitem__(self, item):
return self.para[item]
def __setitem__(self, key, value):
self.para[key] = value
def __len__(self):
return len(self.para)
def __add__(self, other):
return self.para + other
def __mul__(self, other):
return self.para * other
def __pow__(self, power):
return self.para ** power
def __neg__(self):
return -self.para
def __sub__(self, other):
return self.para - other
def __truediv__(self, other):
return self.para / other
class DE:
    def __init__(self, func, dim, F=0.5, NP=50,
Cr=0.35, xmin=-10, xmax=10,
require_history=True):
self.func = func
self.dim = dim
self.F = F
self.NP = NP
self.Cr = Cr
self.xmin = np.array(xmin)
self.xmax = np.array(xmax)
        assert np.all(self.xmin <= self.xmax), "Invalid xmin or xmax"
self.require_history = require_history
self.init_x()
if self.require_history:
self.build_history()
def init_x(self):
self.paras = [Parameter(self.dim, self.xmin, self.xmax)
for i in range(self.NP)]
@property
def data(self):
return [para.data for para in self.paras]
def build_history(self):
self.paras_history = [self.data]
def add_history(self):
self.paras_history.append(self.data)
def choose(self, size=3):
return random.sample(self.paras, k=size)
def mutation(self):
x1, x2, x3 = self.choose(3)
return x1 + self.F * (x2 - x3)
    def crossover(self, v, x):
        u = np.zeros_like(v)
        # Fix j_rand once per trial vector so that at least one component
        # is always inherited from the donor v.
        jrand = random.randint(0, self.dim - 1)
        for i, _ in enumerate(v):
            if np.random.rand() < self.Cr or i == jrand:
                u[i] = v[i]
            else:
                u[i] = x[i]
        return u
    def selection(self, u, x):
        # Greedy selection: replace the target only if the trial is better.
        if self.func(u) < self.func(x):
            x.para = u
def step(self):
donors = [self.mutation()
for i in range(self.NP)]
for i, donor in enumerate(donors):
x = self.paras[i]
u = self.crossover(donor, x)
self.selection(u, x)
if self.require_history:
self.add_history()
def multi_steps(self, times):
for i in range(times):
self.step()
class DEbest1(DE):
    def bestone(self):
        # The best individual is the one with the smallest objective value.
        y = np.array([self.func(para)
                      for para in self.paras])
        return self.paras[np.argmin(y)]
def mutation(self, bestone):
x1, x2 = self.choose(2)
return bestone + self.F * (x1 - x2)
def step(self):
bestone = self.bestone()
donors = [self.mutation(bestone)
for i in range(self.NP)]
for i, donor in enumerate(donors):
x = self.paras[i]
u = self.crossover(donor, x)
self.selection(u, x)
if self.require_history:
self.add_history()
class DEbest2(DEbest1):
def mutation(self, bestone):
x1, x2, x3, x4 = self.choose(4)
return bestone + self.F * (x1 - x2) \
+ self.F * (x3 - x4)
class DErand2(DE):
def mutation(self):
x1, x2, x3, x4, x5 = self.choose(5)
return x1 + self.F * (x2 - x3) \
+ self.F * (x4 - x5)
class DErandTM(DE):
    def mutation(self):
        # Trigonometric mutation: the weights p_i come from the absolute
        # objective values of the three chosen individuals.
        x = self.choose(3)
        y = np.abs(np.array(list(map(self.func, x))))
        p = y / y.sum()
        part1 = (x[0] + x[1] + x[2]) / 3
        part2 = (p[1] - p[0]) * (x[0] - x[1])
        part3 = (p[2] - p[1]) * (x[1] - x[2])
        part4 = (p[0] - p[2]) * (x[2] - x[0])
        return part1 + part2 + part3 + part4