Differential Evolution: A Survey of the State-of-the-Art


Das S, Suganthan P N. Differential Evolution: A Survey of the State-of-the-Art[J]. IEEE Transactions on Evolutionary Computation, 2011, 15(1): 4-31.

@article{das2011differential,
title={Differential Evolution: A Survey of the State-of-the-Art},
author={Das, Swagatam and Suganthan, P N},
journal={IEEE Transactions on Evolutionary Computation},
volume={15},
number={1},
pages={4--31},
year={2011}}

General

This paper is a survey of Differential Evolution (DE). Since I am not familiar with this family of methods, these are just my reading notes.

Main content

Consider the following problem:

\[\min \: f(X), \]

where \(X = (x_1, \ldots, x_D)\).

As far as I know, methods such as gradient descent and Bayesian optimization can be applied to this kind of problem, but there are also evolutionary approaches such as evolutionary algorithms (EAs), evolutionary programming (EP), evolution strategies (ESs), genetic algorithms (GAs), and the DE surveyed in this paper (I do not understand the others beyond the basics).

DE/rand/1/bin

First, the original form, called DE/rand/1/bin:

Input: scale factor \(F\), crossover rate \(Cr\), population size \(NP\).
1: Set \(G = 0\), and randomly initialize \(P_G = \{X_{1,G}, \ldots, X_{NP,G}\}\).
2: While the stopping criterion is not satisfied, do:

  • For \(i=1,\ldots, NP\) do:
  1. Mutation step:

\[V_{i,G} = X_{r_1^i,G} + F \cdot (X_{r_2^i,G} - X_{r_3^i,G}). \]

  2. Crossover step: generate \(U_{i,G} = (u_{1,i,G}, \ldots, u_{D,i,G})\) as follows:

\[u_{j,i,G} = \left \{ \begin{array}{ll} v_{j,i,G} & if \: \mathrm{rand}[0,1] \le Cr \: or \: j=j_{rand} \\ x_{j,i,G} & otherwise. \end{array} \right. \]

  3. Selection step:

\[X_{i,G+1} = \left \{ \begin{array}{ll} U_{i,G} & if \: f(U_{i,G}) \le f(X_{i,G}) \\ X_{i,G} & otherwise. \end{array} \right. \]

  • End For.
  • \(G=G+1\).
  • End While.

where \(X_{i,G} = (x_{1,i,G}, \ldots, x_{D,i,G})\), \(j_{rand}\) is an integer drawn at random from \([1, D]\) to ensure that \(U_{i,G}\) changes at least one coordinate of \(X_{i,G}\), and \(X_{r_1^i,G}, X_{r_2^i,G}, X_{r_3^i,G}\) are distinct members drawn at random from \(P_G\).

Below we will see many variants, and they are typically variants of the mutation step and/or the crossover step.

DE/?/?/?

DE/rand/1/exp

This is a variant of the crossover step:

Randomly draw integers \(n\) and \(L\) from \([1, D]\); then (with indices taken modulo \(D\) so that they stay within \([1, D]\))

\[u_{j,i,G} = \left \{ \begin{array}{ll} v_{j,i,G} & if \: n \le j \le n+L-1\\ x_{j,i,G} & otherwise. \end{array} \right. \]

\(L\) can be generated by the following steps (a code sketch follows the list):

  • \(L=0\)
  • do \(L = L+1\) while \(\mathrm{rand}[0,1] \le Cr\) and \(L < D\).
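As a rough sketch (the function name and the 0-based indexing are my own), the exponential crossover could look like this in Python:

import random
import numpy as np

def exp_crossover(v, x, Cr):
    # Copy a contiguous (wrapping) block of length L from the donor v
    # into the target x, starting at a random position n.
    D = len(x)
    u = np.array(x, dtype=float)
    n = random.randrange(D)               # random start index
    L = 0
    while True:                           # do-while: L ends up in [1, D]
        L += 1
        if not (random.random() <= Cr and L < D):
            break
    for j in range(n, n + L):
        u[j % D] = v[j % D]               # indices wrap modulo D
    return u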

DE/best/1

\[V_{i,G}=X_{best,G} + F\cdot (X_{r_1^i,G} - X_{r_2^i,G}), \]

where \(X_{best,G}\) is the best point in \(P_G\).

DE/best/2

\[V_{i,G}=X_{best,G} + F\cdot (X_{r_1^i,G} - X_{r_2^i,G}) + F\cdot (X_{r_3^i,G} - X_{r_4^i,G}). \]

DE/rand/2

\[V_{i,G}=X_{r_1^i,G} + F\cdot (X_{r_2^i,G} - X_{r_3^i,G}) + F\cdot (X_{r_4^i,G} - X_{r_5^i,G}). \]

Selection of hyperparameters

I did not look at this part closely. The paper only sketches a few recommendations, and there is much more to check in the cited references.

\(F\) selection

Some recommend \([0.4, 1]\) (with 0.5 as the best choice), some recommend \(0.6\), and some recommend \([0.4, 0.95]\) (with 0.9 as the best choice).

There are also adaptive schemes, such as

\[F = \left \{ \begin{array}{ll} \max \{l_{\min}, 1- |\frac{f_{\max}}{f_{\min}}|\} & if \: |\frac{f_{\max}}{f_{\min}}|<1 \\ \max \{l_{\min}, 1- |\frac{f_{\min}}{f_{\max}}|\} & otherwise, \end{array} \right. \]

At first this puzzled me: isn't \(|\frac{f_{\max}}{f_{\min}}|\) always at least 1? It is not, because \(f_{\max}\) and \(f_{\min}\) are objective values and can differ in sign; for example, \(f_{\min}=-10\) and \(f_{\max}=1\) give a ratio of magnitude \(0.1\), so both branches can occur.

\[F_{i,G+1} = \left \{ \begin{array}{ll} \mathrm{rand}[F_l, F_{u}] & with \: probability \:\tau \\ F_{i,G} & else, \end{array} \right. \]

where \(F_l\) and \(F_u\) are the lower and upper bounds for \(F\), respectively.

\(NP\) selection

Some recommend \([5D, 10D]\), others \([3D, 8D]\).

\(Cr\) selection

Some recommend \([0.3, 0.9]\).

There is also an adaptive rule analogous to the one for \(F\):

\[Cr_{i,G+1} = \left \{ \begin{array}{ll} \mathrm{rand}[0, 1] & with \: probability \:\tau \\ Cr_{i,G} & else, \end{array} \right. \]
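A minimal sketch of these two self-adaptive rules in Python (the defaults \(F_l=0.1\), \(F_u=0.9\), \(\tau=0.1\) are commonly cited values, my assumption here):

import numpy as np

def adapt_F_Cr(F, Cr, F_l=0.1, F_u=0.9, tau=0.1):
    # With probability tau, resample each control parameter;
    # otherwise the individual inherits its previous value.
    if np.random.rand() < tau:
        F = F_l + np.random.rand() * (F_u - F_l)   # rand[F_l, F_u]
    if np.random.rand() < tau:
        Cr = np.random.rand()                      # rand[0, 1]
    return F, Cr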

Some further variants

A

\[p=f(X_{r_1})+f(X_{r_2}) + f(X_{r_3}), \\ p_1 = f(X_{r_1})/p, \\ p_2 = f(X_{r_2}) / p, \\ p_3 = f(X_{r_3}) / p. \]

If \(\mathrm{rand}[0,1] < \Gamma\) (\(\Gamma\) is given):

\[\begin{array}{ll} V_{i,G+1} = & (X_{r_1}+X_{r_2}+X_{r_3})/3 +(p_2-p_1)(X_{r_1}-X_{r_2}) \\ &+ (p_3-p_2)(X_{r_2} - X_{r_3}) + (p_1-p_3) (X_{r_3}- X_{r_1}), \end{array} \]

otherwise

\[V_{i,G+1} = X_{r_1} + F \cdot (X_{r_2}-X_{r_3}). \]
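A sketch of this gate, building on the DErandTM and DE classes from the code section at the end (the class name and the default \(\Gamma = 0.05\) are my own choices):

class DErandTMGated(DErandTM):
    # With probability gamma use the trigonometric donor (variant A),
    # otherwise fall back to the plain DE/rand/1 donor.

    def __init__(self, *args, gamma=0.05, **kwargs):
        super().__init__(*args, **kwargs)
        self.gamma = gamma

    def mutation(self):
        if np.random.rand() < self.gamma:
            return super().mutation()   # trigonometric donor
        return DE.mutation(self)        # DE/rand/1 donor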

B

\[U_{i,G}=X_{i, G}+k_i \cdot (X_{r_1,G}-X_{i,G})+F' \cdot (X_{r_2,G}-X_{r_3, G}), \]

where \(k_i\) is given and \(F' = k_i \cdot F\).
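A rough sketch of this variant as a subclass of the DE class from the code section at the end (I assume \(k_i \sim \mathrm{rand}[0,1]\), drawn fresh for each trial vector; note that \(U_{i,G}\) is produced directly, so there is no separate crossover step):

class DECurrentToRand(DE):

    def step(self):
        # Mutation and crossover are fused: u is the trial vector itself.
        for i in range(self.NP):
            x = self.paras[i]
            r1, r2, r3 = self.choose(3)
            k = np.random.rand()            # assumed: k_i ~ rand[0, 1]
            u = x.data + k * (r1 - x.data) + k * self.F * (r2 - r3)
            self.selection(u, x)
        if self.require_history:
            self.add_history()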

C

(The update formula for this variant was given only as an image in the original post.)

D

That is, when considering a point \(x\), we also consider its opposite \(a + b - x\), assuming \(x \in [a, b]\) with \([a, b]\) a fixed range; the opposite of a vector \(X\) is constructed coordinate-wise in the same way.
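A tiny sketch of the opposite point (names are mine):

import numpy as np

def opposite(x, a, b):
    # Opposite (reflected) point of x within [a, b], coordinate-wise.
    return a + b - x

# e.g. opposite(np.array([1.0, -3.0]), -10, 10) gives array([-1., 3.])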

E

The two figures here gave the local and global donor vectors; in the survey's notation,

\[L_{i,G} = X_{i,G} + \alpha \cdot (X_{n_{best},G} - X_{i,G}) + \beta \cdot (X_{p,G} - X_{q,G}), \]

\[g_{i,G} = X_{i,G} + \alpha \cdot (X_{g_{best},G} - X_{i,G}) + \beta \cdot (X_{r_1,G} - X_{r_2,G}), \]

where \(X_{n_{best},G}\) is the best point among the neighbors of \(X_{i,G}\) on a ring topology (indices \(p, q \in [i-k, i+k]\), \(p \ne q \ne i\)), and \(X_{g_{best},G}\) is the best point in \(P_G\). The two donors are then blended:

\[V_{i,G}= w \cdot g_{i, G} + (1-w) \cdot L_{i, G}. \]
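A rough sketch of this scheme as a subclass of the DE class from the code section at the end (the defaults \(\alpha = \beta = 0.8\), \(w = 0.5\), \(k = 2\) are my own choices, and I assume \(NP > 2k\)):

class DEGL(DE):

    def __init__(self, *args, alpha=0.8, beta=0.8, w=0.5, k=2, **kwargs):
        super().__init__(*args, **kwargs)
        self.alpha, self.beta, self.w, self.k = alpha, beta, w, k

    def mutation_at(self, i, data, y):
        # Ring neighborhood [i-k, i+k], indices wrapping around.
        hood = [(i + j) % self.NP for j in range(-self.k, self.k + 1)]
        nbest = min(hood, key=lambda j: y[j])            # local best
        p, q = random.sample([j for j in hood if j != i], 2)
        r1, r2 = random.sample(
            [j for j in range(self.NP) if j != i], 2)
        gbest = int(np.argmin(y))                        # global best
        L = data[i] + self.alpha * (data[nbest] - data[i]) \
                    + self.beta * (data[p] - data[q])
        g = data[i] + self.alpha * (data[gbest] - data[i]) \
                    + self.beta * (data[r1] - data[r2])
        return self.w * g + (1 - self.w) * L

    def step(self):
        data = [para.data for para in self.paras]
        y = [self.func(x) for x in data]
        donors = [self.mutation_at(i, data, y) for i in range(self.NP)]
        for i, donor in enumerate(donors):
            u = self.crossover(donor, self.paras[i])
            self.selection(u, self.paras[i])
        if self.require_history:
            self.add_history()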

G

(The formula for this variant also appeared only as an image in the original post.)

I did not record the remaining material on applications in complex environments (it mostly describes how each is done).

Some disadvantages

  1. High-dimensional problems are hard to handle;
  2. it can be deceived by certain problems and get stuck at a local optimum;
  3. it does not work well on functions that cannot be decomposed (non-separable functions);
  4. the search steps are often not large enough (i.e., insufficient exploration);
  5. theoretical convergence guarantees are lacking.

Code

The test function is \(f(x,y)=x^2+50y^2\), run with the following configuration:

{
  "dim": 2,
  "F": 0.5,
  "NP": 5,
  "Cr": 0.35
}


"""
de.py
"""

import numpy as np
from scipy import stats
import random




class Parameter:

    def __init__(self, dim, xmin, xmax):
        self.dim = dim
        self.xmin = xmin
        self.xmax = xmax
        self.initial()

    def initial(self):
        # Sample each coordinate uniformly from [xmin, xmax).
        self.para = stats.uniform.rvs(
            self.xmin, self.xmax - self.xmin, size=self.dim
        )

    @property
    def data(self):
        return self.para

    @staticmethod
    def _unwrap(other):
        # Let arithmetic work with both Parameter objects and raw arrays.
        return other.para if isinstance(other, Parameter) else other

    def __getitem__(self, item):
        return self.para[item]

    def __setitem__(self, key, value):
        self.para[key] = value

    def __len__(self):
        return len(self.para)

    def __add__(self, other):
        return self.para + self._unwrap(other)

    def __mul__(self, other):
        return self.para * self._unwrap(other)

    def __pow__(self, power):
        return self.para ** power

    def __neg__(self):
        return -self.para

    def __sub__(self, other):
        return self.para - self._unwrap(other)

    def __truediv__(self, other):
        return self.para / self._unwrap(other)


class DE:

    def __init__(self, func, dim, F=0.5, NP=50,
                 Cr=0.35, xmin=-10, xmax=10,
                 require_history=True):
        self.func = func
        self.dim = dim
        self.F = F
        self.NP = NP
        self.Cr = Cr
        self.xmin = np.array(xmin)
        self.xmax = np.array(xmax)
        assert np.all(self.xmin <= self.xmax), "Invalid xmin or xmax"
        self.require_history = require_history
        self.init_x()
        if self.require_history:
            self.build_history()


    def init_x(self):
        self.paras = [Parameter(self.dim, self.xmin, self.xmax)
                      for i in range(self.NP)]

    @property
    def data(self):
        return [para.data for para in self.paras]

    def build_history(self):
        self.paras_history = [self.data]

    def add_history(self):
        self.paras_history.append(self.data)

    def choose(self, size=3):
        # Draw `size` distinct members at random (this sketch does not
        # exclude the current target vector).
        return random.sample(self.paras, k=size)

    def mutation(self):
        # DE/rand/1 donor: V = X_r1 + F * (X_r2 - X_r3).
        x1, x2, x3 = self.choose(3)
        return x1 + self.F * (x2 - x3)

    def crossover(self, v, x):
        # Binomial crossover: take the donor coordinate with probability Cr;
        # jrand forces at least one coordinate to come from the donor.
        u = np.zeros_like(v)
        jrand = random.randint(0, self.dim - 1)
        for j in range(self.dim):
            if np.random.rand() <= self.Cr or j == jrand:
                u[j] = v[j]
            else:
                u[j] = x[j]
        return u

    def selection(self, u, x):
        # Greedy selection: keep the trial vector if it is no worse.
        if self.func(u) <= self.func(x.data):
            x.para = u

    def step(self):
        donors = [self.mutation()
                  for i in range(self.NP)]

        for i, donor in enumerate(donors):
            x = self.paras[i]
            u = self.crossover(donor, x)
            self.selection(u, x)
        if self.require_history:
            self.add_history()

    def multi_steps(self, times):
        for i in range(times):
            self.step()





class DEbest1(DE):

    def bestone(self):
        # Best member = lowest objective value (we are minimizing).
        y = np.array([self.func(para.data)
             for para in self.paras])
        return self.paras[np.argmin(y)]

    def mutation(self, bestone):
        x1, x2 = self.choose(2)
        return bestone + self.F * (x1 - x2)

    def step(self):
        bestone = self.bestone()
        donors = [self.mutation(bestone)
                  for i in range(self.NP)]

        for i, donor in enumerate(donors):
            x = self.paras[i]
            u = self.crossover(donor, x)
            self.selection(u, x)
        if self.require_history:
            self.add_history()

class DEbest2(DEbest1):

    def mutation(self, bestone):
        x1, x2, x3, x4 = self.choose(4)
        return bestone + self.F * (x1 - x2) \
                + self.F * (x3 - x4)

class DErand2(DE):

    def mutation(self):
        x1, x2, x3, x4, x5 = self.choose(5)
        return x1 + self.F * (x2 - x3) \
                + self.F * (x4 - x5)


class DErandTM(DE):

    def mutation(self):
        # Trigonometric mutation (variant A above, without the Gamma gate).
        x1, x2, x3 = (m.data for m in self.choose(3))
        y = np.array([self.func(x1), self.func(x2), self.func(x3)])
        p = y / y.sum()
        part1 = (x1 + x2 + x3) / 3
        part2 = (p[1] - p[0]) * (x1 - x2)
        part3 = (p[2] - p[1]) * (x2 - x3)
        part4 = (p[0] - p[2]) * (x3 - x1)
        return part1 + part2 + part3 + part4
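
Finally, a quick driver sketch using the configuration above (hypothetical code, assuming the classes above are saved as de.py):

import numpy as np
from de import DE

def f(x):
    # Test function f(x, y) = x^2 + 50 y^2.
    return x[0] ** 2 + 50 * x[1] ** 2

de = DE(f, dim=2, F=0.5, NP=5, Cr=0.35, xmin=-10, xmax=10)
de.multi_steps(100)
best = min(de.data, key=f)
print(best, f(best))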

Origin www.cnblogs.com/MTandHJ/p/12695069.html