Finding the Maximum of a Function with a Genetic Algorithm

Table of contents

1. About genetic algorithm

2. Steps of Genetic Algorithm

3. Code implementation

3.1 Utility functions

3.1.1 Objective function

3.1.2 Decoding

3.1.3 Crossover

3.1.4 Mutation

3.2 Main function part

3.3 Code

4. Other


1. About genetic algorithm

A genetic algorithm searches for an optimal solution by imitating biological evolution. Its core idea is natural selection: survival of the fittest.

There are already many explanations of genetic algorithms on the Internet. This chapter implements one in Python to find the maximum value of a function, and discusses the details along the way.

The function used in this chapter is y = x^2 on the domain 0-10.

A genetic algorithm works by recombining genes, so each individual here is encoded as a binary sequence, such as 10101.

Mapping a binary sequence back into the domain is the decoding process: x = start + (end - start) * tmp / (pow(2,length)-1), where start is the left end of the domain, end is the right end of the domain, tmp is the decimal value of the binary sequence, and length is the number of bits in the sequence. With this calculation, every binary sequence maps to a point in the given domain: all zeros maps to start and all ones maps to end.
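
To make the mapping concrete, here is a minimal sketch of the formula applied to a single bit string. The helper name decode_bits is made up for this illustration; the chapter's own decode function appears in 3.1.2.

```python
def decode_bits(bits, start, end):
    """Map a bit string such as '10101' to a point in [start, end]."""
    length = len(bits)
    tmp = int(bits, 2)  # decimal value of the binary sequence
    return start + (end - start) * tmp / (2 ** length - 1)


print(decode_bits("0000000000", 0, 10))  # smallest code, maps to start
print(decode_bits("1111111111", 0, 10))  # largest code, maps to end
print(decode_bits("1000000000", 0, 10))  # 512/1023 of the way through the domain
```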

So how is the length of the binary sequence chosen?

It follows from the required precision. For example, with a precision of 0.01, the domain 0-10 needs 10*10^2 = 1000 steps: 0.01->0.02->...->10.00 (1001 values if 0.00 is counted). Ten bits are enough, because 2^10 = 1024 distinct codes can cover them. So the binary sequence length is 10.
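
The same calculation can be written as a small helper. This is only a sketch under the assumption that both endpoints of the domain should be representable; chromosome_length is a name invented here, not part of the chapter's code.

```python
def chromosome_length(start, end, precision):
    """Smallest number of bits whose codes cover [start, end] at the given precision."""
    n_values = round((end - start) / precision) + 1   # e.g. 0.00, 0.01, ..., 10.00
    # (n - 1).bit_length() is the smallest b with 2**b >= n, using integers only
    return max(1, (n_values - 1).bit_length())


print(chromosome_length(0, 10, 0.01))   # 1001 values fit into 2**10 = 1024 codes
print(chromosome_length(0, 10, 0.001))  # 10001 values need 2**14 = 16384 codes
```

Using integer bit_length instead of a floating-point logarithm avoids rounding surprises when the number of values is an exact power of two.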

2. Steps of Genetic Algorithm

Next, the steps of the genetic algorithm.

This is a step-by-step analysis based on my own implementation. Some details, and differences from implementations found online, are discussed at the end.

1. Initialize the population: Generate several random binary sequences rather than just one. A population is needed for the crossover step below, and having multiple candidates also makes it more likely that the optimal solution is found.

2. Decoding: Decoding maps each generated binary sequence into the domain, because the raw value of a binary sequence can be much larger than the domain. In the example above, the domain is 0-10 and the precision is 0.01, so the binary length is 10 and raw values range up to 1023. Without the mapping, most sequences would fall outside the domain.

When decoding is complete, the initial population has become n random points in the domain, where n is the number of binary sequences generated.

3. Calculation of fitness: Evaluate the fitness function at the decoded points. In the example, y = x^2 is the fitness function.

Calculating the fitness simply means computing the y value for each x. If the maximum is wanted, compare the y values of the randomly generated x values and favor the individuals with larger y. Repeating this over many generations moves the population toward the maximum.

4. Roulette selection of the surviving parents: Once the n random points have been evaluated with the fitness function, there are n function values. The larger the function value, the better the individual (since we want the maximum), so the greater its probability of surviving should be. The corresponding method is roulette selection. For example, if the generated y values are 2 and 9, then 2/(2+9) and 9/(2+9) are the probabilities that 2 and 9 survive.

Note: this is only a probability. 9 > 2 does not guarantee that 9 is the one that survives.

5. Crossover: Crossover exchanges randomly chosen bits between the binary sequences of two parents, so that the offspring inherit from both.

6. Mutation: To help the search escape local extrema, mutation randomly inverts bits in the binary code of the offspring: 0->1, 1->0.

The specific implementation is explained together with the code.
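
The roulette step with the values 2 and 9 can be checked numerically. The snippet below is a sketch for illustration only: it uses NumPy's Generator API with a fixed seed (an assumption made here for repeatability), whereas the chapter's own code calls np.random.choice directly.

```python
import numpy as np

fitness = np.array([2.0, 9.0])
probs = fitness / fitness.sum()      # 2/(2+9) and 9/(2+9)
print(probs)

rng = np.random.default_rng(0)       # fixed seed so the run is repeatable
draws = rng.choice(len(fitness), size=10000, p=probs)
share_of_nine = np.mean(draws == 1)  # fraction of draws in which 9 survived
print(share_of_nine)
```

The individual with value 9 wins most draws, but not all of them, which is exactly the point of the note above.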

3. Code implementation

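The genetic algorithm in this chapter computes the maximum value of y = x^2.

The genetic algorithm is configured by the hyperparameters set at the top of the main script in 3.3: start, end, length, num_group, iteration_time, mutation_rate and parents_rate.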

 

Among them, mutation_rate is the probability that a binary sequence mutates. It should not be too large, or the offspring will be completely different from their parents and the genetic algorithm loses its meaning.

parents_rate is the fraction of the parent generation that is kept. For example, with a population of 10 and parents_rate = 0.3, 3 parents are kept. Which ones are kept is decided by roulette selection.

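The result is a plot of y = x^2 as a red curve, with the final population drawn as scattered points and the maximum found shown in the title.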

 

3.1 Utility functions

To keep the code modular, four functions are stored in utils.

3.1.1 Objective function

This is the fitness function.

# Objective function
def function(x):
    # y = np.sin(x) * np.exp(-x)
    y = x**2
    return y

3.1.2 Decoding

decode takes the binary sequence matrix bit_matrix (n*m, where n is the population size and m is the binary length) and produces n decimal arguments in the domain start-end. It returns both the raw decimal values and the mapped values.

# Convert binary to decimal and map the result into the domain
def decode(bit_matrix,num_group,start,end,length):
    ret = np.zeros(num_group)
    temp = []       # raw decimal values before mapping
    for i in range(num_group):
        tmp = int(''.join(map(str,bit_matrix[i])),2)    # decimal value of each chromosome
        ret[i] = start + (end - start) * tmp / (pow(2,length)-1)        # map back into the original domain
        temp.append(tmp)
    return temp,ret
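
As an aside, the same decoding can be done without the string round trip by weighting each bit with its power of two. This is an alternative sketch, not the chapter's method; decode_vectorized is a hypothetical name and it returns only the mapped values.

```python
import numpy as np


def decode_vectorized(bit_matrix, start, end):
    """Decode every row of a 0/1 matrix into a point in [start, end]."""
    bit_matrix = np.asarray(bit_matrix)
    length = bit_matrix.shape[1]
    weights = 2 ** np.arange(length - 1, -1, -1)   # most significant bit first
    tmp = bit_matrix @ weights                     # decimal value of each row
    return start + (end - start) * tmp / (2 ** length - 1)


population = np.array([[0, 0, 0],
                       [1, 1, 1],
                       [1, 0, 0]])
decoded = decode_vectorized(population, 0, 10)
print(decoded)   # rows decode to 0/7, 7/7 and 4/7 of the way through the domain
```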

3.1.3 Crossover

The crossover implemented here differs from the usual approach.

In the genetic algorithm implemented in this chapter, the population size is fixed. If the population is initialized with 10 individuals and 3 parents are retained, crossover generates only 7 offspring. This number is the count variable in the code.

parents_groups is the entire parent population, not just the 3 retained individuals.

Crossover picks two random parents and swaps the bits at cross_num random positions between them. The function returns the offspring as an array.

# Crossover
def cross(count,parents_groups,length,cross_num=2):
    childen = []                        # offspring

    while len(childen) != count:       # keep going until count offspring exist
        index = np.random.choice(np.arange(length),cross_num,replace=False)   # cross_num random gene positions to swap
        male = parents_groups[np.random.randint(0,len(parents_groups))]       # pick two random parents (upper bound is exclusive)
        female = parents_groups[np.random.randint(0,len(parents_groups))]

        childen_one = male.copy()
        childen_two = female.copy()

        childen_one[index] = female[index]          # swap the parents' genes to produce two offspring
        childen.append(childen_one)

        if len(childen) == count:
            break

        childen_two[index] = male[index]
        childen.append(childen_two)
    return np.array(childen)
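
To see what one crossover does, here is a small sketch with two hand-picked parents and fixed swap positions. In cross the positions and parents are random; the fixed values and the names child_one and child_two are only for illustration.

```python
import numpy as np

male = np.array([1, 1, 1, 1, 1, 1])
female = np.array([0, 0, 0, 0, 0, 0])
index = np.array([1, 4])             # positions to swap (random in cross, fixed here)

child_one = male.copy()
child_two = female.copy()
child_one[index] = female[index]     # child one: male with two genes from female
child_two[index] = male[index]       # child two: female with two genes from male

print(child_one)   # [1 0 1 1 0 1]
print(child_two)   # [0 1 0 0 1 0]
```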

3.1.4 Mutation

Mutation introduces random changes into the population, so that the new offspring have a chance of escaping local extrema.

The implementation is simple: each individual mutates with probability mutation_rate, and num_mutation controls how many of its bits are flipped.

# Mutation
def mutation(children,mutation_rate,length,num_mutation=1):
    children_mutation = []
    for i in range(len(children)):
        tmp = children[i].copy()        # copy so the input array is not modified in place
        if np.random.random() < mutation_rate:
            index = np.random.choice(np.arange(length),num_mutation,replace=False)

            for j in range(num_mutation):       # flip the selected bits
                if tmp[index[j]] == 1:
                    tmp[index[j]] = 0
                else:
                    tmp[index[j]] = 1
        children_mutation.append(tmp)

    return np.array(children_mutation)
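
The bit flip itself can also be written as 1 - bit. The sketch below uses a hypothetical helper, flip_bits, with fixed positions to show the effect on one chromosome; it works on a copy so that the parent is left untouched.

```python
import numpy as np


def flip_bits(chromosome, index):
    """Return a copy of chromosome with the bits at index inverted."""
    mutated = np.array(chromosome).copy()
    mutated[index] = 1 - mutated[index]   # 0 -> 1, 1 -> 0
    return mutated


parent = np.array([1, 0, 1, 1])
child = flip_bits(parent, [0, 3])
print(parent)  # [1 0 1 1], unchanged
print(child)   # [0 0 1 0]
```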

3.2 Main function part

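One point needs attention. After calculating the fitness, shift it with f = (f - np.min(f)) + 1e-8 before the roulette selection. Otherwise np.random.choice reports an error whenever a fitness value is negative, because probabilities cannot be negative. The small constant 1e-8 also keeps the sum from being zero when all fitness values are equal.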
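
The effect of the shift can be seen with a few made-up fitness values. This is a sketch with example numbers chosen here: y = x^2 never goes negative on 0-10, but the commented-out objective np.sin(x) * np.exp(-x) does.

```python
import numpy as np

f = np.array([-0.2, 0.1, 0.3])        # example fitness values, one of them negative

try:
    np.random.choice(len(f), size=2, p=f / f.sum())
    raised = False
except ValueError:                     # NumPy rejects negative probabilities
    raised = True
print(raised)

shifted = (f - np.min(f)) + 1e-8       # same shift as in the main loop
p = shifted / shifted.sum()            # now a valid probability vector
print(p)
```

Note that the shift gives the worst individual a probability close to zero, so it is almost never selected.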

 

3.3 Code

Main function part:

import numpy as np
import matplotlib.pyplot as plt
from utils import decode,function,cross,mutation


# Hyperparameters
start,end = 0,10
length = 10                     # chromosome length in bits (sets the precision)
num_group = 10                  # population size
iteration_time = 2000           # number of iterations
mutation_rate = 0.1             # mutation rate
parents_rate = 0.3              # fraction of the parent generation that is kept


# Initialize the binary population
init_group = np.random.randint(0,2,size=(num_group,length))

parents_group = init_group      # parent generation

# Iterate
decode_parents_group = 0
for i in range(iteration_time):

    # Convert the binary population to decimal and map it into the domain
    _, decode_parents_group = decode(bit_matrix=parents_group, num_group=num_group, start=start, end=end, length=length)

    # Calculate the fitness of the population
    f = function(decode_parents_group)
    f = (f - np.min(f))+1e-8     # keep f from being negative or 0

    # Roulette selection: indices are drawn with probability proportional to fitness
    select = np.random.choice(np.arange(num_group),int(num_group*parents_rate),replace=True,p=f/sum(f))
    best_parents_group = parents_group[select]       # parents that are kept

    count = len(parents_group) - len(best_parents_group)     # number of offspring still needed

    # Crossover
    children = cross(count=count, parents_groups=parents_group, length=length)
    children = np.concatenate((best_parents_group, children))
    # Mutation
    children = mutation(children=children,mutation_rate=mutation_rate,length=length)

    parents_group = children

# Decode the final generation so the plot shows the latest population
_, decode_parents_group = decode(bit_matrix=parents_group, num_group=num_group, start=start, end=end, length=length)

fun = function(decode_parents_group)
x = np.linspace(start,end,100)
plt.plot(x,function(x),color='r')
plt.scatter(decode_parents_group,function(decode_parents_group))
plt.title('max is :%.4f' % np.max(fun))
plt.show()

utils section:

import numpy as np


# Objective function
def function(x):
    # y = np.sin(x) * np.exp(-x)
    y = x**2
    return y


# Convert binary to decimal and map the result into the domain
def decode(bit_matrix,num_group,start,end,length):
    ret = np.zeros(num_group)
    temp = []       # raw decimal values before mapping
    for i in range(num_group):
        tmp = int(''.join(map(str,bit_matrix[i])),2)    # decimal value of each chromosome
        ret[i] = start + (end - start) * tmp / (pow(2,length)-1)        # map back into the original domain
        temp.append(tmp)
    return temp,ret


# Crossover
def cross(count,parents_groups,length,cross_num=2):
    childen = []                        # offspring

    while len(childen) != count:       # keep going until count offspring exist
        index = np.random.choice(np.arange(length),cross_num,replace=False)   # cross_num random gene positions to swap
        male = parents_groups[np.random.randint(0,len(parents_groups))]       # pick two random parents (upper bound is exclusive)
        female = parents_groups[np.random.randint(0,len(parents_groups))]

        childen_one = male.copy()
        childen_two = female.copy()

        childen_one[index] = female[index]          # swap the parents' genes to produce two offspring
        childen.append(childen_one)

        if len(childen) == count:
            break

        childen_two[index] = male[index]
        childen.append(childen_two)
    return np.array(childen)


# Mutation
def mutation(children,mutation_rate,length,num_mutation=1):
    children_mutation = []
    for i in range(len(children)):
        tmp = children[i].copy()        # copy so the input array is not modified in place
        if np.random.random() < mutation_rate:
            index = np.random.choice(np.arange(length),num_mutation,replace=False)

            for j in range(num_mutation):       # flip the selected bits
                if tmp[index[j]] == 1:
                    tmp[index[j]] = 0
                else:
                    tmp[index[j]] = 1
        children_mutation.append(tmp)

    return np.array(children_mutation)

4. Other

Several things here are inconsistent with implementations found on the Internet, and there are some points that I don't fully understand.

For example, when retaining parents, may the same parent be selected more than once?

The method in this chapter allows it (replace can be changed to False to disallow repeats). Personally I think that if parents are kept without repetition, the kept parents are essentially the ones with the highest probabilities, in descending order. So when the initial population is poor, it is easy to get trapped at a local extremum.

    select = np.random.choice(np.arange(num_group),int(num_group*parents_rate),replace=True,p=f/sum(f))
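
The two settings can be compared side by side. The sketch below uses made-up fitness values and NumPy's Generator API with a fixed seed (both assumptions for illustration); only the replace argument differs between the two calls.

```python
import numpy as np

rng = np.random.default_rng(0)       # fixed seed so the run is repeatable
num_group, parents_rate = 10, 0.3
f = np.arange(1, num_group + 1, dtype=float) ** 2   # hypothetical fitness values
p = f / f.sum()
k = int(num_group * parents_rate)    # number of parents to keep

with_repeats = rng.choice(num_group, size=k, replace=True, p=p)
without_repeats = rng.choice(num_group, size=k, replace=False, p=p)

print(with_repeats)      # the same index may appear more than once
print(without_repeats)   # always k distinct indices
```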

As another example, if the mutation rate or the number of mutated bits is too large, the information passed from the parents to the offspring is destroyed, and inheritance loses its meaning.

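The same code can be run on y = np.sin(x) * np.exp(-x) by switching to the commented-out line in function. For reference, the analytic maximum of that function on 0-10 is about 0.3224, at x = pi/4, so the result can be checked against it.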

 

Origin blog.csdn.net/qq_44886601/article/details/130342562