Genetic Algorithm Model--Mathematical Modeling

A genetic algorithm is an optimization algorithm that imitates natural selection and genetic mechanisms. It simulates inheritance, crossover, and mutation as they occur in biological evolution, and searches for the global optimum by repeatedly evolving a population of good individuals.

Before we start, let's understand a few concepts in genetic algorithms.

Concept 1: Genes and Chromosomes

In a genetic algorithm, we first map the problem to be solved onto a mathematical problem; this step is called "mathematical modeling". A feasible solution to that problem is then called a "chromosome". A feasible solution generally consists of multiple elements, and each element is called a "gene" on the chromosome.

For example, [1,2,3], [1,3,2], and [3,2,1] are all feasible solutions of the following inequality (substituting each one for x, y, z satisfies it), so these feasible solutions are called chromosomes in the genetic algorithm.

3x+4y+5z<100

These feasible solutions are composed of three elements, so in the genetic algorithm, each element is called a gene that makes up the chromosome.
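As a small illustration, the sketch below checks a few candidate chromosomes against the inequality above. The helper name is_feasible and the extra candidate [10, 10, 10] are hypothetical and not part of the original article.

```python
def is_feasible(chromosome):
    """Return True if the genes (x, y, z) satisfy 3x + 4y + 5z < 100."""
    x, y, z = chromosome  # each element of the chromosome is one gene
    return 3 * x + 4 * y + 5 * z < 100


# The first three candidates come from the text; the last one is made up
# to show a chromosome that violates the constraint.
candidates = [[1, 2, 3], [1, 3, 2], [3, 2, 1], [10, 10, 10]]
feasible = [c for c in candidates if is_feasible(c)]
print(feasible)
```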

Concept 2: Fitness Function

In nature, there seems to be a God who selects the better individuals in each generation and eliminates those poorly adapted to the environment. So how does a genetic algorithm measure how good a chromosome is? That is the job of the fitness function, which plays the role of this "God" in the genetic algorithm.

The genetic algorithm runs for N iterations, and each iteration generates several chromosomes. The fitness function scores all the chromosomes generated in the current iteration; the chromosomes with low fitness are eliminated and only those with high fitness are kept, so the quality of the chromosomes tends to improve over the iterations.

Concept 3: Crossover

Each iteration of the genetic algorithm generates N chromosomes, and each iteration is called an "evolution". So where do the new chromosomes in each evolution come from? The answer is "crossover", which you can think of as mating.

Crossover takes two chromosomes from the previous generation, one as the father and the other as the mother. Both chromosomes are cut at a certain position and the pieces are spliced together to create a new chromosome, which contains some of the father's genes and some of the mother's genes.
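The cut-and-splice step can be sketched as single-point crossover. This is only one possible reading of the description; the function name single_point_crossover and the list-based chromosome representation are assumptions made for illustration.

```python
import random


def single_point_crossover(father, mother, rng=random):
    """Cut both parents at the same random position and splice the pieces."""
    # Choose a cut point strictly inside the chromosome so that
    # both parents contribute at least one gene.
    cut = rng.randint(1, len(father) - 1)
    return father[:cut] + mother[cut:]


father = [0, 0, 0, 0]
mother = [1, 1, 1, 1]
child = single_point_crossover(father, mother)
print(child)  # father's genes on the left, mother's genes on the right
```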

So how are the father and mother chosen from the previous generation of chromosomes? They are not chosen uniformly at random; this is generally done with the roulette wheel algorithm.

After each evolution, the fitness of every chromosome is calculated, and the selection probability of each chromosome is then computed with the following formula. During crossover, the parent chromosomes are selected according to this probability, so chromosomes with higher fitness are more likely to be selected. This is why the genetic algorithm can retain good genes.

Probability of chromosome i being selected = fitness of chromosome i / sum of fitness of all chromosomes
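The formula and the roulette wheel idea can be sketched as follows. The function names and the sample population are hypothetical, and the sketch assumes that all fitness values are non-negative with a positive sum.

```python
import random


def selection_probabilities(fitnesses):
    """Probability of chromosome i = fitness of i / sum of all fitness values."""
    total = sum(fitnesses)
    return [f / total for f in fitnesses]


def roulette_select(population, fitnesses, rng=random):
    """Pick one chromosome with probability proportional to its fitness."""
    probabilities = selection_probabilities(fitnesses)
    spin = rng.random()  # where the pointer stops on the wheel, in [0, 1)
    cumulative = 0.0
    for chromosome, p in zip(population, probabilities):
        cumulative += p
        if spin < cumulative:
            return chromosome
    return population[-1]  # guard against floating-point round-off


population = ["A", "B", "C", "D"]   # stand-ins for real chromosomes
fitnesses = [1.0, 2.0, 3.0, 4.0]
rng = random.Random(42)             # fixed seed so the run is repeatable
picks = [roulette_select(population, fitnesses, rng) for _ in range(10000)]
print({name: picks.count(name) for name in population})
```

Chromosomes with higher fitness occupy a larger slice of the wheel, so they are picked more often, but low-fitness chromosomes still have a chance.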

Concept 4: Mutation

Crossover ensures that good genes are carried over in each evolution, but it only recombines genes that already exist in the population: the gene values stay the same and only their combinations change. After N evolutions this can only bring the result closer to a local optimum; it may never reach the global optimum. To solve this problem, we need to introduce mutation.

Mutation is easy to understand. After a new chromosome is generated through crossover, we randomly select several genes on it and randomly modify their values. This introduces new genes into the existing chromosomes, breaks through the current search limit, and makes it more likely that the algorithm finds the global optimum.
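A minimal sketch of this step, assuming a chromosome is a list and that the allowed gene values are known in advance. The names mutate, gene_values, and mutation_count are hypothetical.

```python
import random


def mutate(chromosome, gene_values, mutation_count=1, rng=random):
    """Return a copy with `mutation_count` randomly chosen genes replaced."""
    mutated = list(chromosome)  # leave the original chromosome untouched
    # Pick distinct positions, then overwrite each with a random allowed value.
    for position in rng.sample(range(len(mutated)), mutation_count):
        mutated[position] = rng.choice(gene_values)
    return mutated


original = [0, 1, 0, 1]
mutant = mutate(original, gene_values=[0, 1, 2], mutation_count=2)
print(original, mutant)
```

Note that a randomly drawn replacement can equal the old value, so fewer than mutation_count genes may actually change.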

Concept 5: Replication

In each evolution, in order to retain the excellent chromosomes of the previous generation, the chromosomes with the highest fitness in the previous generation need to be directly copied to the next generation intact.

Assume each evolution needs to generate N chromosomes. Then N-M chromosomes are generated through crossover, and the remaining M chromosomes are obtained by copying the M chromosomes with the highest fitness in the previous generation.
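The split between copied and crossed-over chromosomes can be sketched like this. The helper name copy_elites and the sample values are assumptions for illustration only.

```python
def copy_elites(population, fitnesses, m):
    """Return copies of the m chromosomes with the highest fitness."""
    ranked = sorted(zip(fitnesses, population), key=lambda pair: pair[0], reverse=True)
    return [list(chromosome) for _, chromosome in ranked[:m]]


population = [[0, 1], [1, 1], [0, 0], [1, 0]]   # N = 4 chromosomes
fitnesses = [0.6, 2.0, 3.2, 1.8]
elites = copy_elites(population, fitnesses, m=2)  # M = 2 copied unchanged
crossover_count = len(population) - len(elites)   # N - M left for crossover
print(elites, crossover_count)
```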

The basic process of a genetic algorithm is as follows:

  1. Initialize the population: randomly generate a group of individuals as the population.
  2. Evaluate fitness: evaluate the fitness of each individual, usually by using the objective function.
  3. Selection: according to the fitness of each individual, select some individuals as parents to generate the next generation.
  4. Crossover: perform crossover on the parent individuals to generate new individuals.
  5. Mutation: perform mutation on the new individuals to introduce more diversity.
  6. Evaluate new individuals: evaluate the fitness of the new individuals.
  7. Check the termination condition: if it is satisfied, output the optimal solution; otherwise, return to step 3.

How many evolutions are needed?

Because the best chromosomes are carried over, each evolution tends to be at least as good as the last, so in theory more evolutions give a better result. In practice, however, you have to balance the accuracy of the result against execution efficiency. There are generally two ways to do this.

1. Limit the number of evolutions

In some practical applications, the number of evolutions can be determined in advance. For example, if extensive experiments show that the algorithm reaches the optimal solution after N evolutions no matter how the input data changes, you can set the number of evolutions to N.

However, reality is often less ideal: different inputs can lead to very different numbers of iterations before the optimal solution is reached. In that case, consider the second way.

2. Limit the allowed range

Reaching the global optimum may take a very large number of evolutions, which hurts system performance. Instead, we can balance the accuracy of the algorithm against the efficiency of the system by setting an acceptable error range in advance. After each evolution, as soon as the current result falls within that range, the algorithm terminates.

This method also has a disadvantage. In some cases the result enters the tolerance range after a few evolutions, while in others it takes a very large number of evolutions. This uncertainty makes the running time of the algorithm unpredictable.

Which method to use therefore depends on the specific business scenario. There is no universal answer; you need to find it in practice.
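The two stopping rules can also be combined, so that the tolerance check ends the run early while the cap on evolutions bounds the worst case. The sketch below is an assumption about how that might look; step stands in for one evolution and simply halves the error in the usage example.

```python
def run_until_done(step, initial_error, tolerance, max_generations):
    """Iterate until the error is within tolerance or the generation cap is hit."""
    error = initial_error
    for generation in range(1, max_generations + 1):
        error = step(error)  # stand-in for one evolution of the population
        if error <= tolerance:
            return generation, error  # stopped by the allowed-range rule
    return max_generations, error     # stopped by the evolution-count rule


def halve(error):
    """Toy evolution that halves the error each time."""
    return error / 2


early = run_until_done(halve, initial_error=8.0, tolerance=1.0, max_generations=100)
capped = run_until_done(halve, initial_error=8.0, tolerance=0.001, max_generations=5)
print(early, capped)
```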

Using a Genetic Algorithm to Solve the Load Balancing Scheduling Problem

Algorithms exist to solve practical problems. By now you should have a good overall understanding of genetic algorithms, so next we will use one to solve a practical problem: load balancing scheduling.

Suppose there are N tasks, and a load balancer needs to assign them to M server nodes for processing. The length of each task and the processing speed of each server node (hereinafter referred to as "node") are known. Find a task allocation that minimizes the total processing time of all tasks.

Mathematical modeling

Given this problem, we first need to map it onto a mathematical model that a genetic algorithm can work with.

Task length matrix (task matrix for short)

We store the lengths of all tasks in the matrix tasks, for example:

Tasks={2,4,6,8}

Here, the index i in tasks[i] is the number of the task, and tasks[i] is the length of task i.

Node processing speed matrix (node matrix for short)

We store the processing speeds of all server nodes in the matrix nodes, for example:

Nodes={2,1}

Here, the index j in nodes[j] is the number of the node, and nodes[j] is the processing speed of node j.

Task processing time matrix

Once the task matrix Tasks and the node matrix Nodes are fixed, the processing time of every task on every node is also determined. We represent it with the matrix timeMatrix, which is a two-dimensional array:

1 2
2 4
3 6
4 8

timeMatrix[i][j] represents the time required to assign task i to node j for processing, which is calculated by the following formula:

timeMatrix[i][j] = tasks[i]/nodes[j]
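The formula can be checked against the example values with a short sketch. The snake_case name time_matrix is a Python-style stand-in for the article's timeMatrix.

```python
tasks = [2, 4, 6, 8]   # task lengths from the example
nodes = [2, 1]         # node processing speeds from the example

# time_matrix[i][j] = tasks[i] / nodes[j]
time_matrix = [[task / speed for speed in nodes] for task in tasks]

for row in time_matrix:
    print(row)
```

Each row corresponds to one task and each column to one node, which matches the four rows and two columns shown above.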

Chromosome

We know from the above that each evolution will produce N chromosomes, and each chromosome is a feasible solution to the current problem. The feasible solution is composed of multiple elements, and each element is called a gene of the chromosome. Next, we use a chromosome matrix to record the feasible solutions in each evolution process of the algorithm.

A chromosome looks like this:

chromosome={1,0,1,0}

A chromosome is a one-dimensional array: the subscript is the number of the task, and the value is the number of the node. So chromosome[i]=j means: assign task i to node j.

In the above example, the task set is Tasks={2,4,6,8} and the node set is Nodes={2,1}, so only nodes 0 and 1 exist, and chromosome={1,0,1,0} means:

  • Assign task 0 to node 1
  • Assign task 1 to node 0
  • Assign task 2 to node 1
  • Assign task 3 to node 0

Fitness matrix

It can be seen from the above that the fitness function plays the role of "God" in the genetic algorithm, which will judge the fitness of each chromosome, keep the chromosomes with high fitness, and eliminate the chromosomes with poor fitness. Then when implementing the algorithm, we need a fitness matrix to record the fitness of the current N chromosomes, as follows:

adaptability={0.6, 2, 3.2, 1.8}

The subscript of the adaptability array indicates the number of the chromosome, and adaptability[i] indicates the fitness of the chromosome numbered i.

In the load balancing example, we use the total execution time of the N tasks as the criterion for fitness. Once all tasks are assigned, the longer the total duration, the lower the fitness; the shorter the total duration, the higher the fitness.
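One way to turn this criterion into a number is sketched below. The article does not say how the duration is converted into a fitness score, so taking the reciprocal of the total time is an assumption, as are the function names total_time and fitness.

```python
tasks = [2, 4, 6, 8]
nodes = [2, 1]
time_matrix = [[task / speed for speed in nodes] for task in tasks]


def total_time(chromosome):
    """Sum the processing time of every task on its assigned node."""
    return sum(time_matrix[task][node] for task, node in enumerate(chromosome))


def fitness(chromosome):
    """Assumed scoring: shorter total time gives higher fitness."""
    return 1.0 / total_time(chromosome)


all_on_fast_node = [0, 0, 0, 0]   # every task on node 0 (speed 2)
mixed = [1, 0, 1, 0]              # tasks alternate between the two nodes
print(total_time(all_on_fast_node), total_time(mixed))
print(fitness(all_on_fast_node) > fitness(mixed))
```

If nodes work in parallel, the longest per-node time may be a more realistic measure than the sum; the sum is used here because it follows the wording above.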

Selection probability matrix

As described above, in each evolution the probability of each chromosome being selected for the next evolution is calculated from the fitness matrix. This matrix looks like the following (the values are illustrative and are not derived from the adaptability example above):

selectionProbability={0.1, 0.4, 0.2, 0.3}

The subscript of the matrix indicates the number of the chromosome, and the value in the matrix indicates the selection probability corresponding to the chromosome. Its calculation formula is as follows:

selectionProbability[i] = adaptability[i] / sum of fitness

Implementation of Genetic Algorithm

With all of the concepts above in place, we can now look at the code. As the saying goes: talk is cheap, show me the code!
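/**
 * Genetic algorithm
 * @param iteratorNum number of iterations
 * @param chromosomeNum number of chromosomes
 */
function gaSearch(iteratorNum, chromosomeNum) {
    // Initialize the first generation of chromosomes
    var chromosomeMatrix = createGeneration();

    // Iterate: breed one new generation per pass
    for (var itIndex=1; itIndex<iteratorNum; itIndex++) {
        // Compute the fitness of each chromosome in the previous generation
        calAdaptability(chromosomeMatrix);

        // Compute the natural-selection probabilities
        // (adaptability is not defined in this excerpt; presumably it is
        // shared state filled in by calAdaptability)
        calSelectionProbability(adaptability);

        // Generate the new generation of chromosomes
        chromosomeMatrix = createGeneration(chromosomeMatrix);

    }
}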

With the code in front of you, everything becomes clear. The above is the main framework of the genetic algorithm; the details are encapsulated in the sub-functions. Once you understand the principle of the genetic algorithm, the code needs little further explanation. The complete code is on my Github, welcome to Star.

The following sample code uses Python to implement a genetic algorithm that finds the minimum of a function of one variable:

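import random

# Objective function: f(x) = x^2
def objective_function(x):
    return x ** 2

# Generate a random individual
def generate_individual():
    return random.uniform(-10, 10)

# Compute an individual's fitness (a smaller f(x) gives a higher fitness)
def calculate_fitness(individual):
    return 1 / (1 + objective_function(individual))

# Selection: roulette wheel sampling weighted by fitness
def selection(population):
    fitnesses = [calculate_fitness(individual) for individual in population]
    total_fitness = sum(fitnesses)
    probabilities = [fitness / total_fitness for fitness in fitnesses]
    selected = random.choices(population, weights=probabilities, k=len(population))
    return selected

# Crossover: blend the two parents with a random weight
def crossover(individual1, individual2):
    alpha = random.uniform(0, 1)
    new_individual1 = alpha * individual1 + (1 - alpha) * individual2
    new_individual2 = alpha * individual2 + (1 - alpha) * individual1
    return new_individual1, new_individual2

# Mutation: add a small random perturbation
def mutation(individual):
    new_individual = individual + random.uniform(-1, 1)
    return new_individual

# Solve the minimization problem with the genetic algorithm
population_size = 100
population = [generate_individual() for i in range(population_size)]
num_generations = 1000
for generation in range(num_generations):
    # Selection
    selected_population = selection(population)

    # Crossover: each pair yields two offspring, so step by 2
    # to keep the population size constant
    offspring_population = []
    for i in range(0, population_size, 2):
        offspring1, offspring2 = crossover(selected_population[i], selected_population[(i+1) % population_size])
        offspring_population.append(offspring1)
        offspring_population.append(offspring2)

    # Mutation: each offspring mutates with probability 0.1
    for i in range(len(offspring_population)):
        if random.uniform(0, 1) < 0.1:
            offspring_population[i] = mutation(offspring_population[i])

    # The source is cut off after this point. The remaining lines are an
    # assumed completion: the offspring replace the previous generation.
    population = offspring_population

# Report the best individual found (assumed completion)
best = min(population, key=objective_function)
print("x =", best, "f(x) =", objective_function(best))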

Origin blog.csdn.net/qq_51533426/article/details/130517369