Solving the city TSP with a genetic algorithm

1. Introduction to Genetic Algorithm and TSP

1. A genetic algorithm (GA) is a computational model that simulates natural selection in Darwinian evolution and the mechanisms of genetics; it searches for an optimal solution by imitating the process of natural evolution. The algorithm starts from a population that represents a set of candidate solutions to the problem. A population consists of a number of individuals, each encoded as genes. Each individual is in effect an entity carrying a chromosome. The chromosome is the main carrier of genetic material, that is, a collection of genes; its internal representation (the genotype) is a particular combination of genes, and this determines the individual's external traits (the phenotype). The first step is therefore encoding: mapping from phenotype to genotype. Once the first generation has been created, the population evolves generation by generation according to the principle of survival of the fittest, producing better and better approximate solutions. In each generation, individuals are selected according to their fitness in the problem domain, and the genetic operators of crossover and mutation, borrowed from natural genetics, are applied to produce a population representing a new set of solutions. As a result, later generations tend to be better adapted to the environment than earlier ones. The best individual in the final population is decoded and taken as an approximate optimal solution to the problem.
The basic process is shown in the figure below:
[Figure: basic process of the genetic algorithm]
2. The traveling salesman problem (TSP) is one of the best-known problems in mathematics. A traveling salesman has to visit n cities and must choose a route. The constraint is that each city is visited exactly once and the route ends in the city where it started. The goal is to choose the route whose total distance is the smallest among all such routes.
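To make the objective concrete, here is a minimal sketch that computes the length of a closed tour. The four city coordinates and the helper name tourLength are illustrative assumptions, not part of the original program.

```matlab
% Four cities at the corners of a unit square (illustrative data).
pos = [0 0; 1 0; 1 1; 0 1];
N = size(pos, 1);

% Pairwise Euclidean distance matrix.
D = zeros(N, N);
for i = 1:N
    for j = 1:N
        D(i, j) = norm(pos(i, :) - pos(j, :));
    end
end

% Length of a closed tour p: every leg p(k) -> p(k+1), plus the leg
% from the last city back to the first.
tourLength = @(p) sum(D(sub2ind([N N], p, p([2:end 1]))));

perimeterTour = tourLength([1 2 3 4]);   % walks around the square
crossingTour  = tourLength([1 3 2 4]);   % uses both diagonals
fprintf('perimeter: %.4f, crossing: %.4f\n', perimeterTour, crossingTour);
```

Both orders visit every city once and return to the start, but the tour around the square is shorter than the one that uses the diagonals. The TSP asks for the shortest such order.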

2. The basic principle of genetic algorithm

A genetic algorithm is a randomized search method modeled on the evolutionary law of the biological world (survival of the fittest). Its main features are that it operates directly on structured objects and places no requirements on the differentiability or continuity of the function. It has inherent implicit parallelism and good global search ability. It uses a probabilistic search that automatically acquires knowledge about the search space and adaptively adjusts the search direction, without needing predetermined rules.

3. The main steps of genetic algorithm

1. Initial population: generate several solutions at random within the feasible domain; these form the initial population.
2. Fitness evaluation: evaluate the fitness of each solution in the population, that is, how well each individual is adapted to the environment.
3. Selection (survival of the fittest): use random numbers or similar methods to eliminate poorly adapted individuals and keep well adapted ones.
4. Perturbation: perturb the population through crossover, mutation, and similar operations. In essence this is a search within the feasible domain for the solution best adapted to the environment.
Here the environment is the objective function, the function whose optimum is sought, and each individual is a feasible solution. Fitness evaluation means substituting a feasible solution into the function and comparing the resulting values; the feasible solution that gives the best value is the optimal solution. Survival of the fittest means keeping good solutions and discarding poor ones, while crossover and mutation perturb the solutions in order to improve them further.
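Step 3 can be illustrated with roulette-wheel selection, the scheme the main program in this article uses (cumsum followed by find). The sketch below uses made-up fitness values and fixed draws in place of rand, so that the result is reproducible; it is an illustration under those assumptions, not the full selection loop.

```matlab
% Illustrative fitness values for three individuals (made-up numbers).
fitness = [1; 2; 7];
p = fitness ./ sum(fitness);   % selection probabilities: 0.1, 0.2, 0.7
q = cumsum(p);                 % cumulative probabilities

% Fixed draws stand in for rand so the outcome is reproducible here.
draws = [0.05; 0.25; 0.95];
selected = zeros(size(draws));
for k = 1:numel(draws)
    % Pick the first individual whose cumulative probability covers the draw.
    selected(k) = find(draws(k) <= q, 1);
end
disp(selected');
```

An individual with a larger fitness owns a wider slice of the interval from 0 to 1, so a uniform random draw lands on it more often. With real random draws, the third individual here would be picked about 70% of the time.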

4. The characteristics of genetic algorithms

1. A genetic algorithm operates only on the encoded genes of individuals, so its stability is not greatly affected by how complicated the underlying problem is.
2. The search is inherently parallel, because a whole population of candidate solutions is processed in each generation, so it can explore the solution space well.
3. It is stable and robust, and is suitable for nonlinear, high-dimensional, complex optimization problems.

5. Algorithm implementation code modules

1. Route plotting function plot_route.m

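function plot_route(a,R)
% a: N-by-2 matrix of city coordinates; R: tour as a permutation of 1..N
scatter(a(:,1),a(:,2),'rx');   % mark every city with a red cross
hold on;                       % keep everything on the same axes
% closing edge: connect the first city of the tour to the last one
plot([a(R(1),1),a(R(length(R)),1)],[a(R(1),2),a(R(length(R)),2)]);
% edges between consecutive cities of the tour
for i=2:length(R)
    x0=a(R(i-1),1);
    y0=a(R(i-1),2);
    x1=a(R(i),1);
    y1=a(R(i),2);
    xx=[x0,x1];
    yy=[y0,y1];
    plot(xx,yy);
end

end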

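2. Chromosome distance cost function myLength.m

function len=myLength(D,p)
% D: N-by-N distance matrix; p: a tour, given as a 1-by-N permutation of 1..N
N=size(D,1);
len=D(p(1,N),p(1,1));            % closing edge: last city back to the first
for i=1:(N-1)
    len=len+D(p(1,i),p(1,i+1));  % edge between consecutive cities
end
end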

3. Fitness function fit.m

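function fitness=fit(len,m,maxlen,minlen)
% len: column vector of tour lengths; m: exponent that sharpens selection
% The shortest tour gets fitness 1, the longest a value close to 0.
% The 0.0001 term avoids division by zero when all tours are equally long.
fitness=len;
for i=1:length(len)
    fitness(i,1)=(1-(len(i,1)-minlen)/(maxlen-minlen+0.0001)).^m;
end
end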
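As a worked example of the formula in fit.m, the sketch below applies the same expression, in vectorized form, to three made-up tour lengths. The numbers are illustrative only.

```matlab
% Three illustrative tour lengths (made-up numbers).
len = [10; 15; 20];
m = 2;                     % same exponent as in the main program
minlen = min(len);
maxlen = max(len);

% Same expression as in fit.m, applied to the whole vector at once.
fitness = (1 - (len - minlen) ./ (maxlen - minlen + 0.0001)) .^ m;
fprintf('%.6f\n', fitness);
```

The shortest tour gets fitness exactly 1, the middle one roughly 0.25, and the longest one a value very close to 0. Because of the exponent m = 2, a tour halfway between the best and the worst keeps only about a quarter of the best tour's selection weight, which is why a larger m speeds up elimination.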

4. Crossover function cross.m

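function [A,B]=cross(A,B)
% A, B: two parent tours (1-by-L permutations); returns the two children.
% Note: this file shadows MATLAB's built-in cross (vector cross product).
L=length(A);
% W is the number of consecutive positions to cross over
if L<10
    W=L;
elseif ((L/10)-floor(L/10))>=rand&&L>10
    W=ceil(L/10)+8;
else
    W=floor(L/10)+8;
end
% random start of the crossover segment
% (randi replaces unidrnd, which needs the Statistics Toolbox)
p=randi(L-W+1);
%fprintf('p=%d ',p);% crossover position
for i=1:W
    x=find(A==B(1,p+i-1));   % where A already holds the gene coming from B
    y=find(B==A(1,p+i-1));   % where B already holds the gene coming from A
    % swap the genes at the current position
    [A(1,p+i-1),B(1,p+i-1)]=exchange(A(1,p+i-1),B(1,p+i-1));
    % swap the displaced duplicates so both tours stay valid permutations
    [A(1,x),B(1,y)]=exchange(A(1,x),B(1,y));
end

end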
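The loop body of the crossover function is easiest to follow on a single position. The sketch below traces one swap-and-repair step on two made-up five-city tours; the position k is an illustrative choice, and plain assignments stand in for the exchange helper.

```matlab
% Two illustrative parent tours over five cities.
A = [1 2 3 4 5];
B = [3 5 1 2 4];
k = 2;                    % position to cross over (illustrative choice)

x = find(A == B(k));      % where A already holds the gene coming from B
y = find(B == A(k));      % where B already holds the gene coming from A

% Swap the genes at position k. Each tour now has one duplicate.
tmp = A(k); A(k) = B(k); B(k) = tmp;

% Swap the duplicates back across, so both rows are permutations again.
tmp = A(x); A(x) = B(y); B(y) = tmp;

disp(A);
disp(B);
```

After the first swap, A contains city 5 twice and B contains city 2 twice. The second swap moves the surplus copies across, giving A = [1 5 3 4 2] and B = [3 2 1 5 4], both of which visit every city exactly once.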

5. Swap function exchange.m

function [x,y]=exchange(x,y)
temp=x;
x=y;
y=temp;
 
end

6. Mutation function Mutation.m

function a=Mutation(A)
% Swap mutation: exchange the cities at two randomly chosen positions.
nnper=randperm(size(A,2));
index1=nnper(1);
index2=nnper(2);
%fprintf('index1=%d ',index1);
%fprintf('index2=%d ',index2);
temp=A(index1);
A(index1)=A(index2);
A(index2)=temp;
a=A;

end

7. Main function

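%main
clear;
clc;
%%%%%%%%%%%%%%% Input parameters %%%%%%%%
N=25;               %% number of cities
M=100;               %% population size
ITER=2000;               %% number of iterations
%C_old=C;
m=2;                %% elimination acceleration exponent for the normalized fitness
Pc=0.8;             %% crossover probability
Pmutation=0.05;       %% mutation probability
%% generate the city coordinates

pos=randn(N,2);  
% randn is a MATLAB function that returns random numbers drawn from the standard normal distribution.
% randn(N,2) returns an N-by-2 matrix of random entries. If N is not a scalar, an error is returned.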

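%% build the matrix of distances between cities
D=zeros(N,N);
% zeros(N,N) creates an N-by-N matrix of zeros of class double
for i=1:N
    for j=i+1:N
        dis=(pos(i,1)-pos(j,1)).^2+(pos(i,2)-pos(j,2)).^2;
        D(i,j)=dis^(0.5);
        D(j,i)=D(i,j);
    end
end

%% generate the initial population

popm=zeros(M,N);
% zeros(M,N) creates an M-by-N matrix of zeros of class double
for i=1:M
    popm(i,:)=randperm(N);% random permutation, e.g. [2 4 5 6 1 3]
end
%% take the first (random) individual as a sample
R=popm(1,:);
figure(1);
scatter(pos(:,1),pos(:,2),'rx');% plot all city coordinates
axis([-3 3 -3 3]);
figure(2);
plot_route(pos,R);      %% draw the route through the cities for this initial individual
axis([-3 3 -3 3]);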
%% initialize the fitness values of the population
fitness=zeros(M,1);
len=zeros(M,1);

for i=1:M% total tour length of each chromosome
    len(i,1)=myLength(D,popm(i,:));
end
maxlen=max(len);% longest tour
minlen=min(len);% shortest tour

fitness=fit(len,m,maxlen,minlen);
rr=find(len==minlen);% indices of the shortest tour(s)
R=popm(rr(1,1),:);% extract that chromosome as R
for i=1:N
    fprintf('%d ',R(i));% print R in order
end
fprintf('\n');

fitness=fitness/sum(fitness);
distance_min=zeros(ITER+1,1);  %% shortest tour length in the population at each iteration
nn=M;
iter=0;
while iter<=ITER
    fprintf('Iteration %d\n',iter);
    %% selection (roulette wheel)
    p=fitness./sum(fitness);
    q=cumsum(p);% cumulative probabilities
    for i=1:(M-1)
        r=rand;
        tmp=find(r<=q);
        popm_sel(i,:)=popm(tmp(1),:);
    end 
    [fmax,indmax]=max(fitness);% best individual of the current generation
    % keep a copy of it; note that crossover or mutation below may still change this row
    popm_sel(M,:)=popm(indmax,:);

    %% crossover
    nnper=randperm(M);
    for i=1:floor(M*Pc*0.5)
        A=popm_sel(nnper(i),:);
        B=popm_sel(nnper(i+1),:);
        [A,B]=cross(A,B);
        popm_sel(nnper(i),:)=A;
        popm_sel(nnper(i+1),:)=B;
    end

    %% mutation
    for i=1:M
        pick=rand;
        while pick==0
             pick=rand;
        end
        if pick<=Pmutation
           popm_sel(i,:)=Mutation(popm_sel(i,:));
        end
    end

    %% evaluate the fitness of the new population
    NN=size(popm_sel,1);
    len=zeros(NN,1);
    for i=1:NN
        len(i,1)=myLength(D,popm_sel(i,:));
    end

    maxlen=max(len);
    minlen=min(len);
    distance_min(iter+1,1)=minlen;
    fitness=fit(len,m,maxlen,minlen);
    rr=find(len==minlen);
    fprintf('minlen=%f\n',minlen);% %f, because minlen is not an integer
    R=popm_sel(rr(1,1),:);
    for i=1:N
        fprintf('%d ',R(i));
    end
    fprintf('\n');
    popm=popm_sel;
    iter=iter+1;
    %pause(1);

end
%end of while

figure(3)
plot_route(pos,R);
axis([-3 3 -3 3]);
figure(4)
plot(distance_min);

6. Controlled-variable analysis of the influence of each parameter on the solution

1. The effect of population size on the solution
The fixed parameters are as follows:
N = 25; // Number of cities
ITER = 2000; // Number of iterations
m = 2; // Elimination acceleration exponent for the normalized fitness
Pc = 0.8; // Crossover probability
Pmutation = 0.05; // Mutation probability
① M = 20; // Population size
[Figure: result for M = 20]
② M = 50; // Population size
[Figure: result for M = 50]
③ M = 100; // Population size
[Figure: result for M = 100]
Conclusion: The larger the population, the more accurate the result and the better the fitness, but the longer the running time.
2. The effect of the number of iterations on the solution
The fixed parameters are as follows:
N = 25; // Number of cities
M = 100; // Population size
m = 2; // Elimination acceleration exponent for the normalized fitness
Pc = 0.8; // Crossover probability
Pmutation = 0.05; // Mutation probability
① ITER = 1000; // Number of iterations
[Figures: results for ITER = 1000]
② ITER = 1500; // Number of iterations
[Figures: results for ITER = 1500]
③ ITER = 2000; // Number of iterations
[Figures: results for ITER = 2000]
Conclusion: The more iterations, the higher the accuracy of the solution. Once the number of iterations reaches 1000, the result of the algorithm tends to be stable (the shortest path is obtained consistently) and the algorithm is relatively efficient. The best number of iterations for this algorithm is 2000.
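One way to check when a run has stabilized is to look for the last iteration at which the best tour length seen so far still improved. The sketch below does this on a short made-up vector that stands in for the distance_min array recorded by the main program; the numbers and the names bestSoFar and lastImprovement are assumptions for illustration.

```matlab
% Made-up per-iteration minimum tour lengths. The per-generation minimum
% can go up as well as down, because the best individual is not fully
% protected from crossover and mutation.
distance_min = [9.1; 8.4; 8.6; 7.9; 8.0; 7.9; 8.2];

% Best length seen up to and including each iteration.
bestSoFar = cummin(distance_min);

% Index of the last iteration that improved on everything before it.
lastImprovement = find(diff(bestSoFar) < 0, 1, 'last') + 1;
fprintf('last improvement at entry %d, best length %.1f\n', ...
        lastImprovement, bestSoFar(end));
```

On this data the last improvement is at entry 4. Applied to a real run, a value far below ITER suggests that the remaining iterations did not help. If the run never improves on its first entry, find returns an empty result, which a real script would need to handle.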
3. The effect of the number of cities on the solution
The fixed parameters are as follows:
M = 100; // Population size
ITER = 2000; // Number of iterations
m = 2; // Elimination acceleration exponent for the normalized fitness
Pc = 0.8; // Crossover probability
Pmutation = 0.05; // Mutation probability
① N = 15; // Number of cities
[Figure: result for N = 15]
② N = 20; // Number of cities
[Figure: result for N = 20]
③ N = 25; // Number of cities
[Figure: result for N = 25]
Conclusion: When the number of cities is large (more than 50), the GA still does not converge after many iterations. A possible cause is that it gets stuck in a local optimum.
4. The effect of crossover probability on the solution
The fixed parameters are as follows:
N = 25; // Number of cities
M = 100; // Population size
ITER = 2000; // Number of iterations
m = 2; // Elimination acceleration exponent for the normalized fitness
Pmutation = 0.05; // Mutation probability
① Pc = 0.6; // Crossover probability
[Figure: result for Pc = 0.6]
② Pc = 0.7; // Crossover probability
[Figure: result for Pc = 0.7]
③ Pc = 0.8; // Crossover probability
[Figure: result for Pc = 0.8]
Conclusion: If the crossover probability is too low, the optimal solution is not reached. The higher the crossover probability, the better the average fitness.
5. The effect of mutation probability on the solution
The fixed parameters are as follows:
N = 25; // Number of cities
M = 100; // Population size
ITER = 2000; // Number of iterations
m = 2; // Elimination acceleration exponent for the normalized fitness
Pc = 0.8; // Crossover probability
① Pmutation = 0.01; // Mutation probability
[Figure: result for Pmutation = 0.01]
② Pmutation = 0.05; // Mutation probability
[Figure: result for Pmutation = 0.05]
③ Pmutation = 0.1; // Mutation probability
[Figure: result for Pmutation = 0.1]
Conclusion: A mutation probability that is too high or too low degrades the solution found.

7. Summary

Because a genetic algorithm's overall search strategy and optimization computation do not rely on gradient information, and need only the objective function and a corresponding fitness function to steer the search, it is widely applicable. For large-scale combinatorial optimization problems, a genetic algorithm is a comparatively effective approach, because finding the optimal solution by enumeration is hard with current computing power. Genetic algorithms also have shortcomings, however: there is no effective quantitative method for analyzing the algorithm's accuracy, feasibility, or computational complexity. The algorithm in this article makes it clear that the solution a genetic algorithm returns is not necessarily the optimal one.

Origin blog.csdn.net/likepanda99/article/details/103142786