Why is this genetic algorithm taking too many iterations?

Souames :

I'm learning about genetic algorithms, and in order to better understand the concepts I tried to build a genetic algorithm from scratch in Python without any external module (just the standard library and a little bit of numpy).

The goal is to find a target string. So if I give it the string hello and define an alphabet of 26 letters plus a space, there are 27^5 possibilities, which is huge. Thus the need to use a GA to solve this problem.
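Just to put a number on that search space (a quick check, using the 26-letters-plus-space alphabet described above):

```python
# Size of the search space for a length-5 target over 26 letters + a space.
alphabet_size = 26 + 1            # 'a'-'z' plus the space character
target_length = 5                 # len("hello")
print(alphabet_size ** target_length)  # 14348907
```

So brute force would have to try up to ~14.3 million strings for a 5-character target, and the number grows exponentially with length.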

I defined the following functions:

Generate a population: given a size n and a target, we generate n strings of len(target) random chars and return the population as a list of str.

Compute a fitness score: if the char at position i equals the char at position i of target, we increment the score. Here's the code:

def fitness(indiv, target):
    # count the positions where the individual matches the target
    score = 0
    for idx, char in enumerate(target):
        if char == indiv[idx]:
            score += 1
    return score

Select parents, cross them over, and generate a new population of children.

Here are the functions responsible for that:

import random
from numpy.random import choice

def crossover(p1, p2):
    # single-point crossover between p1 and p2: before the crossover
    # point the child equals p1, after it we take p2's chars
    point = random.randrange(len(p1))
    c = [p1[i] for i in range(point)]
    for i in range(point, len(p1)):
        c.append(p2[i])
    c = "".join(c)
    # we mutate c too
    c = mutate(c)
    return c


def mutate(ind):
    # replace one random position with a random letter
    point = random.randrange(len(ind))
    new_ind = list(ind)
    new_ind[point] = random.choice(letters)
    return "".join(new_ind)

def select_parent(pop, fit_scores):
    # roulette-wheel selection: probability proportional to fitness
    total = sum(fit_scores)
    probs = [score / total for score in fit_scores]
    parent = choice(pop, 1, p=probs)[0]
    return parent

I'm selecting parents by computing the probability of each individual (individual score / total score of the population), then using a weighted random choice function to select a parent (this is a numpy function).

For the crossover, I'm generating a child c and a random splitting point: all chars before this random point come from the first parent, and all chars after the splitting point come from the second parent.

Besides that, I defined a function called should_stop, which checks whether we found the target, and print_best, which gets the best individual out of a population (highest fitness score).
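These two helpers aren't shown in the question; a minimal sketch consistent with how find uses them (the names and signatures are taken from the question, the bodies are assumptions) might be:

```python
def print_best(pop, scores):
    # return the individual with the highest fitness score
    best_idx = max(range(len(pop)), key=lambda i: scores[i])
    return pop[best_idx]

def should_stop(pop, target):
    # stop as soon as the target string appears in the population
    return target in pop
```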

Then I created a find function that uses all the functions defined above:

def find(size, target, pop):
    scores = [fitness(ind, target) for ind in pop]
    new_pop = []
    # crossover of selected individuals
    for ind in pop:
        pa = select_parent(pop, scores)
        pb = select_parent(pop, scores)
        child = crossover(pa, pb)
        new_pop.append(child)
    # score the children (not the old population) before picking the best
    new_scores = [fitness(ind, target) for ind in new_pop]
    best = print_best(new_pop, new_scores)
    print("********** The best individual is: ", best, " ********")
    return (new_pop, best)


n = 200
target = "hello"
popu = generate_pop(n,target)

#find(n,target,popu)


for i in range(1000):
    print(len(popu))
    data = find(n, target, popu)
    popu = data[0]
    print("iteration number is ", i)
    if data[1] == target:
        break

The Problem: it's taking many more iterations than it should to find hello (more than 200 iterations most of the time), while this example takes only a few iterations: https://jbezerra.github.io/The-Shakespeare-and-Monkey-Problem/index.html

Sure, the two aren't coded in the same way; I used Python and a procedural style, but the logic is the same. So what am I doing wrong?

yurib :

You mutate 100% of the time. You select "suitable" parents which are likely to produce a fit offspring, but then you apply a mutation that's more likely than not to throw it off. The example you linked behaves the same way if you increase its mutation rate to 100%.

The purpose of mutation is to "nudge" the search in a different direction when you appear to be stuck in a local optimum; applying it all the time turns this from an evolutionary algorithm into something much closer to random search.
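One way to apply this advice is to give mutate a small per-character probability instead of always changing one position (a sketch; the mutation_rate parameter and its 1% default are assumptions, not from the question):

```python
import random

letters = list("abcdefghijklmnopqrstuvwxyz ")

def mutate(ind, mutation_rate=0.01):
    # flip each character independently with a small probability,
    # instead of unconditionally mutating one position per child
    new_ind = list(ind)
    for i in range(len(new_ind)):
        if random.random() < mutation_rate:
            new_ind[i] = random.choice(letters)
    return "".join(new_ind)
```

With a rate around 1-5%, most children keep the good material inherited from their parents, and mutation only occasionally injects fresh characters.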
