Generating chicken soup for the soul with 20 lines of Python code; an AI Mimon is just around the corner

Friendly reminder: there's a benefit at the end!!!

Let me share some chicken soup for the soul with you:

“Don’t think of the overwhelming majority of the impossible.”

“Grew up your bliss and the world.”

“what we would end create, creates the ground and you are the one to warm it”

“look and give up in miracles”

In fact, all of the chicken soup quotes above were generated by a computer, and the program that generated them is less than 20 lines of Python code.

When it comes to natural language generation, people usually assume it must involve a very advanced AI system and very advanced mathematics. But that is not the case. In this article, I (the author, Ramtin Alami) will generate new chicken soup quotes using Markov chains and a small dataset of chicken soup text.

Markov Chains

A Markov chain is a stochastic model that predicts an event based solely on the previous event. As a simple example, let me explain it with my cat's daily routine. My cat is always either eating, sleeping, or playing with toys. She sleeps most of the time, but occasionally wakes up to eat. Usually, after a meal she feels refreshed, starts playing with toys, goes back to sleep when she's done, and then wakes up to eat again.

A Markov chain can easily simulate my cat's routine, because what she does next depends only on her previous state. She usually doesn't go straight from waking up to playing with toys, but after eating there is a high probability she will play for a while. These state transitions can also be represented graphically:

Each ellipse represents a state, each arrow points to the next state, and the number next to an arrow is the probability of moving from one state to the other. Notice that each transition probability depends only on the previous state.
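The diagram itself isn't reproduced here, but as a rough sketch, such a chain can be written as a Python dictionary. The probability values below are illustrative assumptions, not numbers from the article:

import random

# Hypothetical transition probabilities for the cat's three states
# (the numbers are made up for illustration, not taken from the diagram).
cat_chain = {
    'sleep': {'sleep': 0.6, 'eat': 0.4},
    'eat':   {'play': 0.7, 'sleep': 0.3},
    'play':  {'sleep': 0.8, 'eat': 0.2},
}

def next_state(current):
    # the next state depends only on the current one (the Markov property)
    states = list(cat_chain[current])
    weights = list(cat_chain[current].values())
    return random.choices(states, weights=weights)[0]

state = 'sleep'
for _ in range(5):
    state = next_state(state)
    print(state)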

Generating Text Using Markov Chains

Text generation with Markov chains uses the same idea: it tries to find the probability of one word appearing after another. To estimate these transition probabilities, we train the model on some example sentences.

For example, we train the model using these sentences:

I like to eat apples. You eat oranges.

From these two training sentences, we can see that "I", "like", and "eat" always appear in the same order, and "you" is always followed by "eat". Meanwhile, "oranges" and "apples" are equally likely to appear after the word "eat". The following transition diagram illustrates what I just described:

From these two training sentences, the model can generate two new sentences, but that isn't always the case. I trained another model on the four sentences below, and the results were quite different:

my friend makes the best raspberry pie in town. I think apple pies are the best pies. Steve thinks apple makes the best computers in the world. I own two computers and they're not apples because I am not steve or rich.

The transition graph for a model trained on these four sentences is much larger.

While the diagram looks very different from a typical Markov chain transition diagram, the main idea behind both is the same.

Starting from the start node, a path picks the next word at random until it reaches the end node. The width of the connection between two words indicates the probability of that word being selected.

Although trained with only four sentences, the above model was able to generate hundreds of different sentences.

Code

The code for the text generator above is very simple and does not require any additional modules or libraries other than Python's random module. The code consists of two parts, one for training and one for generation.

Training

The training code builds the model we'll later use to generate chicken soup sentences. I used a dictionary as the model: its keys are words, and each value is a list of possible following words. For example, the dictionary for a model trained on the sentences "I like to eat apples" and "You eat oranges" above looks like this:

{'START': ['i', 'you'], 'i': ['like'], 'like': ['to'], 'to': ['eat'], 'you': ['eat'], 'eat': ['apples', 'oranges'], 'END': ['apples', 'oranges']}

We don't need to compute the probabilities of the following words explicitly: words with higher probability simply appear more times in the list of possible following words. For example, suppose we add the extra training sentence "we eat apples". The word "apples" now appears after the word "eat" in two sentences, so its probability is higher. In the model dictionary, it appears twice in the "eat" list, which is exactly how that higher probability is encoded.

{'START': ['i', 'we', 'you'], 'i': ['like'], 'like': ['to'], 'to': ['eat'], 'you': ['eat'], 'we': ['eat'], 'eat': ['apples', 'oranges', 'apples'], 'END': ['apples', 'oranges', 'apples']}

The model dictionary also contains two special keys, "START" and "END", which hold the possible starting and ending words of a generated sentence.
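If you ever want explicit probabilities, they are easy to recover from these repeated entries. Here's a small self-contained check (my own addition, not part of the original 20 lines) using the "eat" list from the model above:

from collections import Counter

# the list of words that followed "eat" in the three example sentences
followers = ['apples', 'oranges', 'apples']
counts = Counter(followers)                    # Counter({'apples': 2, 'oranges': 1})
probabilities = {w: c / sum(counts.values()) for w, c in counts.items()}
print(probabilities)                           # {'apples': 0.666..., 'oranges': 0.333...}

The complete training loop over the chicken soup dataset looks like this: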

model = {}  # maps each word to a list of words that can follow it
# dataset_file is assumed to be an open text file with one chicken soup sentence per line
for line in dataset_file:
    line = line.lower().split()
    for i, word in enumerate(line):
        if i == len(line)-1:
            # last word of the sentence: record it as a possible ending word
            model['END'] = model.get('END', []) + [word]
        else:
            if i == 0:
                # first word of the sentence: record it as a possible starting word
                model['START'] = model.get('START', []) + [word]
            # record the word that follows the current word
            model[word] = model.get(word, []) + [line[i+1]]

Generating Chicken Soup Sentences

The generator part is a loop. It first picks a random starting word and appends it to a list. Then it looks up the list of possible follower words for the last generated word, picks one of them at random, and appends it to the list as well. The generator keeps choosing random follower words until it reaches an ending word, at which point it stops the loop and outputs the generated sentence, or so-called "quote".

import random

generated = []
while True:
    if not generated:
        # start the sentence with one of the possible starting words
        words = model['START']
    elif generated[-1] in model['END']:
        # the last word can end a sentence, so stop here
        break
    else:
        # words that can follow the last generated word
        words = model[generated[-1]]
    generated.append(random.choice(words))

print(' '.join(generated))

I have generated quite a few chicken soup quotes with Markov chains, but as a text generator you can feed it any text and have it generate similar sentences.
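To try that on your own text, the two pieces above can be repackaged into small functions. This is just my own rearrangement of the same code, and 'quotes.txt' is a placeholder file name:

import random

def train(lines):
    # build the word -> possible-next-words dictionary from an iterable of sentences
    model = {}
    for line in lines:
        line = line.lower().split()
        for i, word in enumerate(line):
            if i == len(line) - 1:
                model['END'] = model.get('END', []) + [word]
            else:
                if i == 0:
                    model['START'] = model.get('START', []) + [word]
                model[word] = model.get(word, []) + [line[i + 1]]
    return model

def generate(model):
    # walk the chain from a random starting word until an ending word is reached
    generated = []
    while True:
        if not generated:
            words = model['START']
        elif generated[-1] in model['END']:
            break
        else:
            words = model[generated[-1]]
        generated.append(random.choice(words))
    return ' '.join(generated)

# 'quotes.txt' is a placeholder for whatever text you want to imitate
with open('quotes.txt') as f:
    model = train(f)
print(generate(model))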

Another cool thing you can do with Markov chain text generators is mix different types of text. For example, in one of my favorite TV shows, Rick and Morty, there is a character called "Abradolf Lincler", whose name is a mash-up of "Abraham Lincoln" and "Adolf Hitler".

You can do the same thing: feed some famous names into a Markov chain and have it generate amusing mixed names (such as "Gorda Statham" or "Nicholas Zhao Si").
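The article doesn't show code for this step, and a word-level chain won't blend names letter by letter, so here is a minimal character-level sketch under that assumption (the two input names are just examples):

import random

def char_model(names, order=2):
    # map each 2-character context to the characters that can follow it
    model = {}
    for name in names:
        name = '^' * order + name.lower() + '$'   # markers for start and end
        for i in range(len(name) - order):
            context = name[i:i + order]
            model.setdefault(context, []).append(name[i + order])
    return model

def mix_name(model, order=2):
    out = '^' * order
    while not out.endswith('$'):
        # pick the next character based only on the last two characters
        out += random.choice(model[out[-order:]])
    return out.strip('^$').title()

names = ['abraham lincoln', 'adolf hitler']       # example inputs, as in the article
model = char_model(names)
print(mix_name(model))  # prints a name blended from the two inputs (it may sometimes reproduce one exactly)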

You can even go a step further and combine famous quotes, such as the speeches of Lincoln and Hitler mentioned above, and have a Markov chain generate a whole new style of speech.

Markov chains can be applied in almost every field. Text generation is not the most useful application, but I do think it is a very fun one. And who knows, maybe the chicken soup you generate will attract more fans than Mimon?

It's not too late to make chicken soup after you claim this benefit:

Tonight at 20:00, the live broadcast of "Introduction to Python for Complete Beginners" continues! Free! Free! Free!

