textgenrnn text generation in action

Text generation is a fascinating natural language processing task, and deep learning has brought a powerful new approach to it, as described in the article The Unreasonable Effectiveness of Recurrent Neural Networks. textgenrnn is a concise and efficient library that uses an RNN to generate text. Its codebase is small and easy to understand, and its architecture combines LSTM layers with an attention layer, as shown in the diagram below:


From the introduction on its official GitHub page, https://github.com/minimaxir/textgenrnn:

 For the default model, textgenrnn takes in an input of up to 40 characters, converts each character to a 100-D character embedding vector, and feeds those into a 128-cell long-short-term-memory (LSTM) recurrent layer. Those outputs are then fed into another 128-cell LSTM. All three layers are then fed into an Attention layer to weight the most important temporal features and average them together (and since the embeddings + 1st LSTM are skip-connected into the attention layer, the model updates can backpropagate to them more easily and prevent vanishing gradients). That output is mapped to probabilities for up to 394 different characters that they are the next character in the sequence, including uppercase characters, lowercase, punctuation, and emoji. (if training a new model on a new dataset, all of the numeric parameters above can be configured)

That is, textgenrnn accepts an input of up to 40 characters, first converts each character into a 100-dimensional character embedding vector, and feeds these vectors into a long short-term memory (LSTM) recurrent layer with 128 cells. These outputs are then fed into another 128-cell LSTM. All three layers are then fed into an attention layer that weights the most important temporal features and averages them together (and because the embedding layer and the first LSTM are skip-connected into the attention layer, model updates can back-propagate to them more easily, which helps prevent vanishing gradients). That output is mapped to probabilities over up to 394 distinct characters as candidates for the next character in the sequence, including uppercase letters, lowercase letters, punctuation, and emoji. Crucially, all of the numeric parameters above can be configured when training a new model on a new dataset.
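To make the architecture concrete, here is a minimal Keras sketch that approximates the description above: a 100-D character embedding, two stacked 128-cell LSTMs, skip connections into an attention-weighted average over time, and a softmax over 394 characters. This is an illustrative approximation, not the library's own model code; the layer sizes simply mirror the defaults quoted above.

import tensorflow as tf
from tensorflow.keras.layers import Input, Embedding, LSTM, Dense, concatenate, Lambda
from tensorflow.keras.models import Model

MAX_LENGTH = 40    # input window of up to 40 characters
EMBED_DIM = 100    # 100-D character embeddings
RNN_SIZE = 128     # 128-cell LSTMs
VOCAB_SIZE = 394   # number of distinct output characters

inp = Input(shape=(MAX_LENGTH,), dtype='int32')
embedded = Embedding(VOCAB_SIZE, EMBED_DIM)(inp)

# two stacked LSTMs, both returning full sequences so attention can see every timestep
rnn1 = LSTM(RNN_SIZE, return_sequences=True)(embedded)
rnn2 = LSTM(RNN_SIZE, return_sequences=True)(rnn1)

# skip-connect the embeddings and both LSTM outputs into the attention layer
seq = concatenate([embedded, rnn1, rnn2])

# attention-weighted average over time: score each timestep, softmax, weighted sum
scores = Dense(1, activation='tanh')(seq)
weights = Lambda(lambda s: tf.nn.softmax(s, axis=1))(scores)
context = Lambda(lambda t: tf.reduce_sum(t[0] * t[1], axis=1))([seq, weights])

# map the attention output to a probability distribution over the next character
out = Dense(VOCAB_SIZE, activation='softmax')(context)
model = Model(inputs=inp, outputs=out)
model.compile(loss='categorical_crossentropy', optimizer='adam')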

Hands-on practice with the code:

(1) The default test, which generates news.
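A minimal sketch of this default usage, assuming textgenrnn is installed (pip install textgenrnn); textgenrnn() and generate() are the library's documented entry points:

from textgenrnn import textgenrnn

textgen = textgenrnn()   # load the bundled pretrained model
textgen.generate(3)      # print three generated samples at the default temperature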


(2) News generation in the computer field
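A hedged sketch of how such a domain-specific model can be produced: fine-tune the pretrained model on a corpus of technology news with the library's documented train_from_file() method. The file name computer_news.txt is a placeholder for your own corpus (one document per line):

from textgenrnn import textgenrnn

textgen = textgenrnn()
textgen.train_from_file('computer_news.txt', num_epochs=5)  # fine-tune on the domain corpus
textgen.generate(3)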


Among the parameters above there is a temperature, which controls how conservatively or adventurously the model samples the next character: lower values produce safer, more predictable text, while higher values produce more varied and surprising text. (From the results, it also seems to track the emotional color of the output: 0.2 tends to read as negative, 0.5 as neutral, and 1.0 as relatively upbeat.)

To compare different temperatures, textgenrnn ships with a helper that generates samples at several temperatures; its code is as follows:

def generate_samples(self, n=3, temperatures=[0.2, 0.5, 1.0], **kwargs):
    for temperature in temperatures:
        print('#' * 20 + '\nTemperature: {}\n'.format(temperature) +
              '#' * 20)
        self.generate(n, temperature=temperature, **kwargs)
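
For example, it can be called on a loaded model like this (textgen here is a hypothetical, already-instantiated textgenrnn object):

textgen.generate_samples(n=3, temperatures=[0.2, 0.5, 1.0])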

Trying it again, the results at each temperature look like this:




(3) Some applications have already been built on this framework, such as generating tweets in the style of Donald Trump.

The test code is as follows:
# load the pretrained tweet weights and the matching character vocabulary
textgen = textgenrnn('./weights/realDonaldTrump_dril_twitter_weights.hdf5', './textgenrnn/textgenrnn_vocab.json')
# print samples at temperatures 0.2, 0.5 and 1.0
gen_texts = textgen.generate_samples()


Getting results really is that simple.


However, the project README spells out some caveats about this kind of usage:


Notes

  • You will not get quality generated text 100% of the time, even with a heavily-trained neural network. That's the primary reason viral blog posts/Twitter tweets utilizing NN text generation often generate lots of texts and curate/edit the best ones afterward.

  • Results will vary greatly between datasets. Because the pretrained neural network is relatively small, it cannot store as much data as RNNs typically flaunted in blog posts. For best results, use a dataset with at least 2,000-5,000 documents. If a dataset is smaller, you'll need to train it for longer by setting num_epochs higher when calling a training method and/or training a new model from scratch. Even then, there is currently no good heuristic for determining a "good" model.

  • A GPU is not required to retrain textgenrnn, but it will take much longer to train on a CPU. If you do use a GPU, I recommend increasing the batch_size parameter for better hardware utilization.

In short: you need a corpus of at least 2,000-5,000 documents, and since the generated text is not reliably good, some manual curation and editing is still required. A retraining sketch based on these notes is shown below.
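The sketch below is hypothetical and only illustrates the parameters the notes mention: new_model=True trains from scratch, num_epochs is raised for a small dataset, and batch_size is increased for better GPU utilization. The file name my_corpus.txt is a placeholder.

from textgenrnn import textgenrnn

textgen = textgenrnn()
textgen.train_from_file('my_corpus.txt',    # placeholder: one document per line
                        new_model=True,     # train a new model from scratch
                        num_epochs=10,      # raise this for smaller datasets
                        batch_size=1024)    # larger batches improve GPU utilization
textgen.generate_samples()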
