[DLPytorch] Text Preprocessing, Language Models, and Recurrent Neural Network Basics

Text preprocessing

Implementation steps (processing a language model data set)

Text preprocessing implementation steps

  1. Read the text: load the data set from a zip/txt file or similar.
  2. Clean the text: replace newline characters with spaces. For English text, it is best to convert uppercase to lowercase. (This is my first contact with text processing, so my understanding is still very basic.)
  3. Build a dictionary that maps each token to a unique index: each character is mapped to a consecutive integer starting from 0, called its index, which makes later data processing easier. To obtain the indices, collect the set of all distinct characters in the data, then map them one by one to indices to construct the dictionary.
    For example: [1] I; [2] group
  4. Convert the text from a character sequence into an index sequence so it can be fed to the model:
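# corpus_chars is the cleaned text from step 2.
# Note: set() has no fixed order, so the indices can differ between runs.
idx_to_char = list(set(corpus_chars))
# e.g. ['B', '雕', '回', '窝', '膀', '丹', ...], a list of the n distinct characters
char_to_idx = {char: i for i, char in enumerate(idx_to_char)}
# Index mapping complete: {'B': 0, '雕': 1, '回': 2, '窝': 3, ...}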
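
The four steps above can be sketched end to end as follows. This is a minimal sketch rather than the post's original code: the helper name `build_corpus` and the inline sample string are assumptions made for illustration (the post reads its text from a zip/txt data set), and `sorted` is used here so that the character-to-index mapping is reproducible across runs.

```python
def build_corpus(raw_text):
    """Clean raw text and map every character to an integer index.

    Hypothetical helper for illustration; not part of the original post.
    """
    # Step 2: replace line breaks with spaces and lowercase English letters.
    corpus_chars = raw_text.replace('\n', ' ').replace('\r', ' ').lower()
    # Step 3: collect the distinct characters; sorting fixes their order.
    idx_to_char = sorted(set(corpus_chars))
    char_to_idx = {char: i for i, char in enumerate(idx_to_char)}
    # Step 4: turn the character sequence into an index sequence.
    corpus_indices = [char_to_idx[char] for char in corpus_chars]
    return corpus_indices, char_to_idx, idx_to_char


# Stand-in for text read from a file in step 1.
sample_text = "Hello\nworld"
corpus_indices, char_to_idx, idx_to_char = build_corpus(sample_text)
vocab_size = len(char_to_idx)
print('vocab size:', vocab_size)
print('indices:', corpus_indices)
```

Mapping the indices back through `idx_to_char` recovers the cleaned text, which is a quick sanity check that the dictionary is consistent.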

Sampling time series data

Random sampling
In random sampling, each sample is a subsequence taken from an arbitrary position in the original sequence. Two consecutive random minibatches are not necessarily adjacent in the original sequence. Therefore, the hidden state at the final time step of one minibatch cannot be used to initialize the hidden state of the next minibatch. When training a model, the hidden state has to be re-initialized before every randomly sampled minibatch.
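
A minimal sketch of random sampling, assuming the corpus is already a list of integer indices. The function name `data_iter_random` and its parameters (`batch_size`, `num_steps`, `seed`) are chosen here for illustration, and plain Python lists stand in for tensors. The label `Y` is the input `X` shifted by one position.

```python
import random


def data_iter_random(corpus_indices, batch_size, num_steps, seed=None):
    """Yield (X, Y) minibatches of subsequences taken at random positions."""
    rng = random.Random(seed)
    # Y is X shifted by one step, so the last index cannot start an example.
    num_examples = (len(corpus_indices) - 1) // num_steps
    num_batches = num_examples // batch_size
    example_starts = [i * num_steps for i in range(num_examples)]
    rng.shuffle(example_starts)  # consecutive minibatches are no longer adjacent

    for b in range(num_batches):
        starts = example_starts[b * batch_size:(b + 1) * batch_size]
        X = [corpus_indices[s:s + num_steps] for s in starts]
        Y = [corpus_indices[s + 1:s + num_steps + 1] for s in starts]
        yield X, Y


random_batches = list(
    data_iter_random(list(range(30)), batch_size=2, num_steps=6, seed=0))
for X, Y in random_batches:
    print('X:', X, 'Y:', Y)
```

Because the start positions are shuffled, nothing links one minibatch to the next, which is why the hidden state is reset for each minibatch.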
Adjacent sampling
In adjacent sampling, two consecutive minibatches are adjacent in position on the original sequence. In this case, the hidden state at the final time step of one minibatch can be used to initialize the hidden state of the next minibatch, so that the output of the next minibatch also depends on the input of the current one, and so on.
Note: For example, suppose the data is divided into four batches, Z1 (X, Y), Z2 (X, Y), ..., Z4 (X, Y). With adjacent sampling, the values in any \(X_i\) are adjacent to those in \(X_{i-1}\) or \(X_{i+1}\). In other words, the batches follow the time order of the input. The hidden state can therefore be saved and used to initialize the hidden state of the next batch.
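
A minimal sketch of adjacent sampling under the same assumptions as before: the corpus is a list of integer indices, plain lists stand in for tensors, and the name `data_iter_consecutive` is chosen here for illustration. The corpus is first split into `batch_size` long rows, and each minibatch takes the next `num_steps` columns of every row.

```python
def data_iter_consecutive(corpus_indices, batch_size, num_steps):
    """Yield (X, Y) minibatches whose rows continue from one batch to the next."""
    # Split the corpus into batch_size rows of equal length (remainder dropped).
    row_len = len(corpus_indices) // batch_size
    rows = [corpus_indices[r * row_len:(r + 1) * row_len]
            for r in range(batch_size)]
    # Y is X shifted by one step, hence the "- 1".
    num_batches = (row_len - 1) // num_steps

    for b in range(num_batches):
        start = b * num_steps
        X = [row[start:start + num_steps] for row in rows]
        Y = [row[start + 1:start + num_steps + 1] for row in rows]
        yield X, Y


consecutive_batches = list(
    data_iter_consecutive(list(range(30)), batch_size=2, num_steps=6))
for X, Y in consecutive_batches:
    print('X:', X, 'Y:', Y)
```

Row r of each minibatch starts exactly where row r of the previous minibatch ended, so the saved hidden state lines up with the data. In PyTorch training loops the carried-over hidden state is typically detached from the computation graph (for example with `detach()`) so that gradients do not flow back through all earlier minibatches.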
-------------

Language Model

As a beginner, it is probably best to watch the basic tutorial video for this part...

Introducing the concept

Solution


Parameter Estimation


RNN

Basics


Workflow

Study Notes

To be added...

Origin www.cnblogs.com/recoverableTi/p/12307992.html