AI Algorithms: Encoder-Decoder and Seq2Seq

Encoder-Decoder is a modeling framework from the field of NLP. It is widely used in machine translation, speech recognition, and other tasks.

1 What is Encoder-Decoder?

The Encoder-Decoder model is mainly a concept from the field of NLP. It is not one specific algorithm but a collective name for a class of algorithms. Encoder-Decoder can be regarded as a general framework: different algorithms can be used within this framework to address different tasks.

The Encoder-Decoder framework is a good illustration of a core idea of machine learning: transform a real-world problem into a mathematical problem, solve the mathematical problem, and thereby solve the real-world problem.

The Encoder is, as the name says, an encoder. Its role is to "transform the real-world problem into a mathematical problem."

The Decoder is a decoder. Its role is to "solve the mathematical problem and translate the solution back into a real-world answer."

The two stages are connected, and together they are commonly drawn as in the following figure:

Two points about Encoder-Decoder deserve attention:

  • a. Regardless of the lengths of the input and the output, the intermediate "vector c" has a fixed length (this is also the framework's main defect, discussed in detail below)

  • b. Depending on the task, different encoders and decoders can be chosen (each may be an RNN, but is usually one of its variants, LSTM or GRU)

Any model that fits the framework above is collectively referred to as an Encoder-Decoder model. When the Encoder-Decoder model comes up, another term is often mentioned alongside it: Seq2Seq.
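To make the framework concrete, here is a minimal PyTorch sketch (all class names and hyperparameters are illustrative, and an LSTM stands in for the RNN/LSTM/GRU choices mentioned above). The encoder compresses the input into the fixed-length "vector c"; the decoder unrolls from it one step at a time:

```python
import torch
import torch.nn as nn

class Encoder(nn.Module):
    """Encodes a variable-length input sequence into a fixed-length state."""
    def __init__(self, vocab_size, emb_dim, hidden_dim):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.LSTM(emb_dim, hidden_dim, batch_first=True)

    def forward(self, src):
        # src: (batch, src_len) token ids
        _, (hidden, cell) = self.rnn(self.embed(src))
        return hidden, cell  # this fixed-size pair plays the role of "vector c"

class Decoder(nn.Module):
    """Generates the output sequence one token at a time from that state."""
    def __init__(self, vocab_size, emb_dim, hidden_dim):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.LSTM(emb_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, token, state):
        # token: (batch, 1) previous output token; state: the carried "vector c"
        output, state = self.rnn(self.embed(token), state)
        return self.out(output), state  # logits over the target vocabulary
```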

2 What is Seq2Seq?

Seq2Seq (short for Sequence-to-Sequence) means just what it says: a sequence goes in, and another sequence comes out. The most important property of this structure is that the input sequence and the output sequence can have different lengths. For example, in the figure below:

As the figure shows: six characters go in, and three English words come out. The input and output have different lengths.
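A hypothetical greedy decoding loop, built on the Encoder/Decoder sketch from section 1, shows where the variable lengths come from: the input length is whatever the source sentence happens to be, while the output length is decided by the decoder itself, which keeps emitting words until it produces an end-of-sequence token (`bos_id`, `eos_id`, and `max_len` are illustrative placeholders):

```python
import torch

def translate(encoder, decoder, src, bos_id, eos_id, max_len=50):
    state = encoder(src)  # fixed-length summary of a source of any length
    token = torch.full((src.size(0), 1), bos_id, dtype=torch.long)
    result = []
    for _ in range(max_len):
        logits, state = decoder(token, state)
        token = logits.argmax(dim=-1)        # greedy: most likely next word
        if (token == eos_id).all():          # the decoder decides when to stop,
            break                            # so output length != input length
        result.append(token)
    return torch.cat(result, dim=1) if result else token
```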

Relationship "Seq2Seq" and "Encoder-Decoder" in

Seq2Seq (emphasis on purpose) is not specific to particular methods, to meet the "input sequence, the output sequence," the purpose, can be collectively referred to as Seq2Seq model.

The particular method used substantially fall Seq2Seq Encoder-Decoder Model (emphasis method) category.

To summarize:

  • Seq2Seq belongs to the broad category of Encoder-Decoder
  • Seq2Seq emphasizes the purpose, while Encoder-Decoder emphasizes the method

3 What are the applications of Encoder-Decoder?

  • "Text – text" is the most typical application; the lengths of its input and output sequences may differ substantially.

  • Speech recognition (audio – text)

4 Defects of Encoder-Decoder

As mentioned above, only a single "vector c" passes information between the Encoder and the Decoder, and the length of c is fixed.

To make this easier to grasp, compare it to a "compress-decompress" process:

Compress an 800×800-pixel image to 100 KB, and it still looks fairly sharp. Compress a 3000×3000-pixel image to the same 100 KB, and it looks blurry. In the same way, when the input sequence is long, a fixed-length c cannot carry all of its information, and some of it is lost.
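The analogy can be checked directly against the Encoder sketch from section 1: whether the input has 10 tokens or 1,000, the summary it produces has exactly the same number of dimensions (the sizes below are illustrative):

```python
import torch

enc = Encoder(vocab_size=10_000, emb_dim=64, hidden_dim=128)
short = torch.randint(0, 10_000, (1, 10))    # 10-token sentence
long_ = torch.randint(0, 10_000, (1, 1000))  # 1000-token sentence
h_short, _ = enc(short)
h_long, _ = enc(long_)
print(h_short.shape, h_long.shape)  # both torch.Size([1, 1, 128]):
                                    # the same 128 numbers must summarize
                                    # either sentence
```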

5 Attention solves the information-loss problem

The Attention mechanism was designed precisely to solve this problem of "input too long, information lost."

The defining feature of an Attention model is that the Encoder no longer encodes the entire input sequence into a single fixed-length "intermediate vector c," but instead encodes it into a sequence of vectors. The Encoder-Decoder model with Attention added looks like the figure below:

This way, when producing each output, the model can make full use of the information carried by the input sequence. The approach has achieved very good results on translation tasks.
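Here is a minimal sketch of one common scoring variant, dot-product attention (the early attention papers used a learned additive score instead): at each decoding step, the current decoder state is compared against every encoder output, and the resulting weights build a fresh context vector for that step:

```python
import torch
import torch.nn.functional as F

def attention(dec_hidden, enc_outputs):
    """dec_hidden:  (batch, hidden_dim)          current decoder state
       enc_outputs: (batch, src_len, hidden_dim) one vector per input position"""
    # Score every input position against the current decoder state.
    scores = torch.bmm(enc_outputs, dec_hidden.unsqueeze(2))   # (batch, src_len, 1)
    weights = F.softmax(scores, dim=1)                         # attention weights
    # Weighted sum of encoder outputs: the context vector for this step.
    context = torch.bmm(weights.transpose(1, 2), enc_outputs)  # (batch, 1, hidden_dim)
    return context.squeeze(1), weights
```

Because a new context vector is computed for every output token, long inputs no longer have to squeeze through a single fixed-length c.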
