86. Encoder-Decoder Architecture and Code Implementation

1. Revisiting CNN

A CNN can already be read through the encoder-decoder lens: the convolutional layers act as an encoder that turns the raw input image into an intermediate feature representation, and the fully connected layers act as a decoder that maps this representation to the output categories.

2. Revisiting RNN

An RNN fits the same pattern: the recurrent layers encode a variable-length input sequence into a hidden state, and the output layer decodes that state into a prediction at each time step.

3. Encoder-Decoder Architecture

The encoder-decoder architecture makes this split explicit: the encoder processes a (possibly variable-length) input and turns it into an intermediate state, and the decoder consumes that state to generate the output.

4. Summary

  • The encoder-decoder architecture splits a model into two parts: the encoder is responsible for representing the input, and the decoder is responsible for generating the output

5. Code Implementation

5.1 Encoder

In the encoder interface, we only specify that the encoder takes a variable-length sequence as input X. Any model that inherits from this Encoder base class supplies the concrete implementation.

from torch import nn

class Encoder(nn.Module):
    """The base encoder interface for the encoder-decoder architecture."""
    def __init__(self, **kwargs):
        super(Encoder, self).__init__(**kwargs)

    def forward(self, X, *args):
        raise NotImplementedError

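To make the interface concrete, here is a minimal sketch of a subclass, assuming a GRU-based sequence encoder in the style of d2l's Seq2SeqEncoder (the class name and all hyperparameters below are illustrative, not part of the interface):

class Seq2SeqEncoder(Encoder):
    """A GRU-based encoder: embed the source tokens, then run an RNN over them."""
    def __init__(self, vocab_size, embed_size, num_hiddens, num_layers, **kwargs):
        super(Seq2SeqEncoder, self).__init__(**kwargs)
        self.embedding = nn.Embedding(vocab_size, embed_size)
        self.rnn = nn.GRU(embed_size, num_hiddens, num_layers)

    def forward(self, X, *args):
        # X: (batch_size, num_steps) of token indices
        X = self.embedding(X)        # -> (batch_size, num_steps, embed_size)
        X = X.permute(1, 0, 2)       # nn.GRU expects (num_steps, batch_size, embed_size)
        output, state = self.rnn(X)
        # output: (num_steps, batch_size, num_hiddens)
        # state:  (num_layers, batch_size, num_hiddens)
        return output, state
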
5.2 Decoder

In the decoder interface below, we add an init_state function that converts the encoder's output (enc_outputs) into the encoded state. Note that this step may require additional inputs, for example the valid length of the input sequence. To generate a variable-length token sequence one token at a time, the decoder at every time step maps an input (e.g., the token generated at the previous time step) together with the encoded state to the output token of the current time step.

class Decoder(nn.Module):
    """The base decoder interface for the encoder-decoder architecture."""
    def __init__(self, **kwargs):
        super(Decoder, self).__init__(**kwargs)

    # enc_outputs holds everything the encoder produced
    def init_state(self, enc_outputs, *args):
        raise NotImplementedError

    # state is taken from the encoder at first and then updated step by step;
    # X is the additional (decoder-side) input
    def forward(self, X, state):
        raise NotImplementedError

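As with the encoder, a concrete subclass shows how init_state and forward fit together. The sketch below follows the pattern of d2l's Seq2SeqDecoder: it reuses the encoder's final hidden state as its own initial state and concatenates the top-layer context to every decoder input (again, the name and sizes are illustrative):

import torch

class Seq2SeqDecoder(Decoder):
    """A GRU-based decoder seeded with the encoder's final hidden state."""
    def __init__(self, vocab_size, embed_size, num_hiddens, num_layers, **kwargs):
        super(Seq2SeqDecoder, self).__init__(**kwargs)
        self.embedding = nn.Embedding(vocab_size, embed_size)
        # The context vector is concatenated to the embedding at every time step
        self.rnn = nn.GRU(embed_size + num_hiddens, num_hiddens, num_layers)
        self.dense = nn.Linear(num_hiddens, vocab_size)

    def init_state(self, enc_outputs, *args):
        # enc_outputs is the (output, state) pair returned by the encoder;
        # keep only the hidden state
        return enc_outputs[1]

    def forward(self, X, state):
        X = self.embedding(X).permute(1, 0, 2)
        # Broadcast the top-layer encoder state across all time steps as context
        context = state[-1].repeat(X.shape[0], 1, 1)
        X_and_context = torch.cat((X, context), 2)
        output, state = self.rnn(X_and_context, state)
        output = self.dense(output).permute(1, 0, 2)
        # output: (batch_size, num_steps, vocab_size)
        return output, state
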
5.3 Merging Encoder and Decoder

In summary, the "encoder-decoder" architecture consists of an encoder and a decoder, each of which may take additional arguments. In the forward pass, the output of the encoder is used to produce the encoded state, and the decoder then consumes this state as part of its input.

class EncoderDecoder(nn.Module):
    """The base class for the encoder-decoder architecture."""
    def __init__(self, encoder, decoder, **kwargs):
        super(EncoderDecoder, self).__init__(**kwargs)
        self.encoder = encoder
        self.decoder = decoder

    def forward(self, enc_X, dec_X, *args):
        # enc_X is the input to the encoder
        enc_outputs = self.encoder(enc_X, *args)
        # Feed the encoder's outputs into the decoder's init_state
        # to obtain the decoder's initial state
        dec_state = self.decoder.init_state(enc_outputs, *args)
        # Pass the intermediate state dec_state and the decoder input dec_X to the decoder
        return self.decoder(dec_X, dec_state)

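Putting the two sketched subclasses together with EncoderDecoder, a quick shape check might look like this (all sizes are arbitrary illustrative values):

encoder = Seq2SeqEncoder(vocab_size=10, embed_size=8, num_hiddens=16, num_layers=2)
decoder = Seq2SeqDecoder(vocab_size=10, embed_size=8, num_hiddens=16, num_layers=2)
net = EncoderDecoder(encoder, decoder)

X = torch.zeros((4, 7), dtype=torch.long)   # 4 source sequences, 7 time steps each
Y = torch.zeros((4, 6), dtype=torch.long)   # 4 target sequences, 6 time steps each
output, state = net(X, Y)
print(output.shape)                         # torch.Size([4, 6, 10])
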
The term "state" in the encoder-decoder architecture naturally suggests implementing it with stateful neural networks. In the next section, we will see how to apply recurrent neural networks to design sequence-transduction models based on the encoder-decoder architecture.

Origin: blog.csdn.net/weixin_47505105/article/details/128729706