Netty series: frame decoders in Netty


Introduction

Data in Netty is transmitted through ByteBuf. A single ByteBuf may contain several meaningful units of data, called frames; in other words, one ByteBuf can hold multiple frames.

After receiving a ByteBuf, the receiver has to extract the useful data from it, which means splitting the ByteBuf into frames and parsing each one.

Generally, frames are separated by specific delimiters, and we can use these delimiters to tell the frames apart and parse the data.

Netty provides several ready-made frame decoders that greatly simplify this work. The following figure shows several common frame decoders in Netty:

Next, let's look at how to use each of these frame decoders in detail.

LineBasedFrameDecoder

As its name suggests, LineBasedFrameDecoder splits frames by line. Depending on the operating system, a line ending can take one of two forms: "\n" or "\r\n".

LineBasedFrameDecoder works by reading bytes from the ByteBuf and comparing them with "\n" and "\r\n". This places a requirement on the character encoding, and UTF-8 is generally expected: in UTF-8, "\n" and "\r" are each a single byte, and those byte values never appear inside the encoding of any other character, so it is safe to use "\n" and "\r" to detect line endings.
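The encoding argument above can be checked with a small plain-JDK sketch. The class and method names here (Utf8NewlineCheck, countLineBytes) are made up for illustration and are not part of Netty. It counts the bytes equal to '\n' or '\r' in an encoded string, which shows that multi-byte UTF-8 characters never produce those byte values, while a fixed-width encoding such as UTF-16 can.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not a Netty class.
final class Utf8NewlineCheck {

    /** Counts the bytes equal to '\n' (0x0A) or '\r' (0x0D) in the encoded form of text. */
    static int countLineBytes(String text, Charset charset) {
        int count = 0;
        for (byte b : text.getBytes(charset)) {
            if (b == '\n' || b == '\r') {
                count++;
            }
        }
        return count;
    }

    /** True if scanning the encoded bytes finds exactly the line-ending characters in text. */
    static boolean scanIsSafe(String text, Charset charset) {
        int expected = 0;
        for (char c : text.toCharArray()) {
            if (c == '\n' || c == '\r') {
                expected++;
            }
        }
        return countLineBytes(text, charset) == expected;
    }
}
```

For example, the character U+010A contains no line ending, but its UTF-16BE encoding is the two bytes 0x01 0x0A, so a byte-level scan for '\n' would report a false match there. In UTF-8 the same character is encoded with bytes that are all 0x80 or above.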

LineBasedFrameDecoder has several important properties. One is maxLength, the maximum length allowed for a received frame. If a frame exceeds this limit, a TooLongFrameException is thrown.

Another is stripDelimiter, which determines whether the delimiter is stripped from the decoded frame.

The last is failFast. If it is true, a TooLongFrameException is thrown as soon as the decoder notices that the frame exceeds maxLength, regardless of whether the whole frame has been read. If it is false, the exception is thrown only after the entire over-long frame has been read.

The core logic of LineBasedFrameDecoder is to first find the position of the line separator and then read the frame up to that position. The findEndOfLine method locates the line separator:

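    private int findEndOfLine(final ByteBuf buffer) {
        int totalLength = buffer.readableBytes();
        // Scan for '\n', skipping the first `offset` bytes that an earlier call already checked.
        int i = buffer.forEachByte(buffer.readerIndex() + offset, totalLength - offset, ByteProcessor.FIND_LF);
        if (i >= 0) {
            offset = 0;
            // Treat "\r\n" as a single delimiter by pointing at the '\r'.
            if (i > 0 && buffer.getByte(i - 1) == '\r') {
                i--;
            }
        } else {
            // No line ending yet: remember how far we scanned for the next call.
            offset = totalLength;
        }
        return i;
    }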

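Here, ByteBuf's forEachByte method traverses the buffer with ByteProcessor.FIND_LF, a processor that stops at the first line feed ('\n'). If that byte is preceded by '\r', the returned index is moved back by one so that "\r\n" is treated as a single delimiter.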

Finally, the object decoded by LineBasedFrameDecoder is still a ByteBuf.
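To make the line-splitting idea concrete, here is a minimal sketch in plain JDK Java. It works on a byte array instead of a ByteBuf, and the class and method names (LineFrameSketch, splitLines) are hypothetical. It is not Netty's implementation and it leaves out maxLength and failFast handling. It only mirrors the two ideas described above: find the line ending, then cut the frame with or without the delimiter.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of line-based framing on a byte array; not Netty code.
final class LineFrameSketch {

    /** Returns the index where the line ending starts ('\n', or the '\r' of "\r\n"), or -1. */
    static int findEndOfLine(byte[] data, int from) {
        for (int i = from; i < data.length; i++) {
            if (data[i] == '\n') {
                return (i > from && data[i - 1] == '\r') ? i - 1 : i;
            }
        }
        return -1;
    }

    /** Splits data into complete lines; bytes after the last line ending are left out. */
    static List<String> splitLines(byte[] data, boolean stripDelimiter) {
        List<String> frames = new ArrayList<>();
        int start = 0;
        int eol;
        while ((eol = findEndOfLine(data, start)) >= 0) {
            // The delimiter is two bytes long when eol points at the '\r' of "\r\n".
            int delimiterLength = data[eol] == '\r' ? 2 : 1;
            int end = stripDelimiter ? eol : eol + delimiterLength;
            frames.add(new String(data, start, end - start, StandardCharsets.UTF_8));
            start = eol + delimiterLength;
        }
        return frames;
    }
}
```

With the input "ab\r\ncd\nef", this sketch yields the frames "ab" and "cd" when stripDelimiter is true. The trailing "ef" has no line ending yet, so it is treated as an incomplete frame, much as a real decoder would wait for more bytes to arrive.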

DelimiterBasedFrameDecoder

The LineBasedFrameDecoder described above only works with line delimiters. If frames are separated by some other delimiter, it cannot be used, so Netty provides a more general DelimiterBasedFrameDecoder that lets you specify the delimiter yourself:

public class DelimiterBasedFrameDecoder extends ByteToMessageDecoder {

    public DelimiterBasedFrameDecoder(int maxFrameLength, ByteBuf delimiter) {
        this(maxFrameLength, true, delimiter);
    }
    // ... remaining constructors and decode logic omitted
}


The delimiter passed in is a ByteBuf, so a delimiter may be more than one byte long.

The decoder can also accept several delimiters at once, so DelimiterBasedFrameDecoder stores them in an array of ByteBuf:

    private final ByteBuf[] delimiters;

    // In the constructor, for each delimiter d that is passed in:
    this.delimiters[i] = d.slice(d.readerIndex(), d.readableBytes());

Each entry is a slice of the corresponding delimiter that covers exactly its readable bytes.

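The logic of DelimiterBasedFrameDecoder is much the same as that of LineBasedFrameDecoder: both cut frames out of the buffer by comparing the bytes in it against the delimiter. Because DelimiterBasedFrameDecoder can accept multiple delimiters, it is more widely applicable.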
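The following plain-JDK sketch shows one way to split on several multi-byte delimiters. The names (DelimiterFrameSketch, split, indexOf) are hypothetical and the code works on byte arrays, so it is an illustration of the idea and not Netty's code. When more than one delimiter is present, it picks the one that produces the shortest frame, which is the behavior the Netty documentation describes for DelimiterBasedFrameDecoder. Delimiters are always stripped here, and length limits are ignored.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of delimiter-based framing on a byte array; not Netty code.
final class DelimiterFrameSketch {

    /** Index of the first occurrence of needle in haystack at or after from, or -1. */
    static int indexOf(byte[] haystack, byte[] needle, int from) {
        outer:
        for (int i = from; i <= haystack.length - needle.length; i++) {
            for (int j = 0; j < needle.length; j++) {
                if (haystack[i + j] != needle[j]) {
                    continue outer;
                }
            }
            return i;
        }
        return -1;
    }

    /** Splits data into complete frames, always choosing the nearest delimiter. */
    static List<String> split(byte[] data, byte[]... delimiters) {
        for (byte[] delimiter : delimiters) {
            if (delimiter.length == 0) {
                throw new IllegalArgumentException("empty delimiter");
            }
        }
        List<String> frames = new ArrayList<>();
        int start = 0;
        while (true) {
            int bestIndex = -1;
            byte[] bestDelimiter = null;
            for (byte[] delimiter : delimiters) {
                int index = indexOf(data, delimiter, start);
                if (index >= 0 && (bestIndex < 0 || index < bestIndex)) {
                    bestIndex = index;
                    bestDelimiter = delimiter;
                }
            }
            if (bestDelimiter == null) {
                return frames; // no complete frame left
            }
            frames.add(new String(data, start, bestIndex - start, StandardCharsets.UTF_8));
            start = bestIndex + bestDelimiter.length;
        }
    }
}
```

Splitting "a;b||c;" on the delimiters ";" and "||" gives the frames "a", "b", and "c".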

FixedLengthFrameDecoder

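Besides splitting frames by comparing characters in the ByteBuf, there are other common ways to split frames, such as splitting by a fixed length. Netty provides a decoder for this called FixedLengthFrameDecoder.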

public class FixedLengthFrameDecoder extends ByteToMessageDecoder 

FixedLengthFrameDecoder also extends ByteToMessageDecoder. Its definition is simple: you pass in the length of a frame:

    public FixedLengthFrameDecoder(int frameLength) {
        checkPositive(frameLength, "frameLength");
        this.frameLength = frameLength;
    }

It then calls ByteBuf's readRetainedSlice method to read a fixed-length chunk of data:

in.readRetainedSlice(frameLength)

Finally, the data that was read is returned.
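As a rough illustration of fixed-length framing, the plain-JDK sketch below cuts a byte array into chunks of frameLength bytes. The class name FixedLengthFrameSketch is hypothetical. Unlike readRetainedSlice, which returns a slice sharing the buffer's memory, this sketch copies each frame for simplicity.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of fixed-length framing on a byte array; not Netty code.
final class FixedLengthFrameSketch {

    /** Returns every complete frame of frameLength bytes; a shorter tail is left out. */
    static List<byte[]> split(byte[] data, int frameLength) {
        if (frameLength <= 0) {
            throw new IllegalArgumentException("frameLength must be positive");
        }
        List<byte[]> frames = new ArrayList<>();
        for (int start = 0; start + frameLength <= data.length; start += frameLength) {
            frames.add(Arrays.copyOfRange(data, start, start + frameLength));
        }
        return frames;
    }
}
```

With a frame length of 3, the eight bytes of "ABCDEFGH" produce the frames "ABC" and "DEF". The remaining "GH" is not a full frame, so it would stay in the buffer until more data arrives.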

LengthFieldBasedFrameDecoder

Some frames carry a dedicated length field that states how many bytes of data the frame holds. Such a frame is called a LengthFieldBasedFrame.

Netty also provides a corresponding decoder:

public class LengthFieldBasedFrameDecoder extends ByteToMessageDecoder 

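The reading logic is simple: first read the length, then read the data according to that length. To implement this, LengthFieldBasedFrameDecoder provides four fields: lengthFieldOffset, lengthFieldLength, lengthAdjustment, and initialBytesToStrip.

lengthFieldOffset specifies where the length field starts, lengthFieldLength is the size of the length field in bytes, lengthAdjustment is a compensation value added to the value read from the length field, and initialBytesToStrip is the number of bytes to strip from the beginning of the decoded frame (for example, to drop the length field).

This may sound hard to follow, so let's look at a few examples, starting with the simplest: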

   BEFORE DECODE (14 bytes)         AFTER DECODE (14 bytes)
   +--------+----------------+      +--------+----------------+
   | Length | Actual Content |----->| Length | Actual Content |
   | 0x000C | "HELLO, WORLD" |      | 0x000C | "HELLO, WORLD" |
   +--------+----------------+      +--------+----------------+

The message to be decoded starts with a length field, followed by the actual data. 0x000C is hexadecimal for 12, which is the length of the string "HELLO, WORLD".

Here the values of the four properties are:

   lengthFieldOffset   = 0
   lengthFieldLength   = 2
   lengthAdjustment    = 0
   initialBytesToStrip = 0 

This means the length field starts at offset 0 and occupies two bytes, the length needs no adjustment, and no bytes are stripped from the decoded frame.

Let's look at a more complex example, in which the four property values are as follows:

   lengthFieldOffset   = 1  
   lengthFieldLength   = 2
   lengthAdjustment    = 1  
   initialBytesToStrip = 3  

The corresponding data before and after decoding is as follows:

   BEFORE DECODE (16 bytes)                       AFTER DECODE (13 bytes)
   +------+--------+------+----------------+      +------+----------------+
   | HDR1 | Length | HDR2 | Actual Content |----->| HDR2 | Actual Content |
   | 0xCA | 0x000C | 0xFE | "HELLO, WORLD" |      | 0xFE | "HELLO, WORLD" |
   +------+--------+------+----------------+      +------+----------------+

In this example, the length field starts at byte 1 (byte 0 is HDR1) and occupies 2 bytes. A lengthAdjustment of 1 accounts for the one-byte HDR2 that sits between the length field and the content, so the whole frame is 1 + 2 + 1 + 12 = 16 bytes long and the actual content starts at offset 1 + 2 + 1 = 4. Finally, the first 3 bytes (HDR1 and the length field) are stripped, leaving the 13-byte result.
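The arithmetic in both examples can be reproduced with a small plain-JDK sketch. The class LengthFieldFrameSketch and its methods are hypothetical. The sketch works on a byte array, assumes a big-endian length field of 1 to 4 bytes, and is not Netty's implementation. It computes the frame length as lengthFieldOffset + lengthFieldLength + (value of the length field) + lengthAdjustment, then drops the first initialBytesToStrip bytes.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration of length-field-based framing on a byte array; not Netty code.
final class LengthFieldFrameSketch {

    /** Returns the decoded frame, or null if data does not yet hold a complete frame. */
    static byte[] decode(byte[] data, int lengthFieldOffset, int lengthFieldLength,
                         int lengthAdjustment, int initialBytesToStrip) {
        if (lengthFieldLength < 1 || lengthFieldLength > 4) {
            throw new IllegalArgumentException("this sketch supports 1 to 4 length bytes");
        }
        int lengthFieldEnd = lengthFieldOffset + lengthFieldLength;
        if (data.length < lengthFieldEnd) {
            return null; // the length field itself has not fully arrived
        }
        // Read the length field as an unsigned big-endian integer.
        long length = 0;
        for (int i = lengthFieldOffset; i < lengthFieldEnd; i++) {
            length = (length << 8) | (data[i] & 0xFF);
        }
        long frameLength = lengthFieldEnd + length + lengthAdjustment;
        if (frameLength < lengthFieldEnd || frameLength < initialBytesToStrip) {
            throw new IllegalArgumentException("corrupted frame");
        }
        if (data.length < frameLength) {
            return null; // wait for the rest of the frame
        }
        return Arrays.copyOfRange(data, initialBytesToStrip, (int) frameLength);
    }

    /** Test helper: concatenates header bytes and ASCII content into one array. */
    static byte[] frame(byte[] header, String content) {
        byte[] body = content.getBytes(StandardCharsets.US_ASCII);
        byte[] out = Arrays.copyOf(header, header.length + body.length);
        System.arraycopy(body, 0, out, header.length, body.length);
        return out;
    }
}
```

For the second example, decode(data, 1, 2, 1, 3) reads 0x000C = 12 from bytes 1 and 2, computes a frame length of 3 + 12 + 1 = 16, and returns bytes 3 through 15: HDR2 (0xFE) followed by "HELLO, WORLD", 13 bytes in total.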

Summary

The delimiter-based and length-based frame decoders provided by Netty cover most everyday needs. If you are transmitting more complex objects, you can consider writing custom encoders and decoders; the logic follows the same steps explained above.

This article has been published on www.flydean.com/14-5-netty-…


Origin juejin.im/post/7083838820428316703