Netty series: XML codecs commonly used in Netty

Introduction

Before JSON became widespread, XML was the most commonly used data transmission format. Although XML carries a lot of redundant markup, its structure is simple and clear, so it is still used in many places, and Netty naturally provides support for it.

Netty's support for XML has two aspects. The first is framing: a byte stream that carries multiple encoded XML documents is split into frames, each containing one complete XML document. The second is semantic parsing of the XML inside each frame.

XmlFrameDecoder handles the frame splitting, and XmlDecoder parses the XML content. The sections below explain how both decoders are implemented and used.

XmlFrameDecoder

Because we receive data as a stream, we cannot be sure where one message ends and the next begins. A single well-formed XML document may arrive split across several fragments.

As follows:

   +-------+-----+--------------+
   | <this | IsA | XMLElement/> |
   +-------+-----+--------------+

This is one complete XML element, but it arrived in three fragments, so we need to merge them into a single frame:

   +----------------------+
   | <thisIsAXMLElement/> |
   +----------------------+


Several XML documents may also arrive spread across multiple fragments, as follows:

   +-----+-----+-----------+-----+----------------------------------+
   | <an | Xml | Element/> | <ro | ot><child>content</child></root> |
   +-----+-----+-----------+-----+----------------------------------+

The above data needs to be split into two frames:

   +-----------------+-------------------------------------+
   | <anXmlElement/> | <root><child>content</child></root> |
   +-----------------+-------------------------------------+

The splitting logic is simple: the decoder looks at the positions of the XML delimiters to decide where an element starts and ends. Three delimiter characters matter: '<', '>' and '/'.

In the decode method, only these three characters need to be examined.
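The delimiter idea can be tried out without Netty. The following is a simplified sketch, not Netty's actual decode method: it works on a complete String instead of a streamed ByteBuf, assumes well-formed input, and ignores comments, CDATA, processing instructions, and attribute values that contain '>' or '/'. The class name XmlFrameSplitSketch and the method split are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;

final class XmlFrameSplitSketch {

    /** Returns one string per complete top-level element found in data. */
    static List<String> split(String data) {
        List<String> frames = new ArrayList<>();
        int depth = 0;              // number of elements currently open
        int frameStart = -1;        // index of the '<' that opened the current frame
        boolean closingTag = false; // true while scanning inside "</...>"
        for (int i = 0; i < data.length(); i++) {
            char c = data.charAt(i);
            if (c == '<') {
                if (depth == 0 && frameStart < 0) {
                    frameStart = i;
                }
                closingTag = i + 1 < data.length() && data.charAt(i + 1) == '/';
                if (!closingTag) {
                    depth++;        // a start tag (or self-closing tag) opens an element
                }
            } else if (c == '>') {
                boolean selfClosing = i > 0 && data.charAt(i - 1) == '/';
                if (closingTag || selfClosing) {
                    depth--;        // "</x>" or "<x/>" closes an element
                }
                closingTag = false;
                if (depth == 0 && frameStart >= 0) {
                    frames.add(data.substring(frameStart, i + 1));
                    frameStart = -1;
                }
            }
        }
        return frames;              // an incomplete trailing element is not emitted
    }

    public static void main(String[] args) {
        for (String frame : split("<anXmlElement/><root><child>content</child></root>")) {
            System.out.println(frame);
        }
    }
}
```

Running main prints the two frames from the diagram above on separate lines. A streaming decoder additionally has to remember its position between reads, which this sketch leaves out.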

There is also some additional checking logic, such as whether a byte is a valid start character for an XML element name:

    private static boolean isValidStartCharForXmlElement(final byte b) {
        return b >= 'a' && b <= 'z' || b >= 'A' && b <= 'Z' || b == ':' || b == '_';
    }

Whether the data at a given position starts a comment:

    private static boolean isCommentBlockStart(final ByteBuf in, final int i) {
        return i < in.writerIndex() - 3
                && in.getByte(i + 2) == '-'
                && in.getByte(i + 3) == '-';
    }


Whether it starts a CDATA section:

    private static boolean isCDATABlockStart(final ByteBuf in, final int i) {
        return i < in.writerIndex() - 8
                && in.getByte(i + 2) == '['
                && in.getByte(i + 3) == 'C'
                && in.getByte(i + 4) == 'D'
                && in.getByte(i + 5) == 'A'
                && in.getByte(i + 6) == 'T'
                && in.getByte(i + 7) == 'A'
                && in.getByte(i + 8) == '[';
    }
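To see how these three checks work together, here is a small sketch that applies the same comparisons to a plain byte array instead of a ByteBuf. It is an illustration under assumptions, not Netty code: the class XmlMarkupKindSketch and the method classify are hypothetical, and unlike the helpers above, the comment check here also tests for the '!' at index i + 1 itself.

```java
import java.nio.charset.StandardCharsets;

final class XmlMarkupKindSketch {

    static boolean isValidStartChar(byte b) {
        return b >= 'a' && b <= 'z' || b >= 'A' && b <= 'Z' || b == ':' || b == '_';
    }

    /** True when buf starts with "<!--" at index i (i points at '<'). */
    static boolean isCommentStart(byte[] buf, int i) {
        return i < buf.length - 3
                && buf[i + 1] == '!'
                && buf[i + 2] == '-'
                && buf[i + 3] == '-';
    }

    /** True when buf starts with "<![CDATA[" at index i (i points at '<'). */
    static boolean isCdataStart(byte[] buf, int i) {
        byte[] marker = "<![CDATA[".getBytes(StandardCharsets.US_ASCII);
        if (i > buf.length - marker.length) {
            return false;           // not enough bytes left for the whole marker
        }
        for (int k = 0; k < marker.length; k++) {
            if (buf[i + k] != marker[k]) {
                return false;
            }
        }
        return true;
    }

    /** Classifies the markup that begins at the first byte of text. */
    static String classify(String text) {
        byte[] buf = text.getBytes(StandardCharsets.UTF_8);
        if (buf.length < 2 || buf[0] != '<') {
            return "not markup";
        }
        if (isCommentStart(buf, 0)) {
            return "comment";
        }
        if (isCdataStart(buf, 0)) {
            return "cdata";
        }
        if (buf[1] == '/') {
            return "end tag";
        }
        return isValidStartChar(buf[1]) ? "start tag" : "other";
    }

    public static void main(String[] args) {
        System.out.println(classify("<!-- note -->"));
        System.out.println(classify("<![CDATA[x]]>"));
        System.out.println(classify("<root>"));
    }
}
```

The bounds checks come first in each method so that a marker cut off at the end of the buffer is reported as "not found" rather than causing an out-of-range read.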

Once these checks have determined where the XML data starts and ends, the extractFrame method copies the required bytes out of the original buffer, and the resulting frame is added to out:

    final ByteBuf frame =
            extractFrame(in, readerIndex + leadingWhiteSpaceCount, xmlElementLength - leadingWhiteSpaceCount);
    in.skipBytes(xmlElementLength);
    out.add(frame);
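The index arithmetic in this snippet is easy to misread, so here is a worked sketch of it. It uses java.nio.ByteBuffer as a stand-in for Netty's ByteBuf, with the buffer position playing the role of readerIndex. The class ExtractFrameSketch and the method takeFrame are hypothetical names, and the extractFrame shown here is our own minimal copy routine, not Netty's.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

final class ExtractFrameSketch {

    /** Copies length bytes starting at absolute index, leaving in's position untouched. */
    static byte[] extractFrame(ByteBuffer in, int index, int length) {
        byte[] frame = new byte[length];
        ByteBuffer view = in.duplicate(); // shares the content, has its own position
        view.position(index);
        view.get(frame);
        return frame;
    }

    /** Takes one element off the front of in, dropping its leading whitespace. */
    static String takeFrame(ByteBuffer in, int xmlElementLength, int leadingWhiteSpaceCount) {
        int readerIndex = in.position();
        byte[] frame = extractFrame(in,
                readerIndex + leadingWhiteSpaceCount,       // skip the whitespace
                xmlElementLength - leadingWhiteSpaceCount); // copy only the element
        in.position(readerIndex + xmlElementLength);        // same effect as skipBytes
        return new String(frame, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        // Two leading spaces followed by two elements of four bytes each.
        ByteBuffer in = ByteBuffer.wrap("  <a/><b/>".getBytes(StandardCharsets.US_ASCII));
        // The first element occupies 6 bytes of input, 2 of which are whitespace.
        System.out.println(takeFrame(in, 6, 2));
        System.out.println(in.remaining());
    }
}
```

The frame excludes the leading whitespace, but the skip covers the full xmlElementLength, so the whitespace is consumed from the input and never appears in a later frame.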

XmlDecoder

After splitting the xml data into frames, the next step is to parse the specific data in the xml.

Netty provides XmlDecoder for this purpose. It parses the actual content of a frame that already holds a single XML document. It is declared as follows:

public class XmlDecoder extends ByteToMessageDecoder 

As it reads the XML content, XmlDecoder breaks it into the following parts: XmlElementStart, XmlAttribute, XmlNamespace, XmlElementEnd, XmlProcessingInstruction, XmlCharacters, XmlComment, XmlSpace, XmlDocumentStart, XmlEntityReference, XmlDTD and XmlCdata.

These types cover essentially every construct that can appear in an XML document.

All these elements are defined in the io.netty.handler.codec.xml package.

However, XmlDecoder does not read and parse the XML itself. It borrows a third-party XML toolkit: the Aalto XML parser from FasterXML.

XmlDecoder uses Aalto's AsyncXMLStreamReader and AsyncByteArrayFeeder to parse the XML data.

These two fields, together with the factory that creates the reader, are defined as follows:

    private static final AsyncXMLInputFactory XML_INPUT_FACTORY = new InputFactoryImpl();
    private final AsyncXMLStreamReader<AsyncByteArrayFeeder> streamReader;
    private final AsyncByteArrayFeeder streamFeeder;

    // Assigned when the decoder is constructed:
    this.streamReader = XML_INPUT_FACTORY.createAsyncForByteArray();
    this.streamFeeder = (AsyncByteArrayFeeder)this.streamReader.getInputFeeder();

The decode logic checks the type of each XML event it reads, wraps the data in the matching XML object from the list above, and adds that object to the out list.
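The shape of this event-driven loop can be sketched with the StAX reader that ships with the JDK. This is an illustration only: it uses the blocking javax.xml.stream.XMLStreamReader rather than Aalto's asynchronous reader, it produces plain strings named after the Netty types instead of real Netty objects, and it handles just four event types. The class XmlEventSketch and the method describe are hypothetical names.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

final class XmlEventSketch {

    /** Turns each parser event into a label named after the matching Netty type. */
    static List<String> describe(String xml) {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // Report adjacent text as a single CHARACTERS event.
        factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.TRUE);
        List<String> out = new ArrayList<>();
        try {
            XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                switch (reader.next()) { // dispatch on the event type
                    case XMLStreamConstants.START_ELEMENT:
                        out.add("XmlElementStart(" + reader.getLocalName() + ")");
                        break;
                    case XMLStreamConstants.END_ELEMENT:
                        out.add("XmlElementEnd(" + reader.getLocalName() + ")");
                        break;
                    case XMLStreamConstants.CHARACTERS:
                        out.add("XmlCharacters(" + reader.getText() + ")");
                        break;
                    case XMLStreamConstants.COMMENT:
                        out.add("XmlComment(" + reader.getText() + ")");
                        break;
                    default:
                        break;           // other event types are ignored in this sketch
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("input is not well-formed XML", e);
        }
        return out;
    }

    public static void main(String[] args) {
        describe("<root><child>content</child></root>").forEach(System.out::println);
    }
}
```

The main difference in XmlDecoder is that the bytes arrive in pieces: each ByteBuf is handed to the feeder, and the reader is drained until it reports that it needs more input.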

Summary

With XmlFrameDecoder and XmlDecoder, parsing XML data becomes very convenient. Netty has already built the wheel for us, so we don't need to reinvent it.

This article has been published on www.flydean.com/14-7-netty-…

Origin juejin.im/post/7086726582907174919