Netty Core Concepts (10): Memory Management

1 Introduction

 The previous chapters analyzed everything that can be seen in the startup demo, and Section 8 closed with an overall picture of Netty's thread model. Those chapters explain why Netty is an asynchronous, event-driven framework and why its threading model is efficient, but one topic they did not touch is how a Handler parses data. From the earlier material we know that a Handler parses received data according to the relevant protocol, and that Java NIO reads and writes through buffers. This is where another source of Netty's efficiency comes in, one not covered so far: memory management, which happens when a handler reads data.

 What approach does Netty use to manage memory, and why is memory management needed at all? The first question is the main subject of this section; the second is answered here. IO operations involve frequent memory allocation and destruction. If that memory were allocated on the heap, GC would run very often, at a great cost to performance. Netty therefore uses the off-heap (direct) memory the JDK exposes through NIO: the program operates directly on memory outside the JVM's jurisdiction, but it must manage creation and destruction itself. Using off-heap memory avoids frequent GC, but it brings another problem: allocating off-heap memory is slow, so allocating it frequently would be even worse. For these reasons Netty designed an off-heap memory pool: it requests one large block of memory up front and provides a management interface over it, so the application layer can obtain buffers without worrying about the underlying memory operations. Netty has not abandoned heap allocation entirely, though, and provides corresponding interfaces for it.

 To understand how Netty manages memory, a few pieces need to be understood first.

    final ByteBufAllocator allocator = config.getAllocator();
    final RecvByteBufAllocator.Handle allocHandle = recvBufAllocHandle();
    byteBuf = allocHandle.allocate(allocator);
    allocHandle.lastBytesRead(doReadBytes(byteBuf));

 This is a snippet from the read method of NioByteUnsafe, and it shows the basic flow of a read: get a ByteBufAllocator, get a RecvByteBufAllocator.Handle, use the two together to obtain a ByteBuf, and finally write the incoming data into that ByteBuf. This one step introduces all the concepts involved in reading data. The allocator here is the allocator field of DefaultChannelConfig, which is ultimately chosen in a static initializer in ByteBufUtil. It decides which implementation to use according to the io.netty.allocator.type system property; there are two: UnpooledByteBufAllocator and PooledByteBufAllocator, that is, without or with a memory pool. The Android platform defaults to unpooled, all other platforms to pooled. NIO's default RecvByteBufAllocator is AdaptiveRecvByteBufAllocator, set in the constructor of DefaultChannelConfig. With that in mind, let's look at the architecture of these three classes one by one.
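 For reference, the selection logic can be condensed roughly as follows. This is a sketch of what ByteBufUtil's static initializer does in Netty 4.1, not the verbatim source:

    // condensed sketch of how the default allocator is chosen (based on Netty 4.1)
    String allocType = SystemPropertyUtil.get(
            "io.netty.allocator.type",
            PlatformDependent.isAndroid() ? "unpooled" : "pooled");

    ByteBufAllocator alloc = "unpooled".equals(allocType)
            ? UnpooledByteBufAllocator.DEFAULT
            : PooledByteBufAllocator.DEFAULT;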

2 ByteBufAllocator

 Instances of this interface are expected to be used in a thread-safe manner. The methods it defines fall into the following groups:

    buffer(): allocates a ByteBuf; whether it is direct (off-heap) or heap depends on the concrete implementation.

    ioBuffer(): allocates a ByteBuf, preferably a direct buffer, which suits IO operations best.

    heapBuffer(): allocates a heap buffer.

    directBuffer(): allocates a direct buffer.

    compositeBuffer(): allocates a composite buffer; direct or heap depends on the implementation.

    compositeHeapBuffer(): allocates a composite heap buffer.

    compositeDirectBuffer(): allocates a composite direct buffer.

    isDirectBufferPooled(): whether direct buffers are pooled.

    calculateNewCapacity(): calculates the capacity to use when a ByteBuf needs to grow.

 From the interface methods we can clearly see that ByteBuf comes in two flavors, direct and heap, and since each can be pooled or unpooled, the Cartesian product gives four kinds. Composite buffers are explained later, under ByteBuf.
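 As a quick illustration, here is a minimal usage sketch of the allocator interface (assuming Netty 4.1; PooledByteBufAllocator.DEFAULT is the shared pooled instance):

    import io.netty.buffer.ByteBuf;
    import io.netty.buffer.ByteBufAllocator;
    import io.netty.buffer.PooledByteBufAllocator;

    public class AllocatorDemo {
        public static void main(String[] args) {
            ByteBufAllocator alloc = PooledByteBufAllocator.DEFAULT;

            ByteBuf heap = alloc.heapBuffer(256);     // backed by a byte[]
            ByteBuf direct = alloc.directBuffer(256); // backed by off-heap memory
            ByteBuf io = alloc.ioBuffer();            // direct if the platform supports it

            System.out.println(direct.isDirect());            // true
            System.out.println(alloc.isDirectBufferPooled()); // true for the pooled allocator

            // pooled buffers must be released so the memory returns to the pool
            heap.release();
            direct.release();
            io.release();
        }
    }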

2.1 AbstractByteBufAllocator

 The abstract parent class defines some defaults: the initial buffer capacity is 256, the maximum capacity is Integer.MAX_VALUE, the maximum number of components (for composite buffers) is 16, and the growth threshold is 1048576 * 4 (4 MiB).

 The abstract parent class provides a static method, toLeakAwareBuffer, which wraps a ByteBuf so that memory leaks can be detected. It also resolves the ambiguity in the interface methods above (direct or heap?) through the constructor parameter preferDirect: if it is true and the platform supports it, direct memory is used, otherwise heap. The ioBuffer method only requires platform support and prefers direct, consistent with its description; if the platform does not support direct memory, heap is used. The compositeBuffer methods create a CompositeByteBuf directly and return it wrapped by toLeakAwareBuffer. The logic of calculateNewCapacity is: if minNewCapacity equals the 4 MiB threshold, return it as-is; if it is greater, grow in steps of the threshold rather than doubling; if it is smaller, double from 64 until minNewCapacity is covered.
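 That growth logic can be sketched as follows (a simplified reconstruction based on Netty 4.1's AbstractByteBufAllocator, with the argument checks omitted):

    // simplified sketch of calculateNewCapacity (based on Netty 4.1)
    int calculateNewCapacity(int minNewCapacity, int maxCapacity) {
        final int threshold = 1048576 * 4; // 4 MiB growth threshold
        if (minNewCapacity == threshold) {
            return threshold;
        }
        if (minNewCapacity > threshold) {
            // above the threshold, grow in 4 MiB steps instead of doubling,
            // so a large buffer never wastes more than 4 MiB
            int newCapacity = minNewCapacity / threshold * threshold;
            if (newCapacity > maxCapacity - threshold) {
                newCapacity = maxCapacity;
            } else {
                newCapacity += threshold;
            }
            return newCapacity;
        }
        // below the threshold, double from 64 until minNewCapacity is covered
        int newCapacity = 64;
        while (newCapacity < minNewCapacity) {
            newCapacity <<= 1;
        }
        return Math.min(newCapacity, maxCapacity);
    }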

 Finally, the abstract parent class implements all of this common plumbing, leaving only newHeapBuffer and newDirectBuffer for subclasses to implement.

2.2 PooledByteBufAllocator

 This class passes preferDirect = false to its constructor by default, so it leans toward heap memory; which kind you get still depends on which method you call, of course. The default pageSize is 8192, maxOrder is 11, chunkSize is 8192 << 11 (16 MiB), tinyCacheSize is 512, smallCacheSize is 256, and normalCacheSize is 64. It is normal to have doubts about these parameters; for details, read the Netty memory management article: here. A quick read will make their meaning clear.
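 The relationship between the first three parameters is worth spelling out; the property names in the comments are the Netty system properties these defaults correspond to:

    int pageSize = 8192;                  // io.netty.allocator.pageSize: smallest pool page
    int maxOrder = 11;                    // io.netty.allocator.maxOrder: depth of a chunk's page tree
    int chunkSize = pageSize << maxOrder; // 8192 << 11 = 16_777_216 bytes = 16 MiB per PoolChunk

 Now for the two methods the parent class left abstract: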

    @Override
    protected ByteBuf newHeapBuffer(int initialCapacity, int maxCapacity) {
        // per-thread cache, so no lock is needed on the hot path
        PoolThreadCache cache = threadCache.get();
        PoolArena<byte[]> heapArena = cache.heapArena;

        final ByteBuf buf;
        if (heapArena != null) {
            // allocate from this thread's heap arena
            buf = heapArena.allocate(cache, initialCapacity, maxCapacity);
        } else {
            // no arena available: fall back to an unpooled heap buffer
            buf = PlatformDependent.hasUnsafe() ?
                    new UnpooledUnsafeHeapByteBuf(this, initialCapacity, maxCapacity) :
                    new UnpooledHeapByteBuf(this, initialCapacity, maxCapacity);
        }

        // wrap for leak detection
        return toLeakAwareBuffer(buf);
    }


    @Override
    protected ByteBuf newDirectBuffer(int initialCapacity, int maxCapacity) {
        PoolThreadCache cache = threadCache.get();
        PoolArena<ByteBuffer> directArena = cache.directArena;

        final ByteBuf buf;
        if (directArena != null) {
            // allocate from this thread's direct arena
            buf = directArena.allocate(cache, initialCapacity, maxCapacity);
        } else {
            // no arena available: fall back to an unpooled direct buffer
            buf = PlatformDependent.hasUnsafe() ?
                    UnsafeByteBufUtil.newUnsafeDirectByteBuf(this, initialCapacity, maxCapacity) :
                    new UnpooledDirectByteBuf(this, initialCapacity, maxCapacity);
        }

        // wrap for leak detection
        return toLeakAwareBuffer(buf);
    }

 The above shows how PooledByteBufAllocator fills in what the abstract parent class left unimplemented. The steps are: 1. get the threadCache, a thread-local cache; 2. get the corresponding PoolArena — byte[] for heap, ByteBuffer for direct; 3. allocate from the arena if one exists, otherwise create an unpooled buffer; 4. finally, wrap the result with toLeakAwareBuffer for memory-leak detection.

 The threadCache shows that this cache is bound to a thread, which is why the interface documentation above says thread safety must be kept in mind when using it.

2.3 UnpooledByteBufAllocator

 Since no pooling is involved there is not much to explain, and this class is not the focus of this chapter.

    @Override
    protected ByteBuf newHeapBuffer(int initialCapacity, int maxCapacity) {
        // pick the Unsafe-backed variant when the platform allows it
        return PlatformDependent.hasUnsafe() ?
                new InstrumentedUnpooledUnsafeHeapByteBuf(this, initialCapacity, maxCapacity) :
                new InstrumentedUnpooledHeapByteBuf(this, initialCapacity, maxCapacity);
    }

    @Override
    protected ByteBuf newDirectBuffer(int initialCapacity, int maxCapacity) {
        final ByteBuf buf;
        if (PlatformDependent.hasUnsafe()) {
            buf = noCleaner ? new InstrumentedUnpooledUnsafeNoCleanerDirectByteBuf(this, initialCapacity, maxCapacity) :
                    new InstrumentedUnpooledUnsafeDirectByteBuf(this, initialCapacity, maxCapacity);
        } else {
            buf = new InstrumentedUnpooledDirectByteBuf(this, initialCapacity, maxCapacity);
        }
        return disableLeakDetector ? buf : toLeakAwareBuffer(buf);
    }

 Everything is created and returned directly; there is nothing special to note.

3 RecvByteBufAllocator

 This interface defines a single factory method, newHandle(), which produces a Handle.

 Handle has since been superseded by ExtendedHandle, which directly extends it. allocate performs the allocation, and the rest of the Handle methods manage how allocation sizes are chosen (a condensed read-loop sketch follows the list):

  allocate(): allocates a ByteBuf large enough to hold all the inbound data but small enough not to waste too much space.

  guess(): guesses how much space allocate should provide.

  reset(): resets the accumulated counts and suggests how many bytes or messages the next read loop should read. continueReading() should be used to decide whether the read loop has ended.

  incMessagesRead(): increments the count of messages read in this read loop.

  lastBytesRead(): records the number of bytes obtained by the last read operation.

  attemptedBytesRead(): sets the number of bytes a read operation should attempt to read.

  continueReading(): decides whether the current read loop should continue.

  readComplete(): signals that the read operation has completed.
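 To see how these methods cooperate, here is a condensed sketch of the read loop from NioByteUnsafe quoted at the start of this section (simplified from Netty 4.1; close handling and error paths omitted):

    // condensed sketch of AbstractNioByteChannel.NioByteUnsafe.read (Netty 4.1, simplified)
    allocHandle.reset(config);
    do {
        byteBuf = allocHandle.allocate(allocator);        // size comes from guess()
        allocHandle.lastBytesRead(doReadBytes(byteBuf));  // record what was actually read
        if (allocHandle.lastBytesRead() <= 0) {
            byteBuf.release();                            // nothing read: return the buffer
            break;
        }
        allocHandle.incMessagesRead(1);
        pipeline.fireChannelRead(byteBuf);                // hand the data to the handlers
    } while (allocHandle.continueReading());
    allocHandle.readComplete();                           // let the handle adjust its sizing
    pipeline.fireChannelReadComplete();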

3.1 AdaptiveRecvByteBufAllocator

  This class also defines some defaults: the minimum buffer size is 64, the maximum is 65536, the initial size is 1024, the index increment on growth is 4, and the decrement on shrink is 1.

    static {
        List<Integer> sizeTable = new ArrayList<Integer>();
        // fine granularity below 512: steps of 16
        for (int i = 16; i < 512; i += 16) {
            sizeTable.add(i);
        }

        // from 512 on, double each time; the loop ends when the int overflows to a negative value
        for (int i = 512; i > 0; i <<= 1) {
            sizeTable.add(i);
        }

        SIZE_TABLE = new int[sizeTable.size()];
        for (int i = 0; i < SIZE_TABLE.length; i ++) {
            SIZE_TABLE[i] = sizeTable.get(i);
        }
    }

 This static block initializes SIZE_TABLE, an int[] field. From the logic we can see which values it stores: 16, 32, 48, 64 ... 496, then 512, 1024, 2048 ... up to 2^30, the last doubling before int overflow. What do these values mean? Look at the method that uses the table:

    // binary search: find the index of the smallest table entry that covers `size`
    private static int getSizeTableIndex(final int size) {
        for (int low = 0, high = SIZE_TABLE.length - 1;;) {
            if (high < low) {
                return low;
            }
            if (high == low) {
                return high;
            }

            int mid = low + high >>> 1;
            int a = SIZE_TABLE[mid];
            int b = SIZE_TABLE[mid + 1];
            if (size > b) {
                low = mid + 1;
            } else if (size < a) {
                high = mid - 1;
            } else if (size == a) {
                return mid;
            } else {
                return mid + 1;
            }
        }
    }

 As the name suggests, this method finds a subscript into the table, using a standard binary search. Given a size: if it is larger than the entry one past the midpoint, low is moved up and the search continues; if it is smaller than the midpoint entry, high is moved down; if it equals the midpoint entry, that index is returned; and if it falls between the midpoint entry and the next one, the index of the next (larger) entry is returned.

 This is easier to understand in light of the allocate method defined by the interface above: the goal is to read all available content without wasting too much space. The table is a set of allocation sizes designed by the developers, and getSizeTableIndex picks the table entry that suits a given size. That is why, when the size falls between two entries, the larger one is returned: it guarantees the space is sufficient while keeping the waste minimal. For example, a size of 100 falls between 96 and 112, so the index of 112 is returned. As for why the table grows in steps of 16 up to 512 and doubles afterwards: small reads need fine-grained sizing, while doubling keeps the table short for large reads; the cut-off is an estimate based on practical workloads.

3.2 HandleImpl

 This is the Handle provided by AdaptiveRecvByteBufAllocator. Going back to the opening code: the handle of the RecvByteBufAllocator drives the allocator and finally produces the read buffer. Of the 11 methods defined by ExtendedHandle, only 4 are implemented here; the rest come from the parent class MaxMessageHandle. First, the relevant fields of MaxMessageHandle:

  maxMessagePerRead: the maximum number of messages to read in one read loop

  totalMessages: the total number of messages read

  totalBytesRead: the total number of bytes read (as mentioned earlier, counting is split into messages and bytes)

  attemptedBytesRead: the number of bytes a read should attempt

  lastBytesRead: the number of bytes obtained by the last read

 Most of the methods simply return these fields; the rest are not worth dwelling on here and are best analyzed in actual use. The allocate method just calls alloc.ioBuffer(guess()), meaning direct memory is used whenever possible.

 HandleImpl itself does not contain much. The hardest part to understand is exactly the adaptive sizing above: using the size table and the index steps to settle on a buffer of suitable size.
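 The adaptive part lives in a private record method, which can be sketched roughly as follows (a reconstruction in the spirit of Netty 4.1's HandleImpl; the field names follow the source, but this is not the verbatim code):

    // sketch of HandleImpl's adjustment logic (modeled on Netty 4.1)
    private void record(int actualReadBytes) {
        // two consecutive reads below the next-smaller table entry -> shrink by INDEX_DECREMENT
        if (actualReadBytes <= SIZE_TABLE[Math.max(0, index - INDEX_DECREMENT - 1)]) {
            if (decreaseNow) {
                index = Math.max(index - INDEX_DECREMENT, minIndex);
                nextReceiveBufferSize = SIZE_TABLE[index];
                decreaseNow = false;
            } else {
                decreaseNow = true;
            }
        // the last read filled the whole buffer -> grow immediately by INDEX_INCREMENT
        } else if (actualReadBytes >= nextReceiveBufferSize) {
            index = Math.min(index + INDEX_INCREMENT, maxIndex);
            nextReceiveBufferSize = SIZE_TABLE[index];
            decreaseNow = false;
        }
    }

 Shrinking only after two small reads in a row avoids overreacting to a single quiet moment, while growth happens immediately because a full buffer suggests more data is waiting.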

 To summarize: RecvByteBufAllocator has exactly one purpose, opening up a receive buffer of a suitable size.

4 ByteBuf

 This is at once the most important concept and the part treated most lightly here. The ByteBuf abstract class implements ReferenceCounted, which is a key point of Netty's direct-memory design. As mentioned above, heap mode suffers from frequent GC, while frequently creating and destroying direct memory performs even worse, so pooling is used to manage direct memory. A pool, however, must track which regions are in use and which are free, and a region must be returned once its user is done with it. Netty adopts a classic GC strategy for this: reference counting. Whenever something takes a reference to a buffer the count goes up by 1; on release it goes down by 1; when it reaches 0 the buffer is no longer in use and its region can be reclaimed. That is what ReferenceCounted does (a short example follows the list):

  refCnt(): the current count; 0 means the buffer has been deallocated

  retain(): increments the reference count by 1, or by a given amount

  touch(): used for debugging; records where the object was last accessed, so ResourceLeakDetector can report it

  release(): decrements the reference count by 1, or by a given amount
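 In practice the counting looks like this (a minimal sketch, assuming the pooled default allocator):

    ByteBuf buf = PooledByteBufAllocator.DEFAULT.directBuffer(256);
    assert buf.refCnt() == 1;      // a freshly allocated buffer starts at 1
    buf.retain();                  // 2: a second party now holds a reference
    buf.release();                 // back to 1
    boolean freed = buf.release(); // 0: the memory goes back to the pool
    assert freed;                  // release() returns true when the count hits 0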

 The abstract class ByteBuf has a great many methods. Rather than reproduce them all, here is a brief description (there are too many, and most can be understood from their names alone):

  capacity(): the current capacity; with an argument, truncates the buffer if the value is smaller than the current capacity, or expands it to that value if larger.

  maxCapacity(): the maximum capacity allowed.

  alloc(): returns the allocator that created this buffer.

  order(): sets the byte order used to interpret the bytes — put another way, big-endian vs. little-endian; for details see the wiki: here. This method is deprecated; later code uses getShort and getShortLE directly to distinguish the two.

  unwrap(): if this is a wrapper class, returns the wrapped buffer.

  isDirect(): whether this is direct memory.

  isReadOnly(): whether the buffer is read-only.

  asReadOnly(): returns a read-only view of this ByteBuf.

  readerIndex(): returns the reader index; with an argument, sets it.

  writerIndex(): returns the writer index; with an argument, sets it.

 The remaining methods cover: readable/writable checks and the corresponding byte counts; clearing; marking and resetting the read position, and likewise for writes; discarding already-read bytes; reading and writing primitive types such as int and short, each with the LE (little-endian) variant mentioned above; skipping bytes; and copying the buffer, returning readable slices, duplicating, exposing the memory address, and a whole family of similar methods.
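 The reader and writer indices are the heart of the API, so a tiny example is worth a paragraph of method names (a minimal sketch using the Unpooled helper):

    ByteBuf buf = Unpooled.buffer(16); // readerIndex = writerIndex = 0
    buf.writeInt(42);                  // writerIndex -> 4
    buf.writeShort(7);                 // writerIndex -> 6
    int i = buf.readInt();             // 42, readerIndex -> 4
    short s = buf.readShort();         // 7, readerIndex -> 6
    assert !buf.isReadable();          // everything written has been consumed
    buf.release();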

 The methods ByteBuf defines do not actually differ much from Java's own ByteBuffer; consistency is deliberately maintained. Below we introduce just two kinds of ByteBuf.

4.1 PooledUnsafeDirectByteBuf

 This buf is likely the one we most want to study; let's walk down through its operations layer by layer, starting from the top-level abstract parent class.

  1. AbstractByteBuf:

  This abstract class takes care of the basic methods: the readable/writable checks, clear, marking read and write positions, and discarding read bytes. The other methods are generally implemented by subclasses; the parent class only enforces safety, such as checking that 4 readable bytes exist before a getInt.

  2. AbstractReferenceCountedByteBuf:

  The main job of this abstract parent is the reference counting discussed earlier, updating the count through CAS operations and implementing the related methods.

  3. PooledByteBuf: this layer holds the operations that touch the memory pool.

  This class implements the basic capacity, slice, duplicate, and deallocate methods, essentially by manipulating its pool bookkeeping fields (the chunk the buffer lives in, and its offset and length within that chunk).

 Last comes the PooledUnsafeDirectByteBuf class we set out to discuss, which implements its methods through the UnsafeByteBufUtil helper class. These methods are really not complicated, because the complicated part — what happens when the pool allocates the memory — lives elsewhere; merely using pooled memory is simple. Two further topics are also skipped here: the leakDetector and the recycler. Neither is introduced in this chapter.

4.2 CompositeByteBuf

 This class was mentioned earlier but never explained: what does a composite ByteBuf mean? It is in fact tied to zero copy in Netty. This is not the same concept as the operating-system-level zero copy mentioned in the first section: Netty's zero copy refers to avoiding memory copies in user space, that is, reducing the number of times data is copied from one user-space location to another. For example, a FileRegion can transfer a file straight to a Channel through its transferTo method, instead of pulling the data through a read loop and writing it out again. Another example: a message often consists of several parts, say a request header and a request body. Merging them into one buffer would mean copying data in user mode; the alternative is CompositeByteBuf, which combines ByteBufs — components are added to the composite buffer, and the data is exposed transparently to the layers above.
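 As a side note, the FileRegion case looks roughly like this (a hedged sketch; `channel` stands for any active Netty Channel, and the file name is invented for the example):

    // hypothetical example: sending a file without copying it through user space
    RandomAccessFile raf = new RandomAccessFile("data.bin", "r");
    FileRegion region = new DefaultFileRegion(raf.getChannel(), 0, raf.length());
    channel.writeAndFlush(region); // Netty ends up calling FileChannel.transferTo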

 components is an ArrayList whose elements wrap the individual ByteBufs; maxNumComponents is the maximum number of components, 16 by default.

    private int addComponent0(boolean increaseWriterIndex, int cIndex, ByteBuf buffer) {
        assert buffer != null;
        boolean wasAdded = false;
        try {
            checkComponentIndex(cIndex);

            int readableBytes = buffer.readableBytes();

            // No need to consolidate - just add a component to the list.
            @SuppressWarnings("deprecation")
            Component c = new Component(buffer.order(ByteOrder.BIG_ENDIAN).slice());
            if (cIndex == components.size()) {
                wasAdded = components.add(c);
                if (cIndex == 0) {
                    c.endOffset = readableBytes;
                } else {
                    Component prev = components.get(cIndex - 1);
                    c.offset = prev.endOffset;
                    c.endOffset = c.offset + readableBytes;
                }
            } else {
                components.add(cIndex, c);
                wasAdded = true;
                if (readableBytes != 0) {
                    updateComponentOffsets(cIndex);
                }
            }
            if (increaseWriterIndex) {
                writerIndex(writerIndex() + buffer.readableBytes());
            }
            return cIndex;
        } finally {
            if (!wasAdded) {
                buffer.release();
            }
        }
    }    

    // binary search over the component offset ranges to find the one containing `offset`
    private Component findComponent(int offset) {
        checkIndex(offset);

        for (int low = 0, high = components.size(); low <= high;) {
            int mid = low + high >>> 1;
            Component c = components.get(mid);
            if (offset >= c.endOffset) {
                low = mid + 1;
            } else if (offset < c.offset) {
                high = mid - 1;
            } else {
                assert c.length != 0;
                return c;
            }
        }

        throw new Error("should not reach here");
    }

 These two methods — adding a component and locating the component that holds a given offset — are all it takes to combine multiple ByteBufs behind one transparent interface, with no new space opened up and no data copied.
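 A minimal usage sketch (assuming Netty 4.1; the header/body contents are invented for the example):

    import io.netty.buffer.ByteBuf;
    import io.netty.buffer.CompositeByteBuf;
    import io.netty.buffer.Unpooled;
    import io.netty.util.CharsetUtil;

    public class CompositeDemo {
        public static void main(String[] args) {
            ByteBuf header = Unpooled.copiedBuffer("HEADER", CharsetUtil.UTF_8);
            ByteBuf body = Unpooled.copiedBuffer("BODY", CharsetUtil.UTF_8);

            CompositeByteBuf message = Unpooled.compositeBuffer();
            // true = advance the writer index so the composite is readable right away
            message.addComponents(true, header, body);

            // reads span both components transparently; no bytes were copied
            System.out.println(message.toString(CharsetUtil.UTF_8)); // HEADERBODY
            message.release(); // also releases header and body
        }
    }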

5 Postscript

 Although this section is titled memory management, it covers only a little of the core content, and the algorithms involved are rather intricate. Still, it has broadly introduced Netty's memory design ideas and the classes that realize them. To summarize:

    1. Netty has roughly four memory modes: pooled heap, pooled direct, unpooled heap, and unpooled direct.

    2. The general difference between pooled and unpooled allocation.

    3. ByteBuf manages its allocated memory through reference counting.

    4. CompositeByteBuf, one of Netty's zero-copy techniques, combines multiple ByteBufs behind a single interface.

 The implementation of the pool itself — the most important part — has not been explained, nor have memory-leak detection, the recycler, and related topics; those are left for the reader to work out. Attached is an introduction to the algorithm from the other blog mentioned above: here, as a supplement to this chapter. Also attached is background on Netty's zero copy: here.

 Finally, an example of someone using Netty for optimization: here. Combined with the earlier chapters, it should be understandable. This series essentially ends with this section; other topics will not be covered in detail, though some concrete demos may be added later.
