A Primer on Resizable Arrays in Java

From: ImportNew / Qinyou Hua | Editor: Lele
link: tutorials.jenkov.com/java-performance/resizable-array.html

Sometimes you want to keep data in a single, contiguous array for fast and easy access, but you also need the array to be resizable or expandable. Java arrays cannot be resized, so arrays alone are not enough. Resizable arrays of primitive types have to be implemented yourself. This article shows how to implement a resizable array in Java.

Why not ArrayList?

Why not simply use Java's ArrayList? You can use an ArrayList if either of the following conditions is met:

  • You are storing objects in the array;

  • You are storing primitive types in the array and have no particular performance or memory requirements.

The Java ArrayList class works only with objects, not with primitive types (byte, int, long, etc.). If you store primitive values in an ArrayList, they are automatically boxed into objects on insertion and unboxed back to primitives when you read them. Boxing and unboxing add overhead to every insert and access, which you want to avoid when optimizing for performance (this article is part of a series on Java performance).

Moreover, you have no control over where the boxed objects are stored in memory. They may be scattered all over the heap. Accessing them is therefore much slower than accessing an array of primitives, which is stored contiguously in memory.

In addition, boxing adds memory overhead: for example, each long is stored as a Long object.
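To make the boxing point concrete, here is a minimal sketch that stores the same values once in an ArrayList of Long and once in a long[]. The class and method names (BoxingDemo, sumBoxed, sumPrimitive) are invented for this illustration and are not part of the article's code.

```java
import java.util.ArrayList;
import java.util.List;

class BoxingDemo {
    // Sums values stored as Long objects; every element is unboxed on read.
    static long sumBoxed(List<Long> values) {
        long sum = 0;
        for (Long value : values) {
            sum += value; // auto-unboxing, equivalent to value.longValue()
        }
        return sum;
    }

    // Sums values stored contiguously as primitives; no objects are involved.
    static long sumPrimitive(long[] values) {
        long sum = 0;
        for (long value : values) {
            sum += value;
        }
        return sum;
    }

    public static void main(String[] args) {
        int n = 1_000;
        List<Long> boxed = new ArrayList<>();
        long[] primitive = new long[n];
        for (int i = 0; i < n; i++) {
            boxed.add((long) i); // autoboxing, equivalent to Long.valueOf(i)
            primitive[i] = i;
        }
        System.out.println(sumBoxed(boxed) == sumPrimitive(primitive));
    }
}
```

Both methods compute the same result; the difference is that the list holds references to separate Long objects, while the array holds the values themselves.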

Source code

The Java source code for the resizable array implementation described here can be downloaded from GitHub: github.com/jjenkov/java-resizable-array

The code consists of three Java classes and two unit tests.

A use case for resizable arrays

Suppose a server receives messages of varying sizes. Some messages are very small (less than 4KB), others are large (1MB or more).

If the server receives messages over many connections (more than 100,000) at the same time, it must limit how much memory it pre-allocates for each message. It cannot simply allocate the maximum message size (1MB or 16MB) for every buffer; as the number of connections and messages grows, that would quickly exhaust the server's memory: 100_000 x 1MB = 100GB (an approximation, to illustrate the problem).

Assuming most messages are small, you can start with a small buffer. If a message outgrows that buffer, allocate a new, larger array and copy the data into it. If the message outgrows the new array too, allocate an even larger array and copy the message into that.
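The grow-by-copy strategy can be sketched with a stand-alone byte array. This is only an illustration: the class GrowableByteArray is hypothetical, and the doubling growth policy is an assumption chosen for the example, not something the article prescribes.

```java
import java.util.Arrays;

class GrowableByteArray {
    private byte[] data;
    private int length = 0; // number of bytes written so far

    GrowableByteArray(int initialCapacity) {
        data = new byte[initialCapacity];
    }

    // Appends src; if it does not fit, allocates a larger array and copies.
    void write(byte[] src) {
        int required = length + src.length;
        if (required > data.length) {
            // Assumed policy: at least double, or more if the message needs it.
            int newCapacity = Math.max(required, data.length * 2);
            data = Arrays.copyOf(data, newCapacity);
        }
        System.arraycopy(src, 0, data, length, src.length);
        length += src.length;
    }

    int capacity() { return data.length; }
    int length()   { return length; }
    byte get(int index) { return data[index]; }
}
```

A message that stays under the initial capacity never triggers a copy; only messages that outgrow their buffer pay for the reallocation.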

With this strategy, most messages occupy only a small array, so server memory is used far more efficiently. 100_000 x 4KB (small buffer) = 400MB, which most servers can handle. Even 4GB (1_000_000 x 4KB) is within reach of today's servers.
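The estimates above are simple multiplications, and a few lines of code make it easy to try other connection counts and buffer sizes. The helper below is a hypothetical illustration; it uses binary units (1KB = 1024 bytes), so the exact results come out slightly below the rounded figures in the text.

```java
class BufferBudget {
    static final long KB = 1024L;
    static final long MB = 1024L * KB;

    // Total bytes needed when every connection pre-allocates bufferSize bytes.
    static long totalBytes(long connections, long bufferSize) {
        return connections * bufferSize;
    }

    public static void main(String[] args) {
        long worstCase  = totalBytes(100_000, 1 * MB); // one 1MB buffer each
        long smallFirst = totalBytes(100_000, 4 * KB); // one 4KB buffer each
        System.out.println(worstCase + " bytes vs. " + smallFirst + " bytes");
    }
}
```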

Resizable array design

The resizable array design consists of two components:

  • ResizableArray

  • ResizableArrayBuffer

ResizableArrayBuffer contains a single large array. That array is divided into three sections: one for small arrays, one for medium arrays and one for large arrays. The ResizableArray class represents a single resizable array, whose underlying data is stored in the array inside the ResizableArrayBuffer.

The following figure shows the array divided into three sections, with each section subdivided into blocks.

By reserving space for small, medium and large arrays separately, ResizableArrayBuffer ensures that data of one size cannot fill up the whole array. For example, many small messages cannot take up all the memory and thereby block medium and large messages. Similarly, large messages cannot take up all the memory and block small and medium messages.

Because every array starts out small, no new array can be allocated once the small section runs out of blocks, regardless of whether the medium and large sections still have free space. Making the small section large enough reduces the likelihood of this happening.

Even when the small section is exhausted, existing small arrays can still grow into medium and large arrays.

Optimization

One possible optimization is to use blocks of a single size only. When an array needs to grow, the block directly after it is allocated to it, provided that block is free. The array is then "extended" into the second block, which avoids copying the existing data into a new array: the old data stays in place and new data is written after it.

The disadvantage of this optimization is that the data still has to be copied whenever the next block is not free, so every expansion pays the small extra cost of checking whether in-place extension is possible. Furthermore, if the block size is small, medium and large arrays will have to be expanded many times.

Tracking free blocks

The large array inside ResizableArrayBuffer is divided into three sections. Each section is divided into smaller blocks, and all blocks within a section have the same size: every block in the small section has the small block size, every block in the medium section has the medium block size, and every block in the large section has the large block size.

Because all blocks in a section have the same size, it is easy to keep track of which blocks are in use. A queue can hold the start index of each free block, and each section of the shared array needs its own queue: one queue tracks free small blocks, one tracks free medium blocks, and one tracks free large blocks.

Allocating a block from any section is done by taking the next free block's start index from the queue for that section. Freeing a block is done by putting its start index back into the corresponding queue.

I have used a simple ring buffer implementation for the queues. In the GitHub repository, the class is called QueueIntFlip. Ring buffer tutorial: tutorials.jenkov.com/java-performance/ring-buffer.html
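As a rough illustration of free-block tracking, the sketch below uses a small ring buffer of int values to hold the start index of every free block in one section. It is not the repository's QueueIntFlip; IntRingQueue and FreeBlockDemo are hypothetical names, and the API (put, take, available) is assumed for the example.

```java
class IntRingQueue {
    private final int[] elements;
    private int head = 0; // index of the next element to take
    private int size = 0; // number of elements currently stored

    IntRingQueue(int capacity) {
        elements = new int[capacity];
    }

    // Adds a value at the tail; returns false if the queue is full.
    boolean put(int value) {
        if (size == elements.length) return false;
        int tail = (head + size) % elements.length;
        elements[tail] = value;
        size++;
        return true;
    }

    // Removes and returns the head value, or -1 if the queue is empty.
    int take() {
        if (size == 0) return -1;
        int value = elements[head];
        head = (head + 1) % elements.length;
        size--;
        return value;
    }

    int available() { return size; }
}

class FreeBlockDemo {
    // Fills a queue with the start index of every block in one section.
    static IntRingQueue freeBlocks(int sectionStart, int blockSize, int blockCount) {
        IntRingQueue queue = new IntRingQueue(blockCount);
        for (int i = 0; i < blockCount; i++) {
            queue.put(sectionStart + i * blockSize);
        }
        return queue;
    }
}
```

Allocating a block is then a take() on the section's queue, and freeing it is a put() of the same start index. Using -1 as the "empty" marker works here because block start indexes are never negative.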

Expanding on write

A resizable array expands automatically when you write data to it. If you try to write more data than fits in the block currently allocated to the array, a new, larger block is allocated, all the data is copied into it, and the smaller block is then freed.
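The expansion step can be sketched with a simplified buffer that has only two sections. This is an assumption-laden illustration, not the code from the GitHub repository: TwoTierBuffer, allocateSmall and expand are hypothetical names, and free blocks are tracked with plain int stacks to keep the example short.

```java
class TwoTierBuffer {
    final byte[] shared;                        // one array backing all blocks
    final int smallSize, largeSize;
    private final int[] freeSmall, freeLarge;   // stacks of free block offsets
    private int freeSmallCount, freeLargeCount;

    TwoTierBuffer(int smallSize, int smallCount, int largeSize, int largeCount) {
        this.smallSize = smallSize;
        this.largeSize = largeSize;
        shared = new byte[smallSize * smallCount + largeSize * largeCount];
        freeSmall = new int[smallCount];
        freeLarge = new int[largeCount];
        for (int i = 0; i < smallCount; i++) {
            freeSmall[freeSmallCount++] = i * smallSize;
        }
        int largeStart = smallSize * smallCount; // large section follows small
        for (int i = 0; i < largeCount; i++) {
            freeLarge[freeLargeCount++] = largeStart + i * largeSize;
        }
    }

    // Returns the offset of a free small block, or -1 if none is left.
    int allocateSmall() {
        return freeSmallCount == 0 ? -1 : freeSmall[--freeSmallCount];
    }

    // Copies `length` bytes from a small block into a free large block,
    // releases the small block, and returns the new offset (-1 on failure).
    int expand(int smallOffset, int length) {
        if (freeLargeCount == 0) return -1;
        int largeOffset = freeLarge[--freeLargeCount];
        System.arraycopy(shared, smallOffset, shared, largeOffset, length);
        freeSmall[freeSmallCount++] = smallOffset;
        return largeOffset;
    }
}
```

Note the order of operations in expand(): the larger block is claimed first, the data is copied, and only then is the smaller block returned to its free list.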

Freeing arrays

Once you are done with a resizable array, you should free it so that its memory can be used for other messages.

Using ResizableArrayBuffer

The following sections show how to use the ResizableArrayBuffer from the GitHub repository.

Creating a ResizableArrayBuffer

First, you must create a ResizableArrayBuffer. Here is an example:

int smallBlockSize  =    4 * 1024;  
int mediumBlockSize =  128 * 1024;  
int largeBlockSize  = 1024 * 1024;  

int smallBlockCount  = 1024;  
int mediumBlockCount =   32;  
int largeBlockCount  =    4;  

ResizableArrayBuffer arrayBuffer =  
        new ResizableArrayBuffer(  
                smallBlockSize , smallBlockCount,  
                mediumBlockSize, mediumBlockCount,  
                largeBlockSize,  largeBlockCount);

This example creates a ResizableArrayBuffer with 4KB small blocks, 128KB medium blocks and 1MB large blocks. It contains space for 1024 small arrays (4MB in total), 32 medium arrays (4MB in total) and 4 large arrays (4MB in total), for a total of 12MB.
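To see how these numbers add up, the sketch below computes where each section would start inside one shared array. SectionLayout is a hypothetical helper, and the ordering (small section first, then medium, then large) is an assumption made for the illustration.

```java
class SectionLayout {
    final int smallStart, mediumStart, largeStart, totalSize;

    SectionLayout(int smallBlockSize, int smallBlockCount,
                  int mediumBlockSize, int mediumBlockCount,
                  int largeBlockSize, int largeBlockCount) {
        // Each section begins where the previous one ends.
        smallStart  = 0;
        mediumStart = smallStart + smallBlockSize * smallBlockCount;
        largeStart  = mediumStart + mediumBlockSize * mediumBlockCount;
        totalSize   = largeStart + largeBlockSize * largeBlockCount;
    }

    public static void main(String[] args) {
        SectionLayout layout = new SectionLayout(
                4 * 1024, 1024, 128 * 1024, 32, 1024 * 1024, 4);
        System.out.println(layout.totalSize / (1024 * 1024) + " MB"); // prints "12 MB"
    }
}
```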

Obtaining a ResizableArray instance

To obtain a ResizableArray instance, call the getArray() method of ResizableArrayBuffer, like this:

ResizableArray resizableArray = arrayBuffer.getArray();

This obtains a ResizableArray of the smallest size (4KB in the configuration above).

Writing data to a ResizableArray

You write data to a ResizableArray by calling its write() method. The ResizableArray class on GitHub contains only a single write() method, which takes a ByteBuffer as parameter. However, it is easy to add more write() methods to suit your own needs.

Here is an example of writing data:

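ByteBuffer buffer = ByteBuffer.allocate(1024);
for (int i = 0; i < 1024; i++) {
    buffer.put((byte) i);
}
buffer.flip(); // switch the buffer from writing to reading

int bytesCopied = resizableArray.write(buffer);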

This example copies the contents of the ByteBuffer into the ResizableArray. The write() method returns the number of bytes copied from the ByteBuffer.

If the ByteBuffer contains more data than the ResizableArray has room for, the ResizableArray tries to expand itself to make room for the data. If the data still does not fit after the ResizableArray has been expanded to its maximum size, write() returns -1 and no data is copied at all.

Reading data from a ResizableArray

To read data from a ResizableArray, you read directly from the array shared by all ResizableArray instances. ResizableArray contains the following public fields:

public byte[] sharedArray = null;  
public int    offset      = 0;  
public int    capacity    = 0;  
public int    length      = 0;

  • The sharedArray field references the array shared by all ResizableArray instances, i.e. the array inside the ResizableArrayBuffer.

  • The offset field is the index in the shared array at which this ResizableArray's data starts.

  • The capacity field is the size of the block allocated to this ResizableArray instance.

  • The length field is the number of bytes of the allocated block that this ResizableArray actually uses.

To read the data written to a ResizableArray, read the bytes from sharedArray[offset] to sharedArray[offset + length - 1].
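Reading that byte range might look like the sketch below. It does not use the ResizableArray class itself; SharedArrayReader and copyOut are hypothetical names, and plain local variables stand in for the sharedArray, offset and length fields.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class SharedArrayReader {
    // Copies the bytes in [offset, offset + length) out of the shared array.
    static byte[] copyOut(byte[] sharedArray, int offset, int length) {
        return Arrays.copyOfRange(sharedArray, offset, offset + length);
    }

    public static void main(String[] args) {
        byte[] sharedArray = new byte[16];
        byte[] message = "hello".getBytes(StandardCharsets.US_ASCII);
        int offset = 4;              // where this array's block starts
        int length = message.length; // bytes actually used in the block
        System.arraycopy(message, 0, sharedArray, offset, length);

        byte[] copy = copyOut(sharedArray, offset, length);
        System.out.println(new String(copy, StandardCharsets.US_ASCII));
    }
}
```

Copying is optional; code that only needs to scan the data can loop over the shared array from offset to offset + length - 1 directly and avoid the extra allocation.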

Freeing a ResizableArray

Once you are done with a ResizableArray instance, you should free it. You do so by calling its free() method, like this:

resizableArray.free();

Calling free() returns the block to the correct free-block queue, regardless of the size of the block allocated to the ResizableArray.

Variations on the design

You can modify the design of ResizableArrayBuffer to suit your needs. For example, you could create more than three sections. Doing so should be fairly easy: just start from the source code on GitHub and modify it.

Origin blog.csdn.net/zl1zl2zl3/article/details/105327016