Why is Disruptor so fast

Disruptor is an open source, high-performance producer-consumer framework. It is hard to explain in one sentence what the framework does, but it can be thought of as a counterpart to Java's BlockingQueue. In other words, it is a producer-consumer queue, but one that performs much better than BlockingQueue, reportedly reaching millions of TPS on a single machine.

Disruptor features

Disruptor has the following three main characteristics:

  • Event multicast
  • Allocate memory in advance for events
  • Lock-free operation

Typically, a message in a queue is consumed by only one consumer. In Disruptor, the same message can be processed by multiple consumers, and those consumers run in parallel.

This matters because the same piece of data often has to be used in several places. For example, a request may need to be written to a log, backed up to a remote machine, and processed by business logic, as shown in the following figure:

If a message can only be processed by one consumer, the three steps above must either run one after another or be handed to a single consumer that then dispatches them asynchronously. If these steps are split across multiple consumers that consume the same message in parallel, processing becomes much more efficient.

For this to work, a message must have been processed by all consumers before its slot is reused for a new message, so the progress of the consumers has to be coordinated. In plain Java, a tool such as CyclicBarrier could be used to keep consumers in step. Disruptor implements SequenceBarrier for this purpose.
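The multicast idea can be sketched in plain Java. This is a simplified lockstep illustration, not Disruptor code: the class `MulticastSketch`, the three handler names, and the use of `ExecutorService.invokeAll` as a stand-in for a barrier are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Consumer;

// Hypothetical illustration: every handler sees every message, in parallel.
// invokeAll stands in for the barrier; Disruptor itself uses SequenceBarrier.
final class MulticastSketch {
    static List<String> run(List<String> messages) {
        List<String> seen = new CopyOnWriteArrayList<>();
        List<Consumer<String>> handlers = Arrays.asList(
            m -> seen.add("journal:" + m),     // write to the log
            m -> seen.add("replicate:" + m),   // back up to a remote machine
            m -> seen.add("business:" + m));   // business processing
        ExecutorService pool = Executors.newFixedThreadPool(handlers.size());
        try {
            for (String message : messages) {
                List<Callable<Void>> tasks = new ArrayList<>();
                for (Consumer<String> handler : handlers) {
                    tasks.add(() -> { handler.accept(message); return null; });
                }
                // Blocks until all handlers have finished the current message.
                pool.invokeAll(tasks);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
        return seen;
    }
}
```

Calling `MulticastSketch.run(Arrays.asList("a", "b"))` records six entries: the three handlers each see "a" before any of them sees "b". The order of the three entries within one message is not fixed, because the handlers run concurrently.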

Disruptor is designed for low-latency environments. In a low-latency system, memory allocation must be reduced or avoided altogether. In Java, this means reducing the pause time caused by garbage collection. Disruptor uses RingBuffer to achieve this: the objects in the RingBuffer are created in advance and then reused over and over, which avoids garbage collection. This implementation is thread-safe.

Disruptor achieves thread safety almost entirely without locks, relying instead on lock-free mechanisms such as CAS (compare-and-swap).
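As a rough illustration of the CAS approach, the sketch below claims sequence numbers with a retry loop on `AtomicLong.compareAndSet` instead of taking a lock. The class `CasClaimSketch` and its method names are hypothetical and only show the general technique, not Disruptor's actual sequencer code.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: claim slots with a CAS retry loop instead of a lock.
final class CasClaimSketch {
    private final AtomicLong cursor = new AtomicLong(-1);

    // Claims the next n sequence numbers and returns the highest one claimed.
    long claim(int n) {
        long current;
        long claimed;
        do {
            current = cursor.get();
            claimed = current + n;
            // If another thread moved the cursor in the meantime, retry.
        } while (!cursor.compareAndSet(current, claimed));
        return claimed;
    }

    long current() {
        return cursor.get();
    }

    // Starts several producer threads that each claim single slots,
    // then returns the final cursor value.
    static long claimConcurrently(int threads, int perThread) {
        CasClaimSketch sketch = new CasClaimSketch();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    sketch.claim(1);
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return sketch.current();
    }
}
```

Because every successful `compareAndSet` advances the cursor by exactly the claimed amount, no claim is lost even when threads race: four threads claiming 10,000 slots each leave the cursor at 39,999, given that it starts at -1.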

Key concepts

Before looking at Disruptor in detail, we need to understand some core concepts. The code of Disruptor is not complicated, and it basically revolves around these concepts.

  • Producer: the producer of data. The producer itself is not part of Disruptor; it can be any code that produces data, even a for loop
  • Event: the data produced by the Producer, defined by users according to their own needs
  • RingBuffer: the data structure that stores Events. It allocates memory in advance and avoids creating objects while the program is running
  • EventHandler: the consumer, implemented by users
  • Sequence: tracks the position of a component in the Disruptor. Coordination between components depends on it
  • Sequencer: the core mechanism of Disruptor. It implements the core concurrency algorithm that ensures messages are passed correctly between producers and consumers
  • SequenceBarrier: coordinates consumers and determines which messages are available for them to process
  • WaitStrategy: the strategy a consumer uses to wait for new messages
  • EventProcessor: the concrete implementation that delivers messages to consumers

The components above make up the Disruptor. The whole framework is actually very small, probably fewer than 7,000 lines of code, and the code is very clean. It makes almost no use of inheritance, relying instead on interface-oriented programming and composition, so coupling within the code is very low.

RingBuffer and Sequencer are the two most important components. The former is used to store messages, and the latter controls the orderly production and consumption of messages.

The core goal of Disruptor is to improve program throughput, and its implementation is built around that goal. It mainly does the following:

  • Reduce garbage collection
  • Allow messages to be processed in parallel by multiple consumers
  • Use lock-free algorithms to achieve concurrency
  • Cache line padding

RingBuffer is a container for storing messages, and the internal implementation uses an array:

private final Object[] entries;

Before the Disruptor starts, you need to specify the size of the buffer, and the array is then initialized:

private void fill(EventFactory<E> eventFactory)
{
    for (int i = 0; i < bufferSize; i++)
    {
        entries[BUFFER_PAD + i] = eventFactory.newInstance();
    }
}

Once the array has been initialized, its objects are never garbage-collected. All messages reuse these already created objects, so the array behaves as a circular array. The implementation of RingBuffer is shown in the following figure:

When operating on the circular array, access by producers and consumers has to be controlled. On the one hand, because Disruptor supports multiple producers and multiple consumers, access must be thread-safe. For performance, thread safety is achieved without locks (only blocking wait strategies such as BlockingWaitStrategy use locks), mainly by using CAS. Reading a slot of the RingBuffer itself is a direct array access through Unsafe:

protected final E elementAt(long sequence)
{
    return (E) UNSAFE.getObject(entries, REF_ARRAY_BASE + ((sequence & indexMask) << REF_ELEMENT_SHIFT));
}
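The combination of preallocated slots and mask-based indexing can be sketched without Unsafe. The class below is a simplified, hypothetical stand-in (`PreallocatedRingSketch` is not a Disruptor class, and it uses a plain array access and `Supplier` rather than `EventFactory`). It assumes the buffer size is a power of two, so that `sequence & (bufferSize - 1)` is equivalent to `sequence % bufferSize`.

```java
import java.util.function.Supplier;

// Hypothetical, simplified ring of preallocated slots (not the Disruptor class).
final class PreallocatedRingSketch<E> {
    private final Object[] entries;
    private final long indexMask;

    PreallocatedRingSketch(int bufferSize, Supplier<E> factory) {
        if (bufferSize < 1 || Integer.bitCount(bufferSize) != 1) {
            throw new IllegalArgumentException("bufferSize must be a power of 2");
        }
        entries = new Object[bufferSize];
        indexMask = bufferSize - 1;
        for (int i = 0; i < bufferSize; i++) {
            entries[i] = factory.get();   // allocate every slot once, up front
        }
    }

    // Maps an ever-increasing sequence number onto a slot of the array.
    @SuppressWarnings("unchecked")
    E get(long sequence) {
        return (E) entries[(int) (sequence & indexMask)];
    }
}
```

With a buffer size of 8, sequences 3 and 11 map to the same slot and therefore return the same preallocated object. The producer overwrites the fields of that object rather than allocating a new one.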

On the other hand, the relative speed of producers and consumers has to be controlled: a producer must not overwrite messages that have not been consumed yet, or those messages would be lost. RingBuffer does not use head and tail pointers for this; it uses Sequence instead. When a producer wants to write, its current sequence number plus the number of slots to be written is compared with the position of the consumers to check whether there is enough free space.
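The comparison described above can be written as a small check. This is a minimal sketch for a single producer under stated assumptions: `CapacitySketch` and its parameter names are hypothetical, the producer sequence is the last slot written, the consumer sequence is the last slot consumed by the slowest consumer, and both start at -1.

```java
// Hypothetical single-producer capacity check; names are illustrative only.
final class CapacitySketch {
    static boolean hasCapacity(long producerSequence,
                               int required,
                               int bufferSize,
                               long slowestConsumerSequence) {
        // The oldest sequence whose slot would be overwritten by this write.
        long wrapPoint = producerSequence + required - bufferSize;
        // Writing is safe only if that sequence has already been consumed.
        return wrapPoint <= slowestConsumerSequence;
    }
}
```

For example, with a buffer of 8 slots, a producer at sequence 7 has filled the whole buffer. It cannot write another message while the slowest consumer is still at -1, but it can as soon as that consumer has processed sequence 0.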

In RingBuffer, there is this piece of code:

abstract class RingBufferPad
{
    protected long p1, p2, p3, p4, p5, p6, p7;
}

This technique is called cache line padding. To understand it, you need to know how CPU caches work. Because memory access is far slower than the CPU, caches sit between the CPU and main memory. Modern CPUs generally have three levels of cache: the L1 and L2 caches are private to each CPU core, and the L3 cache is shared among cores.

In many cases we want values that rarely change, such as final fields in Java, to stay in the CPU cache so that reads are as fast as possible. However, the CPU caches data in units of cache lines (typically 64 bytes). If a frequently modified variable sits on the same cache line as a final field, every write to that variable from another core invalidates the whole line in the other cores' caches, and the final field has to be fetched again even though it never changed. This problem is known as false sharing. To avoid it, padding is placed before and after the fields so that no unrelated data shares their cache line:

abstract class RingBufferPad
{
    // Padding in front of the cached fields
    protected long p1, p2, p3, p4, p5, p6, p7;
}
abstract class RingBufferFields extends RingBufferPad{ 
    ......
    // The fields below are the data that should stay in the CPU cache
    private final long indexMask; 
    private final Object[] entries; 
    protected final int bufferSize;
    protected final Sequencer sequencer; 
    ...... 
}
public final class RingBuffer extends RingBufferFields implements Cursored, EventSequencer, EventSink{ 
    ...... 
    // Padding behind the cached fields
    protected long p1, p2, p3, p4, p5, p6, p7; 
    ......
}

In this way, once the fields of RingBufferFields have been loaded into the CPU cache, writes to unrelated neighboring data do not invalidate them, so they rarely need to be read from memory again.
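The same layering can be applied to any hot field. The sketch below is a hypothetical padded counter (`LeftPaddingSketch`, `PaddedValueSketch`, and `PaddedCounterSketch` are made-up names). It assumes a 64-byte cache line and relies on the JVM laying out superclass fields before subclass fields; the exact object layout is up to the JVM, so treat this as an illustration of the idea rather than a guarantee.

```java
// Hypothetical padded counter built with the same three-layer hierarchy.
class LeftPaddingSketch {
    // 7 longs = 56 bytes in front of the hot field
    protected long p1, p2, p3, p4, p5, p6, p7;
}

class PaddedValueSketch extends LeftPaddingSketch {
    // The hot field that should not share a cache line with other data
    protected volatile long value;
}

final class PaddedCounterSketch extends PaddedValueSketch {
    // 7 longs = 56 bytes behind the hot field
    protected long p9, p10, p11, p12, p13, p14, p15;

    long get() {
        return value;
    }

    // Not atomic: intended for a single writer thread, with any number of readers.
    void increment() {
        value = value + 1;
    }
}
```

The padding fields are never read or written. They exist only to occupy space, so that `value` shares its cache line with nothing that another thread modifies.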

Disruptor improves performance through various measures, which is why it is so fast.


Origin juejin.im/post/5e99cb25e51d4546ee76cda8