Source analysis of Baidu's open-source distributed ID generator uid-generator

Baidu uid-generator source code:

https://github.com/baidu/uid-generator

The snowflake algorithm

uid-generator is built on Twitter's open-source snowflake algorithm.

Snowflake splits a 64-bit long into three parts: a timestamp, a worker machine id, and a sequence number, with a fixed number of bits allocated to each part (in the original algorithm: 1 sign bit, 41 bits of millisecond timestamp, 10 bits of worker id, and 12 bits of sequence).

The timestamp part is in milliseconds, so a single worker machine can generate at most 4096 ids per millisecond (2 to the 12th power).

Source implementation analysis

Unlike the original snowflake algorithm, uid-generator lets you customize the number of bits used for each part (timestamp, worker machine id, and sequence number) so it can be adapted to different scenarios. The default allocation is as follows (a sketch of how these fields are packed into a single long follows the list).

  • Sign (1 bit)
    The fixed 1-bit sign, always 0, so the generated UID is a positive number.

  • delta seconds (28 bits)
    The current time as an increment, in seconds, relative to the point in time "2016-05-20"; 28 bits cover at most about 8.7 years. (Note 1: the unit here is seconds, not milliseconds! Note 2: the wording is "at most" 8.7 years; why it is "at most" is explained later.)

  • worker id (22 bits)
    The machine id, supporting roughly 4.2 million machine start-ups. The built-in implementation assigns it from a database at startup; the default assignment policy is use-once (a new id on every start), and reuse policies can be added later.

  • sequence (13 bits)
    The concurrency sequence within one second; 13 bits support 8192 ids per second. (Note: this means the default maximum QPS is 8192.)

DefaultUidGenerator

DefaultUidGenerator generates ids in essentially the same way as the ordinary snowflake algorithm, with only minor differences, such as using seconds rather than milliseconds as the time unit.

Its id generation method is as follows.

protected synchronized long nextId() {
        long currentSecond = getCurrentSecond();

        // Clock moved backwards, refuse to generate uid
        if (currentSecond < lastSecond) {
            long refusedSeconds = lastSecond - currentSecond;
            throw new UidGenerateException("Clock moved backwards. Refusing for %d seconds", refusedSeconds);
        }

        // At the same second, increase sequence
        if (currentSecond == lastSecond) {
            sequence = (sequence + 1) & bitsAllocator.getMaxSequence();
            // Exceed the max sequence, we wait the next second to generate uid
            if (sequence == 0) {
                currentSecond = getNextSecond(lastSecond);
            }

        // At the different second, sequence restart from zero
        } else {
            sequence = 0L;
        }

        lastSecond = currentSecond;

        // Allocate bits for UID
        return bitsAllocator.allocate(currentSecond - epochSeconds, workerId, sequence);
    }
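The snippet above relies on two helpers that are not shown here, getCurrentSecond() and getNextSecond(). Below is a minimal sketch of what they need to do; it is not the verbatim source (the real implementation also validates that the time delta has not exceeded the representable range), and it assumes java.util.concurrent.TimeUnit is imported.

    // Sketch of the two time helpers used by nextId(); not the verbatim source.
    private long getCurrentSecond() {
        // current wall-clock time truncated to whole seconds
        return TimeUnit.MILLISECONDS.toSeconds(System.currentTimeMillis());
    }

    private long getNextSecond(long lastTimestamp) {
        // spin until the clock moves past the second whose sequence has been exhausted
        long timestamp = getCurrentSecond();
        while (timestamp <= lastTimestamp) {
            timestamp = getCurrentSecond();
        }
        return timestamp;
    }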

CachedUidGenerator

CachedUidGenerator supports generating ids ahead of time and caching them.

The basic implementation principle

The project documentation describes CachedUidGenerator as follows:

In this implementation, UidGenerator works around the natural concurrency limit of the sequence by borrowing future time; it uses a RingBuffer to cache UIDs and parallelize UID production and consumption, and it pads cache lines to avoid the hardware-level "false sharing" problem the RingBuffer would otherwise cause. Single-machine QPS can reach 6 million.

[Using a RingBuffer to cache generated UIDs and parallelize UID production and consumption]

Generated ids are cached in a RingBuffer, a circular array with a default size of 8192, which buffers ids that have already been generated.

Getting an id

An id is taken from the RingBuffer; concurrent acquisition is supported.

Filling ids

The RingBuffer is filled on the following occasions:

  • When the program starts: the RingBuffer is filled up, caching 8192 ids.

  • When getUID() is called and the number of ids left in the RingBuffer is found to be below 50% of its capacity: the RingBuffer is filled up again, so that 8192 ids are cached.

  • Periodic filling (whether the scheduled task is used, and its period, are configurable).

[UidGenerator works around the natural concurrency limit of the sequence by borrowing future time]

Because the delta seconds part is in seconds, one worker can generate at most 8192 ids within a single second (2 to the 13th power).

As noted above, the maximum supported QPS is therefore 8192, so ids are cached to improve throughput.

Why is this called borrowing future time?

Because at most 8192 ids can be generated per second, when more than 8192 ids are requested within one second the ids in the RingBuffer are consumed quickly, and when the RingBuffer is refilled, the newly generated ids can only use a delta seconds value that lies in the future. For example, if consumers keep drawing about 80,000 ids per second while only 8192 can be minted per second, the filling thread has to run its second counter roughly ten seconds ahead of the wall clock for every real second that passes.

(Because ids may be generated with future time, the earlier description says it can support "at most" about 8.7 years.)

Source analysis

Getting an id

   @Override
    public long getUID() {
        try {
            return ringBuffer.take();
        } catch (Exception e) {
            LOGGER.error("Generate unique id exception. ", e);
            throw new UidGenerateException(e);
        }
    }
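As a side note, a typical way to call the generator from application code might look like the following. This is a hypothetical usage example, not code from the article; it assumes a CachedUidGenerator bean is wired up via Spring and that the UidGenerator interface also exposes a parseUID helper that renders the timestamp/worker/sequence parts for debugging.

    // Hypothetical usage example; assumes a CachedUidGenerator bean has been configured in Spring.
    @Resource
    private UidGenerator uidGenerator;

    public void demo() {
        long uid = uidGenerator.getUID();
        // parseUID (assumed here) breaks the uid back into its timestamp, workerId and sequence parts
        System.out.println(uidGenerator.parseUID(uid));
    }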

The RingBuffer that caches generated ids

(Note: the RingBuffer here is not the RingBuffer of the Disruptor framework, but it borrows many of Disruptor's RingBuffer design ideas, such as using cache-line padding to solve the false sharing problem.)

The RingBuffer is a circular array. Its default capacity matches the maximum sequence capacity (8192), and a larger size can be configured via the boostPower parameter.
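As I read it, boostPower enlarges the buffer by a power of two. A hedged sketch of the sizing rule (assumed behaviour, not the verbatim source; the values are examples):

    // Assumed sizing rule: bufferSize = (maxSequence + 1) << boostPower
    int maxSequence = 8191;                            // 2^13 - 1 under the default allocation
    int boostPower = 3;                                // example value
    int bufferSize = (maxSequence + 1) << boostPower;  // 8192 << 3 = 65536 slots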

The Tail and Cursor pointers are used for writing and reading slots of the circular array:

  • Tail pointer
    Indicates the highest sequence number produced by the Producer (this number starts from 0 and keeps increasing). Tail must not wrap past Cursor, i.e. the producer must not overwrite slots that have not been consumed yet. When Tail catches up with Cursor, a PutRejectPolicy can be specified via rejectedPutBufferHandler.

  • Cursor pointer
    Indicates the sequence number up to which the Consumer has consumed (it uses the same numbering as the Producer's sequence). Cursor cannot exceed Tail, i.e. slots that have not been produced cannot be consumed. When Cursor catches up with Tail, a TakeRejectPolicy can be specified via rejectedTakeBufferHandler.

CachedUidGenerator uses two RingBuffers: a Uid-RingBuffer that stores the UIDs themselves, and a Flag-RingBuffer that stores the status of each slot (whether it can be filled, whether it can be consumed).

Because array elements are laid out contiguously in memory, the CPU cache is used as much as possible, which improves performance. But this also brings the "false sharing" (FalseSharing) problem, so cache-line padding is applied to the Tail pointer, the Cursor pointer, and the Flag-RingBuffer.

public class RingBuffer {
    private static final Logger LOGGER = LoggerFactory.getLogger(RingBuffer.class);

    /** Constants */
    private static final int START_POINT = -1;
    private static final long CAN_PUT_FLAG = 0L; // slot status flag: an id can be put into this slot
    private static final long CAN_TAKE_FLAG = 1L; // slot status flag: an id can be taken from this slot
    public static final int DEFAULT_PADDING_PERCENT = 50; // default threshold controlling when to refill the slots: when fewer than 50% of bufferSize remain available, ids are generated to fill the slots up

    /** The size of RingBuffer's slots, each slot hold a UID */
    private final int bufferSize; // number of slots; by default the maximum capacity of the sequence, i.e. 8192
    private final long indexMask; // bufferSize - 1; used to map a sequence number to a slot index

    private final long[] slots;  // slots cache ids that have already been generated
    private final PaddedAtomicLong[] flags; // flags store the status of each slot (can it be filled, can it be consumed)

    /** Tail: last position sequence to produce */
    // Tail pointer
    // The highest sequence number produced by the Producer (starts from 0 and keeps increasing). Tail must not wrap past Cursor, i.e. the producer must not overwrite unconsumed slots. When Tail catches up with Cursor, a PutRejectPolicy can be specified via rejectedPutBufferHandler.
    private final AtomicLong tail = new PaddedAtomicLong(START_POINT);

    /** Cursor: current position sequence to consume */
    // The sequence number up to which the Consumer has consumed (same numbering as the Producer's sequence). Cursor cannot exceed Tail, i.e. unproduced slots cannot be consumed. When Cursor catches up with Tail, a TakeRejectPolicy can be specified via rejectedTakeBufferHandler.
    private final AtomicLong cursor = new PaddedAtomicLong(START_POINT);

    /** Threshold for trigger padding buffer*/
    private final int paddingThreshold; // threshold controlling when to refill the slots

    /** Reject put/take buffer handle policy */
    // Policy applied when the slots are full and no more puts are possible. Default implementation: drop the put and only log it.
    private RejectedPutBufferHandler rejectedPutHandler = this::discardPutBuffer;
    // Policy applied when the slots are empty and no more takes are possible. Default implementation: just throw an exception.
    private RejectedTakeBufferHandler rejectedTakeHandler = this::exceptionRejectedTakeBuffer;

    /** Executor of padding buffer */
    // Runs the "generate ids and fill the slots" task
    private BufferPaddingExecutor bufferPaddingExecutor;
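With these fields in place, take() can be pictured roughly as follows. This is a simplified sketch reconstructed from the pointer-and-flag protocol described above, not the verbatim library code; helper names such as asyncPadding and rejectTakeBuffer are assumptions based on the field declarations. Note how indexMask replaces a modulo: since bufferSize is a power of two, "sequence & indexMask" yields the slot index.

    // Simplified sketch of take(), reconstructed from the description above; not the verbatim source.
    public long take() {
        // advance the consume pointer, but never past what the producer has published
        long currentCursor = cursor.get();
        long nextCursor = cursor.updateAndGet(old -> old == tail.get() ? old : old + 1);

        // fewer than paddingThreshold slots left: kick off asynchronous refilling
        if (tail.get() - nextCursor < paddingThreshold) {
            bufferPaddingExecutor.asyncPadding();
        }

        // cursor has caught up with tail: nothing left to take, apply the TakeRejectPolicy
        if (nextCursor == currentCursor) {
            rejectedTakeHandler.rejectTakeBuffer(this);
        }

        // bufferSize is a power of two, so "& indexMask" replaces a modulo when locating the slot
        int index = (int) (nextCursor & indexMask);
        long uid = slots[index];
        flags[index].set(CAN_PUT_FLAG); // the consumed slot may now be refilled
        return uid;
    }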

The RingBuffer is filled on the following occasions:

  • When the program starts: the RingBuffer is filled up, caching 8192 ids.

  • When getUID() is called and the number of ids left in the RingBuffer is found to be below 50% of its capacity: the RingBuffer is filled up again, so that 8192 ids are cached.

  • Periodic filling (whether the scheduled task is used, and its period, are configurable).

Filling the RingBuffer

 /**
     * Padding buffer fill the slots until to catch the cursor
     */
    public void paddingBuffer() {
        LOGGER.info("Ready to padding buffer lastSecond:{}. {}", lastSecond.get(), ringBuffer);

        // is still running
        if (!running.compareAndSet(false, true)) {
            LOGGER.info("Padding buffer is still running. {}", ringBuffer);
            return;
        }

        // fill the rest slots until to catch the cursor
        boolean isFullRingBuffer = false;
        while (!isFullRingBuffer) {
            // Generate ids and put them into the RingBuffer.
            List<Long> uidList = uidProvider.provide(lastSecond.incrementAndGet());
            for (Long uid : uidList) {
                isFullRingBuffer = !ringBuffer.put(uid);
                if (isFullRingBuffer) {
                    break;
                }
            }
        }

        // not running now
        running.compareAndSet(true, false);
        LOGGER.info("End to padding buffer lastSecond:{}. {}", lastSecond.get(), ringBuffer);
    }
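paddingBuffer() stops as soon as ringBuffer.put(uid) reports a full buffer. Below is a simplified sketch of what put() has to do, following the Tail/Cursor protocol described earlier; it is not the verbatim source, and rejectPutBuffer is assumed from the field declarations shown above.

    // Simplified sketch of put(); returns false when the buffer is full. Not the verbatim source.
    public synchronized boolean put(long uid) {
        long currentTail = tail.get();
        long currentCursor = cursor.get();

        // tail may lead cursor by at most bufferSize - 1, otherwise unconsumed slots would be overwritten
        long distance = currentTail - (currentCursor == START_POINT ? 0 : currentCursor);
        if (distance == bufferSize - 1) {
            rejectedPutHandler.rejectPutBuffer(this, uid); // apply the PutRejectPolicy
            return false;
        }

        // the next slot must be in CAN_PUT status before it is overwritten
        int nextIndex = (int) ((currentTail + 1) & indexMask);
        if (flags[nextIndex].get() != CAN_PUT_FLAG) {
            rejectedPutHandler.rejectPutBuffer(this, uid);
            return false;
        }

        // write the uid, mark the slot consumable, then publish it by advancing tail
        slots[nextIndex] = uid;
        flags[nextIndex].set(CAN_TAKE_FLAG);
        tail.incrementAndGet();
        return true;
    }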

Generating ids (uidProvider.provide in the code above ends up calling this method)

    /**
     * Get the UIDs in the same specified second under the max sequence
     * 
     * @param currentSecond
     * @return UID list, size of {@link BitsAllocator#getMaxSequence()} + 1
     */
    protected List<Long> nextIdsForOneSecond(long currentSecond) {
        // Initialize result list size of (max sequence + 1)
        int listSize = (int) bitsAllocator.getMaxSequence() + 1;
        List<Long> uidList = new ArrayList<>(listSize);

        // Allocate the first sequence of the second, the others can be calculated with the offset
        // The implementation here is a neat trick:
        // since the ids generated within one second are consecutive, the first id is used to derive all the following ids by adding an offset, instead of invoking the snowflake allocation repeatedly.
        long firstSeqUid = bitsAllocator.allocate(currentSecond - epochSeconds, workerId, 0L);
        for (int offset = 0; offset < listSize; offset++) {
            uidList.add(firstSeqUid + offset);
        }

        return uidList;
    }

Cache-line padding to solve "false sharing"

For background on false sharing, see the article "False sharing: the silent performance killer of concurrent programming".

    // Arrays are stored contiguously in memory. The flags array holds the status of each id (can it be consumed, can it be filled) and is modified frequently as ids are filled in and consumed.
    // Without cache-line padding this would cause constant cache-line invalidation and reads straight from main memory.
    private final PaddedAtomicLong[] flags;

    // Both tail and cursor are cache-line padded so that they do not land on the same cache line.
    /** Tail: last position sequence to produce */
    private final AtomicLong tail = new PaddedAtomicLong(START_POINT);

    /** Cursor: current position sequence to consume */
    private final AtomicLong cursor = new PaddedAtomicLong(START_POINT);

/**
 * Represents a padded {@link AtomicLong} to prevent the FalseSharing problem<p>
 * 
 * The CPU cache line commonly be 64 bytes, here is a sample of cache line after padding:<br>
 * 64 bytes = 8 bytes (object reference) + 6 * 8 bytes (padded long) + 8 bytes (a long value)
 * @author yutianbao
 */
public class PaddedAtomicLong extends AtomicLong {
    private static final long serialVersionUID = -3415778863941386253L;

    /** Padded 6 long (48 bytes) */
    public volatile long p1, p2, p3, p4, p5, p6 = 7L;

    /**
     * Constructors from {@link AtomicLong}
     */
    public PaddedAtomicLong() {
        super();
    }

    public PaddedAtomicLong(long initialValue) {
        super(initialValue);
    }

    /**
     * To prevent GC optimizations for cleaning unused padded references
     */
    public long sumPaddingToPreventOptimization() {
        return p1 + p2 + p3 + p4 + p5 + p6;
    }

}

Why is PaddedAtomicLong designed this way?

Refer to the following articles:

How much memory does a Java object actually occupy? https://www.cnblogs.com/magialmoon/p/3757767.html

Writing Java, you still need to understand the CPU: false sharing https://www.cnblogs.com/techyc/p/3625701.html
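If you want to see the padding with your own eyes, one option (not mentioned in the original article, just a suggestion) is to dump the object layout with OpenJDK's JOL tool, assuming the org.openjdk.jol:jol-core dependency is on the classpath:

import org.openjdk.jol.info.ClassLayout;

// Prints the field offsets of PaddedAtomicLong, making the six padding longs
// that follow the inherited AtomicLong value visible in the layout dump.
public class PaddedAtomicLongLayoutCheck {
    public static void main(String[] args) {
        System.out.println(ClassLayout.parseClass(PaddedAtomicLong.class).toPrintable());
    }
}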

Origin blog.csdn.net/zl1zl2zl3/article/details/104628184