Understanding the snowflake algorithm in depth, with the best practices of Baidu and Meituan

Foreword

When automatic distributed ID generation comes up, I am sure you are very familiar with it and can immediately name several schemes you know well. Indeed, as the key identifier of data in a system, the importance of IDs is self-evident, and the various schemes have been refined over several generations. Allow me to classify the automatic distributed ID generation schemes from the following perspective:

Implementation approaches

  • Fully data-source-dependent

The ID generation rules are entirely controlled by a data source. Common examples are database auto-increment IDs and sequences, and sequence numbers generated atomically with Redis INCR/INCRBY.

  • Partially data-source-dependent

Part of the ID generation rule (a growth factor, or configuration information) needs to be controlled by a data source — for example, the snowflake algorithm.

  • Data-source-independent

IDs are computed entirely from information on the machine itself, without depending on any stored records or configuration — for example, the common UUID, GUID, etc.

Generation strategies

The strategies below apply to all three implementation approaches above and can complement them to improve system throughput, though the limitations of the underlying implementation remain.

  • Real-time generation

As the name suggests, an ID is generated in real time on every request.
Simple and fast, with no gaps in the sequence, but throughput may not be the highest.

  • Pre-generation

IDs are generated in advance into an ID pool — for example by simple self-increment, possibly with a configured step size, a batch at a time. The pre-generated IDs need a storage container (JVM memory, Redis, or a database table all work).
This can improve throughput dramatically, but buffer space must be set aside, IDs may be lost after a power failure, and the sequence may have gaps.
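The pre-generation strategy can be sketched in a few lines. The class and method names below are illustrative, using JVM memory as the storage container and simple self-increment as the source:

```java
import java.util.ArrayDeque;
import java.util.Queue;

/** A minimal sketch of the pre-generation strategy: IDs are produced in
 *  batches of `step` and buffered in memory until consumed. */
public class PreGeneratedIdPool {
    private final int step;
    private long maxAllocated = 0L;               // highest ID generated so far
    private final Queue<Long> pool = new ArrayDeque<>();

    public PreGeneratedIdPool(int step) {
        this.step = step;
    }

    public synchronized long nextId() {
        if (pool.isEmpty()) {
            refill();                             // fetch the next batch up front
        }
        return pool.poll();
    }

    private void refill() {
        for (int i = 0; i < step; i++) {
            pool.add(++maxAllocated);             // simple self-increment source
        }
    }
}
```

Note that if `maxAllocated` lived in a database instead of memory, a crash would lose only the unused part of the last batch — exactly the "gaps after power failure" trade-off described above.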

Popular schemes

The following is a brief introduction to the currently popular distributed ID schemes.

  1. Database auto-increment ID

This is a fully data-source-dependent approach, with all IDs stored in a database. It is the most common way to generate IDs and was the most widely used in the monolithic era: either using a table's built-in auto_increment primary key, or using a sequence to meet the self-increment ID demands of other scenarios.

  • Pros: very simple; IDs are incremental and ordered; convenient for paging and sorting.
  • Cons: after splitting into multiple databases and tables, the auto-increment IDs of identical tables are easily duplicated and cannot be used directly (step sizes can be configured, but the limitations are obvious); overall throughput is low — even a single database dedicated to generating unique IDs for the whole distributed application, combined with a pre-generation scheme, easily becomes a high-concurrency bottleneck because of transactional locking at that single point.
  • Applicable scenarios: table IDs on a single database instance (including master-slave synchronization scenarios), day-based serial-number counting, and the like; not applicable to sharded scenarios or system-wide unique IDs.
  2. Redis-generated ID

Also fully data-source-dependent: the atomic increment commands INCR/INCRBY of Redis guarantee that the generated IDs are unique and ordered. In nature, the implementation is consistent with the database approach.

  • Pros: overall throughput is higher than a database's.
  • Cons: after a Redis instance or cluster goes down, finding the latest ID value again is a bit difficult.
  • Applicable scenarios: counting-like scenarios, such as user visit counts and order serial numbers (date + serial number).
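The "date + serial number" pattern mentioned above can be sketched as follows. The class name is illustrative, and an AtomicLong stands in for what would be a per-day Redis INCR in a real deployment:

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.concurrent.atomic.AtomicLong;

/** Sketch of the "date + serial number" order-number pattern. A real
 *  deployment would call Redis INCR on a per-day key; an in-process
 *  AtomicLong stands in for it here. */
public class OrderNoGenerator {
    private final AtomicLong counter = new AtomicLong(0); // stand-in for Redis INCR

    public String next(LocalDate date) {
        String day = date.format(DateTimeFormatter.BASIC_ISO_DATE); // e.g. "20191023"
        long seq = counter.incrementAndGet();                       // atomic increment
        return day + String.format("%06d", seq);                    // zero-padded serial
    }
}
```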
  3. UUID/GUID-generated ID

UUID: computed according to a standard set by the OSF, using the Ethernet card (MAC) address, nanosecond-level time, the chip ID code, and several random numbers. It is a combination of the following parts: the current date and time (the first part of the UUID is time-related — if you generate a UUID, wait a few seconds, and generate another, the first part differs while the rest stays the same), a clock sequence, and the globally unique IEEE machine identifier (obtained from the network card if one is present, derived some other way otherwise).

GUID: Microsoft's implementation of the UUID standard. There are various other UUID implementations besides GUID, not listed here one by one.

Both are data-source-independent and are truly globally unique IDs.

  • Pros: depends on no data source; computed locally with no network overhead; extremely fast; globally unique.
  • Cons: unordered and long (128 bits); as a database primary key it reduces index efficiency and takes up more space.
  • Applicable scenarios: anywhere without strict demands on storage space, such as the storage of various link-tracing logs.
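For illustration, Java ships a UUID implementation in the standard library; `java.util.UUID.randomUUID()` produces a version-4 (random) UUID. A minimal sketch:

```java
import java.util.UUID;

public class UuidDemo {
    public static void main(String[] args) {
        // 128 bits, rendered as 36 characters (32 hex digits + 4 hyphens)
        UUID id = UUID.randomUUID();
        System.out.println(id);
        System.out.println(id.toString().length()); // 36
        System.out.println(id.version());           // 4 (random-based)
    }
}
```

The 36-character textual form (versus a long's at most 19 digits) is exactly the storage overhead listed in the cons above.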

  4. Snowflake-generated ID

A partially data-source-dependent approach. The principle is to fill a Long (64 bits) according to certain rules: timestamp (milliseconds) + datacenter ID + machine ID + sequence number. The bits occupied by each part can be allocated as actually needed; the datacenter ID and machine ID parts, in practical scenarios, depend on external parameters or database records.

  • Pros: high performance, low latency, decentralized, ordered by time.
  • Cons: requires machine clocks to be synchronized (second-level accuracy suffices).
  • Applicable scenarios: primary keys for data in distributed application environments.

Doesn't the snowflake ID algorithm sound particularly suitable for distributed architectures? It does — so let's focus on explaining its principle and best practices.

snowflake algorithm principle

The snowflake algorithm comes from Twitter. It is written in Scala, with RPC interface calls implemented on the Thrift framework. The project originated when Twitter migrated its database from MySQL to Cassandra: Cassandra has no ready-made ID generation mechanism, which gave birth to this project. The source is still on GitHub for anyone interested.

The characteristics snowflake targets are: ordered, unique, high-performance, and low-latency (each machine generates at least 10k IDs per second, with response time under 2 ms), usable in a distributed environment (multiple clusters, across data centers). The snowflake ID is therefore composed of segments:

  • The time difference from a specified date, in milliseconds: 41 bits, enough for about 69 years.
  • Datacenter ID + machine ID: 10 bits, supporting up to 1024 machines.
  • Sequence number: 12 bits, so each machine can generate up to 4096 sequence numbers per millisecond.

As the figure shows:
(figure: snowflake structure)

  • 1 bit: the sign bit, fixed at 0, indicating that all IDs are positive integers.
  • 41 bits: the time difference in milliseconds, counted from a specified date — enough for 69 years. We know a Long timestamp counts from 1970-01-01 00:00:00; here we can count the timestamp from a specified date instead, such as 2019-10-23 00:00:00.
  • 10 bits: the machine ID, which supports off-site deployment and multiple clusters; the datacenter, cluster, and instance ID numbers need to be planned offline.
  • 12 bits: the sequence ID; as with the parts above, it supports up to 4096 per millisecond.

The bit allocation above is only the official suggestion; we can allocate according to our actual needs. For example, if we use at most a few dozen machines but have heavy concurrency, we can reduce the 10 bits to 8 and increase the sequence part from 12 bits to 14, and so on.
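A custom split can be sanity-checked with the standard bit trick: the maximum value of an n-bit field is -1L ^ (-1L << n). A short sketch (the 8-bit/14-bit split is the hypothetical example from above, and the class name is illustrative):

```java
/** Capacity implied by a bit split: a sanity check for a custom
 *  allocation such as 1 sign + 41 time + 8 machine + 14 sequence bits. */
public class BitBudget {
    static long maxValue(long bits) {
        return -1L ^ (-1L << bits);   // lowest `bits` bits all set to 1
    }

    public static void main(String[] args) {
        System.out.println(maxValue(41) / (1000L * 3600 * 24 * 365)); // 69 (years)
        System.out.println(maxValue(8) + 1);                          // 256 machines
        System.out.println(maxValue(14) + 1);                         // 16384 IDs per ms
    }
}
```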

Of course, the meaning of each part can also be freely replaced. Take the machine ID in the middle: if we deploy containers in a cloud computing environment, with machines scaled up and down at any time, planning instance IDs offline through configuration is unrealistic. It can be replaced with a self-incrementing ID obtained fresh on each instance restart, as will be explained below.

There is also a solid, most-basic Java implementation of snowflake on GitHub; view the source directly here:
Java version of the snowflake source code

/**
 * Twitter's snowflake algorithm -- Java implementation
 *
 * @author beyond
 * @date 2016/11/26
 */
public class SnowFlake {

    /** Start timestamp */
    private final static long START_STMP = 1480166465631L;

    /** Bits occupied by each part */
    private final static long SEQUENCE_BIT = 12;   // bits for the sequence number
    private final static long MACHINE_BIT = 5;     // bits for the machine ID
    private final static long DATACENTER_BIT = 5;  // bits for the datacenter ID

    /** Maximum value of each part */
    private final static long MAX_DATACENTER_NUM = -1L ^ (-1L << DATACENTER_BIT);
    private final static long MAX_MACHINE_NUM = -1L ^ (-1L << MACHINE_BIT);
    private final static long MAX_SEQUENCE = -1L ^ (-1L << SEQUENCE_BIT);

    /** Left shift of each part */
    private final static long MACHINE_LEFT = SEQUENCE_BIT;
    private final static long DATACENTER_LEFT = SEQUENCE_BIT + MACHINE_BIT;
    private final static long TIMESTMP_LEFT = DATACENTER_LEFT + DATACENTER_BIT;

    private long datacenterId;   // datacenter ID
    private long machineId;      // machine ID
    private long sequence = 0L;  // sequence number
    private long lastStmp = -1L; // last timestamp

    public SnowFlake(long datacenterId, long machineId) {
        if (datacenterId > MAX_DATACENTER_NUM || datacenterId < 0) {
            throw new IllegalArgumentException("datacenterId can't be greater than MAX_DATACENTER_NUM or less than 0");
        }
        if (machineId > MAX_MACHINE_NUM || machineId < 0) {
            throw new IllegalArgumentException("machineId can't be greater than MAX_MACHINE_NUM or less than 0");
        }
        this.datacenterId = datacenterId;
        this.machineId = machineId;
    }

    /**
     * Generate the next ID
     */
    public synchronized long nextId() {
        long currStmp = getNewstmp();
        if (currStmp < lastStmp) {
            throw new RuntimeException("Clock moved backwards. Refusing to generate id");
        }
        if (currStmp == lastStmp) {
            // Within the same millisecond: increment the sequence
            sequence = (sequence + 1) & MAX_SEQUENCE;
            // The sequence for this millisecond is exhausted
            if (sequence == 0L) {
                currStmp = getNextMill();
            }
        } else {
            // A different millisecond: reset the sequence to 0
            sequence = 0L;
        }
        lastStmp = currStmp;
        return (currStmp - START_STMP) << TIMESTMP_LEFT // timestamp part
                | datacenterId << DATACENTER_LEFT       // datacenter part
                | machineId << MACHINE_LEFT             // machine part
                | sequence;                             // sequence part
    }

    private long getNextMill() {
        long mill = getNewstmp();
        while (mill <= lastStmp) {
            mill = getNewstmp();
        }
        return mill;
    }

    private long getNewstmp() {
        return System.currentTimeMillis();
    }

    public static void main(String[] args) {
        SnowFlake snowFlake = new SnowFlake(2, 3);
        for (int i = 0; i < (1 << 12); i++) {
            System.out.println(snowFlake.nextId());
        }
    }
}

The implementation is basically shift operations: the value of each segment is moved to its corresponding position. For example, since the machine part here is composed of datacenter ID + machine identifier, the machine identifier is shifted left by 12 bits to its position, the datacenter ID is shifted left by 17 bits, and the timestamp value is shifted left by 22 bits. Each part occupies its own position without interfering with the others, together forming a complete ID value.

This is the most basic principle of snowflake. If some of the Java basics here are hard to recall, it is worth reviewing them: for example, in two's-complement binary, -1 is represented with all bits set to 1; << is the left-shift operator, so -1 << 5 equals -32; and the XOR -1 ^ (-1 << 5) equals 31, and so on.
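A quick sketch, with illustrative values, of how this shift-and-OR composition works and how each field can be read back out of the final value (using the same 5-bit datacenter, 5-bit machine, 12-bit sequence layout as the source above):

```java
/** Assembling an ID with shifts and ORs, then decomposing it again. */
public class IdBits {
    public static void main(String[] args) {
        long timestampDelta = 123456789L, datacenterId = 2, machineId = 3, sequence = 7;

        // Compose: each part is shifted to its own bit range, then ORed together.
        long id = (timestampDelta << 22) | (datacenterId << 17) | (machineId << 12) | sequence;

        // Decompose: shift each field back down and mask off its width.
        System.out.println(id >>> 22);          // 123456789 (timestamp delta)
        System.out.println((id >>> 17) & 31);   // 2  (5-bit datacenter ID)
        System.out.println((id >>> 12) & 31);   // 3  (5-bit machine ID)
        System.out.println(id & 4095);          // 7  (12-bit sequence)
    }
}
```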

Having understood snowflake's basic principle, you can implement it by planning machine identifiers in advance. But in today's distributed production environments — relying on cloud computing and container technology, with instance counts changing at any time, and with server clock rollback to handle — using snowflake with fixed, pre-configured instance IDs is not very feasible. Machines usually start, stop, and scale automatically, so snowflake needs some transformation to be applied well in production.

Baidu uid-generator project

The UidGenerator project is implemented based on the snowflake principle, with only the definition of the machine ID part modified (it becomes the number of instance restarts), and it supports configurable 64-bit allocation. The official default allocation is shown below:

(figure: the default snowflake structure of Baidu's implementation)

Snowflake algorithm description: for a specified machine, at the same moment, a given concurrent sequence is unique. On this basis, 64 bits can generate a unique ID (long).

  • sign (1 bit)
    The fixed 1-bit sign identifier, i.e. the generated UID is always a positive number.
  • delta seconds (28 bits)
    The current time, as an incremental value relative to the time point "2016-05-20", in seconds; can support about 8.7 years at most.
  • worker id (22 bits)
    The machine ID; can support about 4.2 million machine starts at most. The built-in implementation obtains it from a database at startup; the default assignment policy is use-once-and-discard, and reuse strategies can be provided later.
  • sequence (13 bits)
    The concurrent sequence within one second; 13 bits can support 8192 concurrent IDs per second.

There are two concrete implementations: one generates IDs in real time, the other pre-generates IDs.

  1. DefaultUidGenerator
  • At startup, the current instance inserts its IP, port, and other information into the database table WORKER_NODE, then uses the auto-increment ID obtained from that row as the machine ID part.
    A simplified flowchart follows:

UidGenerator boot process

  • It provides a method to get an ID and detects whether the clock has rolled back; when rollback occurs it throws an exception directly. The current version does not support drifting along after a clock rollback. A simplified flowchart follows:

UidGenerator generation process

The core code is as follows:

    /**
     * Get UID
     *
     * @return UID
     * @throws UidGenerateException in the case: Clock moved backwards; Exceeds the max timestamp
     */
    protected synchronized long nextId() {
        long currentSecond = getCurrentSecond();

        // Clock moved backwards, refuse to generate uid
        if (currentSecond < lastSecond) {
            long refusedSeconds = lastSecond - currentSecond;
            throw new UidGenerateException("Clock moved backwards. Refusing for %d seconds", refusedSeconds);
        }

        // At the same second, increase sequence
        if (currentSecond == lastSecond) {
            sequence = (sequence + 1) & bitsAllocator.getMaxSequence();
            // Exceed the max sequence, we wait the next second to generate uid
            if (sequence == 0) {
                currentSecond = getNextSecond(lastSecond);
            }
        // At the different second, sequence restart from zero
        } else {
            sequence = 0L;
        }
        lastSecond = currentSecond;

        // Allocate bits for UID
        return bitsAllocator.allocate(currentSecond - epochSeconds, workerId, sequence);
    }
  2. CachedUidGenerator

Machine-ID acquisition is the same as above. Here, a batch of IDs is generated in advance and placed into a RingBuffer (a circular array) for clients to take; when the available count drops below a threshold, the batch-generation method is called again. This is a space-for-time approach that can improve overall ID throughput.

  • Compared with DefaultUidGenerator, initialization has the extra logic of filling the RingBuffer circular array; a simple flowchart follows:

CachedUidGenerator boot process

Core code:

/**
     * Initialize RingBuffer & RingBufferPaddingExecutor
     */
    private void initRingBuffer() {
        // initialize RingBuffer
        int bufferSize = ((int) bitsAllocator.getMaxSequence() + 1) << boostPower;
        this.ringBuffer = new RingBuffer(bufferSize, paddingFactor);
        LOGGER.info("Initialized ring buffer size:{}, paddingFactor:{}", bufferSize, paddingFactor);

        // initialize RingBufferPaddingExecutor
        boolean usingSchedule = (scheduleInterval != null);
        this.bufferPaddingExecutor = new BufferPaddingExecutor(ringBuffer, this::nextIdsForOneSecond, usingSchedule);
        if (usingSchedule) {
            bufferPaddingExecutor.setScheduleInterval(scheduleInterval);
        }
        LOGGER.info("Initialized BufferPaddingExecutor. Using schdule:{}, interval:{}", usingSchedule, scheduleInterval);

        // set rejected put/take handle policy
        this.ringBuffer.setBufferPaddingExecutor(bufferPaddingExecutor);
        if (rejectedPutBufferHandler != null) {
            this.ringBuffer.setRejectedPutHandler(rejectedPutBufferHandler);
        }
        if (rejectedTakeBufferHandler != null) {
            this.ringBuffer.setRejectedTakeHandler(rejectedTakeBufferHandler);
        }

        // fill in all slots of the RingBuffer
        bufferPaddingExecutor.paddingBuffer();

        // start buffer padding threads
        bufferPaddingExecutor.start();
    }
public synchronized boolean put(long uid) {
        long currentTail = tail.get();
        long currentCursor = cursor.get();

        // tail catches the cursor, means that you can't put any cause of RingBuffer is full
        long distance = currentTail - (currentCursor == START_POINT ? 0 : currentCursor);
        if (distance == bufferSize - 1) {
            rejectedPutHandler.rejectPutBuffer(this, uid);
            return false;
        }

        // 1. pre-check whether the flag is CAN_PUT_FLAG
        int nextTailIndex = calSlotIndex(currentTail + 1);
        if (flags[nextTailIndex].get() != CAN_PUT_FLAG) {
            rejectedPutHandler.rejectPutBuffer(this, uid);
            return false;
        }

        // 2. put UID in the next slot
        // 3. update next slot's flag to CAN_TAKE_FLAG
        // 4. publish tail with sequence increased by one
        slots[nextTailIndex] = uid;
        flags[nextTailIndex].set(CAN_TAKE_FLAG);
        tail.incrementAndGet();

        // The atomicity of the operations above is guaranteed by 'synchronized'. In other words,
        // the take operation can't consume the UID we just put until the tail is published (tail.incrementAndGet())
        return true;
    }
  • ID acquisition logic: because of the RingBuffer buffer array, IDs are taken directly from the RingBuffer, and the RingBuffer itself checks when batch generation should be triggered again. A notable difference from DefaultUidGenerator lies in the ID value obtained: in DefaultUidGenerator the timestamp part is the current time, whereas in CachedUidGenerator it is the time at filling, not at acquisition — but this matters little and the usage is the same. A simplified flowchart follows:

CachedUidGenerator acquisition process

Core code:

public long take() {
        // spin to get the next available cursor
        long currentCursor = cursor.get();
        long nextCursor = cursor.updateAndGet(old -> old == tail.get() ? old : old + 1);

        // check for safety consideration, it never occurs
        Assert.isTrue(nextCursor >= currentCursor, "Curosr can't move back");

        // trigger padding in an async-mode if we reach the threshold
        long currentTail = tail.get();
        if (currentTail - nextCursor < paddingThreshold) {
            LOGGER.info("Reach the padding threshold:{}. tail:{}, cursor:{}, rest:{}", paddingThreshold, currentTail,
                    nextCursor, currentTail - nextCursor);
            bufferPaddingExecutor.asyncPadding();
        }

        // cursor catches the tail, means that there is no more available UID to take
        if (nextCursor == currentCursor) {
            rejectedTakeHandler.rejectTakeBuffer(this);
        }

        // 1. check next slot flag is CAN_TAKE_FLAG
        int nextCursorIndex = calSlotIndex(nextCursor);
        Assert.isTrue(flags[nextCursorIndex].get() == CAN_TAKE_FLAG, "Curosr not in can take status");

        // 2. get UID from next slot
        // 3. set next slot flag as CAN_PUT_FLAG
        long uid = slots[nextCursorIndex];
        flags[nextCursorIndex].set(CAN_PUT_FLAG);

        // Note that steps 2 and 3 cannot be swapped. If we set the flag before getting the value of the slot,
        // the producer may overwrite the slot with a new UID, and this may cause the consumer to take the UID
        // twice after a full lap around the ring
        return uid;
    }

There is another detail worth noting: the RingBuffer stores its data in plain arrays, and the CPU cache structure was taken into account. If the tail and cursor variables used the native AtomicLong type directly, they might be cached in the same cache line, and multiple threads reading these variables could trigger RFO (Request For Ownership) requests on that cache line, hurting performance. To prevent this false-sharing problem, six member variables of type long are deliberately added as padding, so that together with AtomicLong's own long value field (and the object header) each variable fills exactly one cache line. This is called cache-line padding. The source code follows for anyone interested:

public class PaddedAtomicLong extends AtomicLong {
    private static final long serialVersionUID = -3415778863941386253L;

    /** Padded 6 long (48 bytes) */
    public volatile long p1, p2, p3, p4, p5, p6 = 7L;

    /**
     * Constructors from {@link AtomicLong}
     */
    public PaddedAtomicLong() {
        super();
    }

    public PaddedAtomicLong(long initialValue) {
        super(initialValue);
    }

    /**
     * To prevent GC optimizations for cleaning unused padded references
     */
    public long sumPaddingToPreventOptimization() {
        return p1 + p2 + p3 + p4 + p5 + p6;
    }
}

The above mainly introduces the Baidu uid-generator project. As we can see, the snowflake algorithm undergoes some changes when put into practice, mainly in machine-ID acquisition: in distributed cluster environments with auto-scaling instances and container technologies such as Docker, statically configuring datacenter IDs and instance IDs is not very feasible, so these are transformed to be identified by the number of startups instead.

Meituan ecp-uid project

For the uidGenerator part, the Meituan project directly integrates Baidu's source code, slightly converting some lambda expressions into more basic Java syntax, for example:

// initRingBuffer() method of the com.myzmds.ecp.core.uid.baidu.impl.CachedUidGenerator class
// Baidu source
this.bufferPaddingExecutor = new BufferPaddingExecutor(ringBuffer, this::nextIdsForOneSecond, usingSchedule);

// Meituan source
this.bufferPaddingExecutor = new BufferPaddingExecutor(ringBuffer, new BufferedUidProvider() {
    @Override
    public List<Long> provide(long momentInSecond) {
        return nextIdsForOneSecond(momentInSecond);
    }
}, usingSchedule);

And for machine-ID generation, it introduces components such as Zookeeper and Redis, enriching how machine IDs are generated and obtained: instance numbers can be stored and reused, no longer just a monotonically increasing database value.
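One common way to derive a reusable machine ID from Zookeeper is to create a sequential node and parse the number from the node name. The path layout and class name below are illustrative, and a real client such as Apache Curator would create the node; only the parsing step is sketched here:

```java
/** Sketch of deriving a worker ID from a ZooKeeper sequential node name.
 *  ZooKeeper appends a zero-padded counter to sequential nodes, e.g.
 *  "/uid/forever/worker-0000000042"; the numeric suffix becomes the ID. */
public class ZkWorkerId {
    static long parse(String sequentialNodePath) {
        int idx = sequentialNodePath.lastIndexOf('-');
        return Long.parseLong(sequentialNodePath.substring(idx + 1));
    }
}
```

Because the node can persist across restarts, the same instance can reclaim its previous ID — the "stored and reused" behavior described above, in contrast to the use-once database row.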

Conclusion

This article introduced the principle of the snowflake algorithm and the process of transforming it for production, studying excellent open-source code along the way and picking out parts as simple examples. The Meituan ecp-uid project integrates not only Baidu's existing UidGenerator algorithm and the native snowflake algorithm, but also the excellent Leaf segment algorithm, which space does not allow us to cover exhaustively. If anything in this article is incorrect or incomplete, please point it out in a comment — thank you.

 


Origin www.cnblogs.com/cider/p/11776088.html