Spring Cloud (16): Distributed unique ID for microservices

  • Distributed unique IDs
    • Features
    • Solutions
  • Snowflake algorithm
    • Features
    • Open-source implementations
    • Advantages and disadvantages
  • Alternative solutions
    • UUID
    • MongoDB ObjectId
    • Seata
    • Database generation
    • Redis
  • Meituan's Leaf distributed ID service
    • Leaf-segment database scheme
      Double-buffer optimization - large TP999 fluctuations
      Leaf high-availability disaster recovery - DB availability
    • Leaf-snowflake scheme
      Weak dependency on ZooKeeper
  • Fixing the clock problem
    • Compare against the system time of the other Leaf nodes
    • Periodically report the node's own system time to ZooKeeper
    • NTP synchronization issues on the machine

Features

  • Globally unique
  • Trend-increasing, ideally monotonically increasing
  • Information security (IDs should not leak business information)

Solutions

Snowflake algorithm

Snowflake divides a 64-bit integer into multiple segments that encode the machine, the time, and so on.

Features:

  • Bit 0: the sign bit (marks positive/negative), always 0, effectively unused
  • Bits 1~41: 41 bits in total, representing a timestamp in milliseconds; 2^41 milliseconds is about 69 years
  • Bits 42~51: 10 bits in total. Typically the first 5 bits are the datacenter (IDC) ID and the last 5 bits are the machine ID (the split can be adjusted to the actual project), which distinguishes nodes in different clusters/datacenters: 32 IDCs can be represented, each with up to 32 machines
  • Bits 52~63: 12 bits in total, the sequence number. It self-increments within a millisecond, so a single machine can generate at most 2^12 = 4096 unique IDs per millisecond

Theoretically, the Snowflake scheme can sustain about 4.096 million IDs per second. This bit allocation guarantees that the IDs generated by any machine in any IDC within any given millisecond are distinct.
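As a concrete illustration, here is a minimal Java sketch of the 1+41+10+12 layout described above. The epoch constant, field names, and the spin-wait on sequence overflow are illustrative choices, not any particular library's implementation:

```java
// Minimal Snowflake-style generator following the 1+41+10+12 bit layout.
class Snowflake {
    private static final long EPOCH = 1288834974657L; // custom start time (illustrative)
    private static final long DATACENTER_BITS = 5, WORKER_BITS = 5, SEQUENCE_BITS = 12;
    private static final long MAX_SEQUENCE = (1L << SEQUENCE_BITS) - 1; // 4095

    private final long datacenterId, workerId;
    private long lastTimestamp = -1L;
    private long sequence = 0L;

    Snowflake(long datacenterId, long workerId) {
        this.datacenterId = datacenterId;
        this.workerId = workerId;
    }

    synchronized long nextId() {
        long now = System.currentTimeMillis();
        if (now < lastTimestamp) {
            // clock moved backwards: refuse to issue (see the clock problem below)
            throw new IllegalStateException("clock moved backwards");
        }
        if (now == lastTimestamp) {
            sequence = (sequence + 1) & MAX_SEQUENCE;
            if (sequence == 0) {                 // 4096 IDs used up in this millisecond
                while (now <= lastTimestamp) {   // spin until the next millisecond
                    now = System.currentTimeMillis();
                }
            }
        } else {
            sequence = 0L;
        }
        lastTimestamp = now;
        return ((now - EPOCH) << (DATACENTER_BITS + WORKER_BITS + SEQUENCE_BITS))
                | (datacenterId << (WORKER_BITS + SEQUENCE_BITS))
                | (workerId << SEQUENCE_BITS)
                | sequence;
    }
}
```

Successive calls on one instance yield strictly increasing IDs, matching the trend-increasing property claimed above.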

Open-source implementations based on the Snowflake algorithm

  • Meituan's Leaf
  • Baidu's UidGenerator (essentially unmaintained since 2018, https://github.com/baidu/uid-generator/blob/master/README.zh_cn.md)

These open-source implementations optimize the original Snowflake algorithm. In real projects we usually also modify the Snowflake algorithm; the most common change is to embed business-type information in the generated ID

Snowflake pros and cons

Advantages:

  • The millisecond count occupies the high bits and the self-incrementing sequence the low bits, so the whole ID is trend-increasing
  • It does not rely on third-party systems such as a database; deployed as a service, it has high stability and high ID-generation performance
  • Bits can be allocated according to the business's own characteristics, which is very flexible

Disadvantages:

  • It strongly depends on the machine clock: if the clock is moved backwards, IDs may repeat or the service may become unavailable

Alternative solutions

UUID

UUID.randomUUID() produces a 36-character string in the 8-4-4-4-12 form, e.g. f75d0fbf-77ce-47d0-a2b3-0a7ef4a410b2
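A quick sketch confirming the shape (the class name here is illustrative):

```java
import java.util.UUID;

// java.util.UUID.randomUUID() returns a random (version 4) UUID:
// 36 characters, i.e. 32 hex digits plus 4 hyphens in 8-4-4-4-12 groups.
class UuidShape {
    static String next() {
        return UUID.randomUUID().toString();
    }
}
```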

Advantages:

  • Very high performance: generated locally, no network overhead

Disadvantages:

  • Hard to store: a UUID is 16 bytes (128 bits), usually represented as a 36-character string, which is unsuitable in many scenarios
  • Information insecurity: the MAC-address-based UUID generation algorithm may leak the MAC address; this weakness was once used to locate the author of the Melissa virus
  • As a primary key, UUIDs cause problems in certain environments; for the DB primary-key scenario in particular, UUID is a poor fit:
    • MySQL officially recommends that primary keys be as short as possible, and a 36-character UUID does not qualify
    • Bad for MySQL indexing: used as a database primary key under the InnoDB engine, the disorder of UUIDs can cause data locations to change frequently, seriously hurting performance. MySQL's InnoDB engine uses a clustered index, and since most RDBMSs store index data in a B-tree structure, we should prefer ordered primary keys to guarantee write performance

MongoDB ObjectId

An ObjectId is 12 bytes composed as "time + machine code + pid + counter" in a 4+3+2+3 byte layout, finally rendered as a 24-character hexadecimal string
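The 4+3+2+3 layout can be sketched as follows; the field values are caller-supplied assumptions here, whereas real MongoDB drivers derive them from the environment:

```java
import java.nio.ByteBuffer;

// Sketch of the ObjectId byte layout: 4-byte timestamp + 3-byte machine id
// + 2-byte process id + 3-byte counter = 12 bytes = 24 hex characters.
class ObjectIdSketch {
    static String next(long epochSeconds, int machine, int pid, int counter) {
        ByteBuffer bb = ByteBuffer.allocate(12);
        bb.putInt((int) epochSeconds);                                     // 4-byte timestamp (seconds)
        bb.put((byte) (machine >> 16)).put((byte) (machine >> 8)).put((byte) machine); // 3-byte machine id
        bb.putShort((short) pid);                                          // 2-byte process id
        bb.put((byte) (counter >> 16)).put((byte) (counter >> 8)).put((byte) counter); // 3-byte counter
        StringBuilder hex = new StringBuilder(24);
        for (byte b : bb.array()) hex.append(String.format("%02x", b));
        return hex.toString();
    }
}
```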

Seata

Seata ships a built-in distributed UUID generator to help produce global transaction IDs and branch transaction IDs, and we can use it too. The full class name is io.seata.common.util.IdWorker

Database generation

  1. Create a database table
CREATE TABLE `sequence_id` (
`id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`stub` char(10) NOT NULL DEFAULT '',
PRIMARY KEY (`id`),
UNIQUE KEY `stub` (`stub`) COMMENT 'the stub field carries no meaning, it is only a placeholder; the unique index on stub guarantees its uniqueness'
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
  2. Insert data with replace into
BEGIN;
REPLACE INTO sequence_id (stub) VALUES ('stub');
SELECT LAST_INSERT_ID();
COMMIT;

REPLACE is an enhanced version of INSERT: replace into first tries to insert the row into the table

  • If the row is found to already exist (judged by the primary key or a unique index), that row is deleted first and then the new row is inserted
  • Otherwise, the new row is inserted directly

Advantages:

  • Very simple; implemented with the existing database system's features at low cost, and maintained professionally by DBAs
  • The ID numbers increase monotonically and take little storage space

Disadvantages:

  • The supported concurrency is not high
  • There is a database single point of failure (a database cluster can solve it, but adds complexity)
  • The ID carries no specific business meaning
  • Security issues (for example, daily order volume, a business secret, can be deduced from the increment pattern of order IDs!)
  • Every ID acquisition hits the database once (adding pressure on the database, and acquisition is slow)

MySQL's performance problem can be mitigated as follows:

In a distributed system we can deploy several machines, giving each machine a different initial value and a step size equal to the number of machines. For example, with two machines, set the step size to 2

  • TicketServer1 starts at 1 (1, 3, 5, 7, 9, 11...)
  • TicketServer2 starts at 2 (2, 4, 6, 8, 10...)
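The scheme above can be sketched as a trivial in-memory generator (in a real deployment each MySQL instance would instead be configured with auto_increment_offset and auto_increment_increment; the class name is illustrative):

```java
// Each server issues initial, initial + step, initial + 2*step, ...
// With distinct initial values and step == number of servers, the
// sequences never collide.
class TicketServer {
    private long current;
    private final long step;

    TicketServer(long initial, long step) {
        this.current = initial - step; // so the first nextId() returns `initial`
        this.step = step;
    }

    synchronized long nextId() {
        current += step;
        return current;
    }
}
```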

Disadvantages:

  • The system is hard to scale horizontally: once the step size and machine count are fixed, adding machines is difficult
  • IDs are no longer monotonically increasing, only trend-increasing; for most business requirements this shortcoming is tolerable
  • The pressure on the database is still high: every ID acquisition reads and writes the database once, and performance can only be improved by adding machines

Redis

IDs can be atomically and sequentially incremented with the Redis INCR command. To improve availability and concurrency, we can use Redis Cluster

Advantages:

  • Redis is memory-based, so the scheme performs well, and the generated IDs are sequentially increasing. Note that data must be persisted to avoid loss after a machine restart or failure

Disadvantages:

  • Even with persistence enabled, whether snapshotting (RDB), append-only file (AOF), or hybrid RDB+AOF persistence, data loss is still possible, which means the generated IDs may repeat with some probability

Meituan's Leaf distributed ID service

Leaf-segment database scheme - generates trend-increasing IDs, and the ID numbers are computable

DB tuning:
Fetch in batches, obtaining one segment (whose size is decided by step) of IDs at a time. A new segment is fetched from the database only after the current one is used up, which greatly reduces the pressure on the database

Business tuning:
Different ID-issuing needs of different businesses are distinguished by the biz_tag field; ID acquisition for each biz_tag is isolated and does not affect the others. If the database later needs to be scaled out for performance, the complex expansion operations described above are unnecessary; only the biz_tag data needs to be sharded across databases and tables

Suppose there are 3 machines and each takes 1000 numbers at a time

  • The first machine gets the range 1 to 1000
  • The second machine gets the range 1001 to 2000
  • The third machine gets the range 2001 to 3000

When a segment is used up, another segment of length step=1000 is loaded. Assuming the other two segments have not been refreshed, the first machine's newly loaded segment would be 3001~4000

At the same time, max_id in the database row for that biz_tag is updated from 3000 to 4000

BEGIN;
UPDATE table SET max_id = max_id + step WHERE biz_tag = xxx;
SELECT tag, max_id, step FROM table WHERE biz_tag = xxx;
COMMIT;
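The segment mechanism can be sketched in Java with an in-memory map standing in for the database table; the map update simulates the UPDATE ... max_id = max_id + step transaction, and the class and field names are illustrative:

```java
import java.util.HashMap;
import java.util.Map;

// Leaf-segment sketch: IDs are issued from an in-memory range [current+1, max];
// the "database" is only touched when the range is exhausted.
class SegmentAllocator {
    private final Map<String, Long> db = new HashMap<>(); // stand-in for the sequence table
    private final String bizTag;
    private final long step;
    private long current = 0, max = 0;

    SegmentAllocator(String bizTag, long step) {
        this.bizTag = bizTag;
        this.step = step;
        db.put(bizTag, 0L);
    }

    // Simulates: UPDATE ... SET max_id = max_id + step; SELECT max_id ...
    private void fetchSegment() {
        long newMax = db.get(bizTag) + step;
        db.put(bizTag, newMax);
        current = newMax - step;
        max = newMax;
    }

    synchronized long nextId() {
        if (current >= max) fetchSegment(); // segment exhausted: one round trip to the "DB"
        return ++current;
    }
}
```

With step=1000, only one database round trip is needed per 1000 IDs, which is where the pressure reduction comes from.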

Advantages:

  • Leaf services can easily scale linearly, and the performance is sufficient for most business scenarios
  • The ID is an 8-byte, 64-bit trend-increasing number, which satisfies the database primary-key requirements discussed above
  • High disaster tolerance: the Leaf service caches segments internally, so even if the DB goes down, Leaf can keep serving for a short period
  • The size of max_id can be customized, which makes migrating a business from a pre-existing ID scheme very convenient

Disadvantages:

  • The ID numbers are not random enough and leak information about how many IDs have been issued, which is not very safe
  • TP999 data fluctuates greatly: when a segment is used up, fetching the new one still waits on I/O to the database, so occasional latency spikes (a sudden surge of pressure) appear in the TP999 data
  • A DB outage makes the whole system unavailable

Double-buffer optimization - large TP999 fluctuations

Ideally, fetching segments from the DB should be non-blocking: request threads should not block while the DB is being queried. That is, once a segment is consumed past a certain point, the next segment is asynchronously loaded into memory, without waiting for the current segment to be exhausted. This greatly improves the system's TP999 metric

With the double-buffer approach, the Leaf service holds two segment buffers internally

  • When 10% of the current segment has been issued and the next segment has not been loaded yet, a separate update thread is started to load the next segment
  • When the current segment is fully issued and the next segment is ready, Leaf switches to the next segment as the current one and continues issuing, repeating the cycle

It is usually recommended to set the segment length to 600 times the service's peak QPS (10 minutes' worth), so that even if the DB goes down, Leaf can keep issuing numbers for 10-20 minutes without impact
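A simplified, single-threaded sketch of the double-buffer idea. Real Leaf loads the next segment in a background thread; here the preload happens synchronously at the 10% mark for clarity, and all names are illustrative:

```java
// Double-buffer sketch: `current` is the segment being issued ({cursor, max}),
// `next` is the preloaded spare segment. The DB is a single counter here.
class DoubleBufferAllocator {
    private long dbMaxId = 0;   // stand-in for the database row's max_id
    private final long step;
    private long[] current;     // {cursor, max}
    private long[] next;        // preloaded segment, or null

    DoubleBufferAllocator(long step) {
        this.step = step;
        this.current = fetch();
    }

    // Simulates the UPDATE/SELECT round trip to the database.
    private long[] fetch() {
        dbMaxId += step;
        return new long[] { dbMaxId - step, dbMaxId };
    }

    synchronized long nextId() {
        long used = current[0] - (current[1] - step);
        if (next == null && used * 10 >= step) {
            next = fetch();     // Leaf does this in a background update thread
        }
        if (current[0] >= current[1]) {              // current segment exhausted
            current = next != null ? next : fetch(); // switch to the preloaded buffer
            next = null;
        }
        return ++current[0];
    }
}
```

Because the spare segment is already in memory when the switch happens, the request path never waits on the database round trip.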

Leaf high-availability disaster recovery - DB availability

  • One master and two slaves
  • Cross-datacenter deployment
  • Semi-synchronous replication is used between master and slave. This scheme can degenerate into asynchronous mode in some cases, and in very extreme cases can still cause data inconsistency, but the probability is very small
  • To guarantee 100% strong data consistency, a Paxos-like algorithm can be used to build a strongly consistent MySQL solution

Leaf-snowflake scheme

The Leaf-snowflake scheme fully follows the bit design of the Snowflake scheme, i.e. the ID is assembled as "1+41+10+12"

For workerID allocation, manual configuration of each Leaf service is entirely feasible when the cluster is small. When the Leaf cluster is large, the cost of manual configuration is too high, so ZooKeeper's persistent sequential nodes are used to automatically assign a workerID to each snowflake node

Leaf-snowflake starts up with the following steps:

  1. Start the Leaf-snowflake service, connect to ZooKeeper, and check whether this node has already registered under the leaf_forever parent node (i.e. whether it has a child sequential node)
  2. If already registered, read back its workerID (the int sequence number generated by the ZooKeeper sequential node) and start the service
  3. If not yet registered, create a persistent sequential node under the parent node, take the returned sequence number as the workerID, and start the service

Weak dependency on ZooKeeper

Besides fetching data from ZooKeeper on each startup, a workerID file is also cached on the local file system. If ZooKeeper has a problem just when the machine needs to restart, the cached file still lets the service start normally. This achieves only a weak dependency on the third-party component

Fixing the clock problem

1. A new node checks whether its own system time is accurate by comparing it against the system time of the other Leaf nodes

  1. Obtain the service IP:Port of every running Leaf-snowflake node
  2. Get each node's system time via RPC requests
  3. Compute sum(time)/nodeSize
  4. Check whether the local time is within a threshold of this average
  5. If accurate, start the service normally
  6. If inaccurate, fail the startup and raise an alarm
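The check in steps 3-4 amounts to a threshold comparison against the peer average; a minimal sketch (the threshold value and class name are assumptions, not Leaf's actual constants):

```java
// Startup clock sanity check: local time must be within `thresholdMillis`
// of the average of the peer nodes' reported times.
class ClockCheck {
    static boolean isAccurate(long localMillis, long[] peerMillis, long thresholdMillis) {
        long sum = 0;
        for (long t : peerMillis) sum += t;
        long avg = sum / peerMillis.length;   // sum(time) / nodeSize
        return Math.abs(localMillis - avg) <= thresholdMillis;
    }
}
```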

2. While running, each node periodically reports its own system time and writes it to ZooKeeper

An old node already registered in ZooKeeper also compares its own system time with the time it previously recorded on ZooKeeper and with the time of all running Leaf-snowflake nodes; if it is inaccurate, it likewise fails to start and raises an alarm

3. While the service is running, NTP synchronization on the machine can also cause second-level clock rollbacks. Because of the strong dependence on the clock, the scheme is sensitive to time; possible countermeasures:

  1. Turn off NTP synchronization entirely
  2. When a clock rollback occurs, stop serving and return an ERROR_CODE directly
  3. Add a layer of retry, then report to the alarm system; or, upon detecting a clock rollback, automatically remove the node itself and raise an alarm


Origin blog.csdn.net/menxu_work/article/details/128238895