Analysis of Unique ID Generation Schemes in Distributed Systems



In complex distributed systems, it is often necessary to uniquely identify large volumes of data and messages. A business ID generally needs to meet the following requirements:

  • Global uniqueness: duplicate ID numbers must never appear. Since it is a unique identifier, this is the most basic requirement.
  • Increasing trend: MySQL's InnoDB engine uses a clustered index, and most RDBMSs store index data in B+-tree structures, so we should prefer ordered primary keys to guarantee write performance.
  • Monotonically increasing: guarantee that the next ID is greater than the previous one. Scenarios such as transaction version numbers, incremental IM messages, and sorting have this special requirement.
  • Information security: if IDs are sequential, it is very easy for malicious users to scrape data by simply requesting URLs in order. If the ID is an order number, it is even more dangerous: competitors can directly infer our daily order volume. Therefore, in some application scenarios, IDs need to be irregular and unpredictable.

UUID

The standard form of a UUID (Universally Unique Identifier) contains 32 hexadecimal digits, divided by hyphens into five groups in the form 8-4-4-4-12, for 36 characters in total, for example: 550e8400-e29b-41d4-a716-446655440000. To date, the industry defines 5 versions of UUID generation; for details see the UUID specification published by the IETF, RFC 4122, A Universally Unique IDentifier (UUID) URN Namespace.

Advantages:

  • Very high performance: local generation, no network consumption.

Disadvantages:

  • Not easy to store: a UUID is 16 bytes (128 bits) and is usually represented as a 36-character string, which is too long for many scenarios.

  • Information insecurity: the algorithm that generates UUIDs from the MAC address may leak that address. This weakness was once used to track down the creator of the Melissa virus.

When a UUID is used as a primary key, problems arise in specific environments. For example, as a database primary key, UUID is quite unsuitable:

  • MySQL's official documentation clearly recommends that the primary key be as short as possible; a 36-character UUID does not meet this requirement:

All indexes other than the clustered index are known as secondary indexes. In InnoDB, each record in a secondary index contains the primary key columns for the row, as well as the columns specified for the secondary index. InnoDB uses this primary key value to search for the row in the clustered index. If the primary key is long, the secondary indexes use more space, so it is advantageous to have a short primary key.

  • Unfavorable for MySQL indexes: used as a primary key under the InnoDB engine, the disorder of UUIDs causes frequent changes in data position (page splits and row movement), which seriously hurts performance.
class UUID {
    // Version-4 (random) UUID generator.
    public static function v4() {
        return sprintf('%04x%04x-%04x-%04x-%04x-%04x%04x%04x',

            // 32 bits for "time_low"
            mt_rand(0, 0xffff), mt_rand(0, 0xffff),

            // 16 bits for "time_mid"
            mt_rand(0, 0xffff),

            // 16 bits for "time_hi_and_version":
            // the four most significant bits hold version number 4
            mt_rand(0, 0x0fff) | 0x4000,

            // 16 bits: 8 bits for "clk_seq_hi_res", 8 bits for "clk_seq_low";
            // the two most significant bits hold 1 and 0 for the DCE 1.1 variant
            mt_rand(0, 0x3fff) | 0x8000,

            // 48 bits for "node"
            mt_rand(0, 0xffff), mt_rand(0, 0xffff), mt_rand(0, 0xffff)
        );
    }
}

Snowflake

This class of schemes generates IDs by dividing up a namespace (UUID also counts, but it is analyzed separately above because it is so common). The 64 bits are split into several segments that separately mark the machine, the timestamp, and so on. For example, the 64 bits of a Snowflake ID are laid out as shown in the following figure (picture from the Internet):

Figure: Snowflake 64-bit layout

The 41-bit timestamp can represent (1L << 41) / (1000L * 3600 * 24 * 365) ≈ 69 years, and the 10 machine bits can represent 1024 machines. If we need to partition by IDC, we can also split the 10 bits into 5 bits for the IDC and 5 bits for the worker machine; this represents 32 IDCs with 32 machines each, and the split can be defined according to your own needs. The 12-bit self-incrementing sequence can represent 2^12 = 4096 IDs per millisecond, so in theory the QPS of the Snowflake scheme is about 4.096 million/s. This layout guarantees that IDs generated by any machine in any IDC within any millisecond are all different.
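A minimal Snowflake-style generator can sketch this layout. The epoch constant and bit split below are illustrative (Twitter's original 41/10/12 layout), not part of the original article:

```python
import time
import threading

EPOCH_MS = 1288834974657  # assumed custom epoch; any fixed past time works

class Snowflake:
    def __init__(self, machine_id):
        assert 0 <= machine_id < 1024  # 10 bits for the machine
        self.machine_id = machine_id
        self.last_ms = -1
        self.seq = 0
        self.lock = threading.Lock()

    def next_id(self):
        with self.lock:
            now = int(time.time() * 1000)
            if now < self.last_ms:
                # Clock moved backwards: refuse to issue IDs to avoid duplicates.
                raise RuntimeError("clock moved backwards")
            if now == self.last_ms:
                self.seq = (self.seq + 1) & 0xFFF  # 12-bit sequence
                if self.seq == 0:
                    # Sequence exhausted within this millisecond: spin to the next.
                    while now <= self.last_ms:
                        now = int(time.time() * 1000)
            else:
                self.seq = 0
            self.last_ms = now
            # 41 bits timestamp | 10 bits machine | 12 bits sequence
            return ((now - EPOCH_MS) << 22) | (self.machine_id << 12) | self.seq
```

Because the millisecond timestamp occupies the high bits, IDs from one generator are strictly increasing; the clock-rollback check above is exactly the weakness discussed below.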

The pros and cons of this approach are:

Advantages:

The millisecond timestamp occupies the high bits and the self-incrementing sequence the low bits, so the ID as a whole trends upward.

It does not rely on third-party systems such as a database. Deployed as a service, it has higher stability and high ID-generation performance.

Bits can be allocated according to your own business characteristics, which is very flexible.

Disadvantages:

It strongly depends on the machine clock: if the clock on a machine is rolled back, it will generate duplicate IDs or the service will become unavailable.

Application example, MongoDB ObjectId:

According to MongoDB's official documentation, ObjectId can be regarded as a scheme similar to Snowflake: "time + machine code + pid + counter" together make 12 bytes in a 4+3+2+3 layout, finally rendered as a 24-character hexadecimal string.
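The 4+3+2+3 byte layout can be sketched as follows. This is an illustration of the packing described above, not MongoDB's exact implementation; the machine-code derivation and hostname are assumptions:

```python
import hashlib
import os
import struct
import threading
import time

_counter = int.from_bytes(os.urandom(3), "big")  # randomly seeded 3-byte counter
_lock = threading.Lock()
# 3-byte machine code, assumed here to be a hash of the hostname
_machine = hashlib.md5("my-host".encode()).digest()[:3]

def object_id():
    global _counter
    with _lock:
        _counter = (_counter + 1) % (1 << 24)
        count = _counter
    raw = (
        struct.pack(">I", int(time.time()))        # 4 bytes: seconds since epoch
        + _machine                                  # 3 bytes: machine code
        + struct.pack(">H", os.getpid() & 0xFFFF)   # 2 bytes: process id
        + count.to_bytes(3, "big")                  # 3 bytes: incrementing counter
    )
    return raw.hex()  # 12 bytes -> 24 hex characters
```

The counter gives uniqueness within one second on one process, while the machine code and pid distinguish generators, mirroring Snowflake's segment idea at byte granularity.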

Database generation

Taking MySQL as an example: set auto_increment_increment and auto_increment_offset for the field to guarantee that IDs auto-increment, and each business obtains ID numbers by reading and writing MySQL with a short SQL statement.
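The effect of auto_increment_increment / auto_increment_offset can be shown with a toy simulation: with increment equal to the number of masters and a distinct offset per master, each master issues a disjoint, increasing ID stream. The class below is a sketch of that behavior, not a MySQL client:

```python
class AutoIncrementMaster:
    """Simulates one MySQL master configured with
    auto_increment_offset=offset and auto_increment_increment=increment."""

    def __init__(self, offset, increment):
        self.next = offset
        self.increment = increment

    def get_id(self):
        value = self.next
        self.next += self.increment  # each master strides over the shared ID space
        return value

# Two masters: offsets 1 and 2 with increment 2 -> odd IDs vs. even IDs.
m1 = AutoIncrementMaster(offset=1, increment=2)
m2 = AutoIncrementMaster(offset=2, increment=2)
```

This is why the scheme survives multiple masters without collisions, at the cost of tying the number of masters to the increment setting.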

The advantages and disadvantages of this scheme are as follows:

Advantages:

  • It is very simple: it reuses the functionality of an existing database system, the cost is low, and DBAs maintain it professionally.
  • The ID numbers increase monotonically, which supports services with special ordering requirements on IDs.

Disadvantages:

  • It strongly depends on the DB: when the DB is abnormal, the entire system is unavailable, which is a fatal problem. Configuring master-slave replication increases availability as much as possible, but data consistency is hard to guarantee in corner cases; inconsistency during a master-slave switchover may produce duplicate IDs.
  • The throughput of ID issuance is bottlenecked by the read/write performance of a single MySQL instance.

WeChat seqsvr

Setting aside seqsvr's concrete architecture, it can be thought of as one huge array of 64-bit values, in which every WeChat user exclusively owns an 8-byte slot. That slot stores the last sequence allocated to the user: cur_seq. When a user applies for a sequence, we only need to do cur_seq += 1, save it back into the array, and return it to the user.

Figure 1. Xiao Ming applied for a sequence and got 101 back

Pre-allocated middle layer:

Anything that looks simple becomes hard under massive traffic. As mentioned above, seqsvr must guarantee that the allocated sequences increase (data reliability) while serving an enormous request volume (nearly a trillion requests per day). For reliability it is natural to think of persisting the data to disk, but at the current access volume of tens of millions per second (~10^7 QPS), essentially no disk system could keep up.

Backend architecture design is often a philosophy of trade-offs: for a given scenario, consider whether the requirements in some dimension can be relaxed in exchange for improvements elsewhere. Examining our requirements carefully, we only require increasing sequences, not contiguous ones, which means large jumps are allowed (for example, the allocated sequence 1, 2, 3, 10, 100, 101). So we implemented a simple and elegant strategy:

  1. Keep the last allocated sequence, cur_seq, and the allocation upper bound, max_seq, in memory.
  2. When allocating a sequence, increment cur_seq and compare it with the upper bound max_seq: if cur_seq > max_seq, raise the upper bound with max_seq += step and persist max_seq.
  3. On restart, read the persisted max_seq and assign it to cur_seq.
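The three steps above can be sketched in a few lines. This is a minimal illustration under the stated rules; the persist callback stands in for a real disk or storage write:

```python
class SeqAllocator:
    def __init__(self, step, persisted_max_seq, persist):
        self.step = step
        self.persist = persist
        # Step 3: on restart, resume from the persisted upper bound. This may
        # skip sequences, which is fine since only monotonicity is required.
        self.cur_seq = persisted_max_seq
        self.max_seq = persisted_max_seq

    def alloc(self):
        self.cur_seq += 1
        if self.cur_seq > self.max_seq:     # step 2: in-memory window exhausted
            self.max_seq += self.step
            self.persist(self.max_seq)      # one disk write per `step` allocations
        return self.cur_seq
```

With step = 100, allocating 250 sequences triggers only 3 persistence writes; this is the ~10^7 QPS to ~10^3 QPS reduction described below.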

Figure 2. Xiaoming, Xiaohong, and Xiaobai each applied for a sequence, but only Xiaobai's request raised max_seq by the step size of 100

In this way, by adding a middle layer that pre-allocates sequences, allocation performance is greatly improved while guaranteeing that sequences never roll back. In practice, the step size is 10000, so persistent disk IO drops from ~10^7 QPS to ~10^3 QPS, which is within an acceptable range. During normal operation the allocated sequences increase contiguously; only the first sequence allocated after a machine restart has a relatively large jump, whose size depends on the step size.

Section-shared storage:

The disk IO problem caused by requests is solved and the service runs smoothly, but this model still has a problem: on restart, a large amount of max_seq data must be read and loaded into memory.

A simple calculation: with the current upper limit of 2^32 uids (unique user IDs) and 8 bytes of max_seq per uid, the total data size is 32 GB, which takes a long time to load from disk. Moreover, for data reliability a dependable storage system is needed to hold the max_seq data, and on restart the data is loaded from that system over the network; if the max_seq data is too large, the transfer takes so long that the service is unavailable for a period.

To solve this problem, we introduce the concept of a Section: a range of users with adjacent uids belongs to one Section, and users in the same Section share one max_seq, which greatly reduces both the size of the max_seq data and the IO frequency.

Figure 3. Xiaoming, Xiaohong, and Xiaobai belong to the same Section and share one max_seq. When each of them applies for a sequence, only Xiaobai's request crosses the max_seq upper bound and requires updating and persisting max_seq

Currently each seqsvr Section contains 100,000 uids, so the max_seq data is only 300+ KB, which makes it feasible to load the max_seq data from a reliable storage system on restart.
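Section sharing can be sketched by keying max_seq by section instead of by uid. The section size and step below are illustrative, and restart behavior is omitted for brevity:

```python
SECTION_SIZE = 100_000  # uids per Section, as in the text

class SectionSeqSvr:
    def __init__(self, step):
        self.step = step
        self.cur_seq = {}  # per-uid last allocated sequence (the big array)
        self.max_seq = {}  # per-Section upper bound: the only persisted part

    def alloc(self, uid):
        section = uid // SECTION_SIZE  # adjacent uids share one Section
        seq = self.cur_seq.get(uid, 0) + 1
        if seq > self.max_seq.get(section, 0):
            # Only when a user crosses the shared bound is max_seq raised
            # (and, in the real system, persisted) for the whole Section.
            self.max_seq[section] = self.max_seq.get(section, 0) + self.step
        self.cur_seq[uid] = seq
        return seq
```

Note that sequences remain per-user; only the persisted upper bound is shared, which is what shrinks 32 GB of per-uid max_seq down to a few hundred KB.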




Origin blog.csdn.net/finish_dream/article/details/102527388