Distributed unique ID generator

Applications often need a globally unique ID to use as a database primary key. How should such an ID be generated?

First, decide whether the ID should be an integer or a string. If a string is acceptable, the existing UUID fully meets the need with no extra work. The drawback is that a string ID takes more space and is less efficient to index than an integer.
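
As a minimal sketch of the string option, the standard java.util.UUID class can produce such an ID directly. The class and method names UuidExample and newStringId are illustrative only and are not part of the original article.

```java
import java.util.UUID;

class UuidExample {
    // Returns a random (version 4) UUID in its canonical 36-character text form.
    // Compare that size with the 8 bytes of a long when used as an index key.
    static String newStringId() {
        return UUID.randomUUID().toString();
    }
}
```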

If the ID is an integer, a 32-bit int is ruled out first because its range is too small, so a 64-bit long is required.

With an integer ID, how do you build one that keeps increasing and is globally unique?

Scheme 1: use the database's auto-increment ID. It starts at 1 and increases more or less continuously. Oracle offers SEQUENCE, and a MySQL primary key can use AUTO_INCREMENT. Neither guarantees global uniqueness, but the IDs are unique within each table, which basically meets the need.

The drawback of a database auto-increment ID is that the ID is not available before the row is inserted. After the insert the ID is unique, but it only counts as valid once the transaction commits. Records that reference each other have to be inserted first and then updated, which is cumbersome.

Scheme 2: use a centralized ID generator. It can be Redis, ZooKeeper, or a database table that records the last ID assigned.

The biggest drawback of this approach is its complexity: it depends heavily on a third-party service, and the code and configuration are cumbersome. In general, the more complex a solution is, the less reliable it is and the more painful it is to test.

Scheme 3: a Twitter Snowflake-style algorithm. Each machine is assigned a unique identifier, and a globally unique ID is built from timestamp + machine identifier + increment. The advantage is that ID generation is entirely local and stateless: there are no network calls, so it is efficient and reliable. The drawback is that if two machines end up with the same identifier, their IDs will collide.

Snowflake uses a 41-bit millisecond timestamp, a 10-bit machine ID, and a 12-bit sequence number. In theory that supports up to 1024 machines, each generating 4,096,000 sequence numbers per second, which is enough for Twitter's scale.
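
The bit layout just described can be sketched as follows. This is an illustration of the 41 + 10 + 12 split only, not Twitter's implementation; the names SnowflakeLayout, compose and idsPerSecondPerMachine are hypothetical, and timestampMillis is assumed to be already relative to whatever epoch the system has chosen.

```java
class SnowflakeLayout {
    static final int MACHINE_BITS = 10;   // 2^10 = 1024 machines
    static final int SEQUENCE_BITS = 12;  // 2^12 = 4096 sequence numbers per millisecond
    static final long MAX_MACHINE = (1L << MACHINE_BITS) - 1;   // 1023
    static final long MAX_SEQUENCE = (1L << SEQUENCE_BITS) - 1; // 4095

    // Packs the fields as [41-bit timestamp][10-bit machine ID][12-bit sequence].
    static long compose(long timestampMillis, long machineId, long sequence) {
        if (machineId < 0 || machineId > MAX_MACHINE) {
            throw new IllegalArgumentException("machineId out of range: " + machineId);
        }
        if (sequence < 0 || sequence > MAX_SEQUENCE) {
            throw new IllegalArgumentException("sequence out of range: " + sequence);
        }
        return (timestampMillis << (MACHINE_BITS + SEQUENCE_BITS))
                | (machineId << SEQUENCE_BITS)
                | sequence;
    }

    // 4096 sequence numbers per millisecond times 1000 milliseconds.
    static long idsPerSecondPerMachine() {
        return (1L << SEQUENCE_BITS) * 1000;
    }
}
```

The per-second figure follows from the sequence width alone: 4096 values in each of 1000 milliseconds gives 4,096,000.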

Most ordinary applications, however, do not need more than 4 million IDs per second, and will never reach 1024 machines. So the scheme can be adapted to produce a shorter ID:

A 53-bit ID with second-level resolution, composed of a 32-bit timestamp in seconds + a 16-bit increment + a 5-bit machine identifier. It supports 32 machines in total, each able to generate about 65,000 sequence numbers per second. The core code:

private static synchronized long nextId(long epochSecond) {
    if (epochSecond < lastEpoch) {
        // The clock moved backwards: log it and keep using the last known second.
        logger.warn("clock is back: " + epochSecond + " from previous:" + lastEpoch);
        epochSecond = lastEpoch;
    }
    if (lastEpoch != epochSecond) {
        // A new second has started: record it and reset the counter.
        lastEpoch = epochSecond;
        reset();
    }
    offset++;
    long next = offset & MAX_NEXT;
    if (next == 0) {
        // The counter for this second is used up: borrow from the next second.
        logger.warn("maximum id reached in 1 second in epoch: " + epochSecond);
        return nextId(epochSecond + 1);
    }
    return generateId(epochSecond, next, SHARD_ID);
}
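
The method above calls generateId and masks with MAX_NEXT without showing either. A minimal sketch of how they might look, assuming the 32 + 16 + 5 layout described earlier, is given here. The class name ShortIdLayout, the decompose helper and the EPOCH_OFFSET value of 0 are assumptions made for illustration, since the article only says that a fixed value is subtracted from the timestamp.

```java
class ShortIdLayout {
    static final int SHARD_BITS = 5;   // 2^5 = 32 machines
    static final int NEXT_BITS = 16;   // 2^16 = 65,536 counter values per second
    static final long MAX_NEXT = (1L << NEXT_BITS) - 1;    // 0xFFFF, the mask applied to the counter
    static final long MAX_SHARD = (1L << SHARD_BITS) - 1;  // 31

    // Assumption: the fixed value subtracted from the timestamp. 0 means "count from 1970".
    static final long EPOCH_OFFSET = 0L;

    // Packs [32-bit seconds][16-bit counter][5-bit shard] into one long.
    static long generateId(long epochSecond, long next, long shardId) {
        return ((epochSecond - EPOCH_OFFSET) << (NEXT_BITS + SHARD_BITS))
                | (next << SHARD_BITS)
                | shardId;
    }

    // Splits an ID back into { seconds since EPOCH_OFFSET, counter, shard }.
    static long[] decompose(long id) {
        return new long[] {
            id >>> (NEXT_BITS + SHARD_BITS),
            (id >>> SHARD_BITS) & MAX_NEXT,
            id & MAX_SHARD
        };
    }
}
```

With every field at its maximum the result is 2^53 - 1, so an ID built this way never needs more than 53 bits.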

The timestamp is stored after subtracting a fixed value; this scheme works up to the year 2106.
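
The year limit comes from the width of the timestamp field. A small sketch of the arithmetic, assuming the 32-bit field is treated as unsigned, is shown here; TimestampRange and lastInstant are hypothetical names, and offsetSeconds stands for the fixed value that is subtracted.

```java
import java.time.Instant;

class TimestampRange {
    // Largest value an unsigned 32-bit seconds field can hold.
    static final long MAX_SECONDS = (1L << 32) - 1;

    // Last representable instant when the field counts seconds starting at offsetSeconds
    // (offsetSeconds is measured from 1970-01-01T00:00:00Z).
    static Instant lastInstant(long offsetSeconds) {
        return Instant.ofEpochSecond(offsetSeconds + MAX_SECONDS);
    }
}
```

With an offset of 0, lastInstant(0) falls in February 2106, which matches the figure quoted. A later starting point would move the limit later by the same amount.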

What if 65,000 sequence numbers per second are not enough? That is not a problem: the generator can move the timestamp forward and "borrow" the 65,000 sequence numbers of the next second.

This also solves the problem of the clock being set back.

The machine identifier is simply derived from the host name: as long as host names follow the pattern host-1, host-2, the identifier can be extracted automatically with no configuration.
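
One way to extract the identifier from such a host name is sketched below. The regular expression, the fallback parameter and the names ShardIdFromHostname, parseShardId and currentShardId are assumptions for illustration; the article does not show how its own code does this.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ShardIdFromHostname {
    static final long MAX_SHARD_ID = 31; // 5 bits
    // Matches names ending in "-<one or two digits>", such as host-1 or my-app-12.
    static final Pattern HOST_PATTERN = Pattern.compile("^.+-(\\d{1,2})$");

    // Returns the trailing number if it fits in 5 bits, otherwise the fallback.
    static long parseShardId(String hostname, long fallback) {
        Matcher m = HOST_PATTERN.matcher(hostname);
        if (m.matches()) {
            long n = Long.parseLong(m.group(1));
            if (n <= MAX_SHARD_ID) {
                return n;
            }
        }
        return fallback;
    }

    // Reads the local host name; falls back if it cannot be resolved.
    static long currentShardId(long fallback) {
        try {
            return parseShardId(InetAddress.getLocalHost().getHostName(), fallback);
        } catch (UnknownHostException e) {
            return fallback;
        }
    }
}
```

Returning a fallback silently is a design choice made for brevity here. Since duplicate machine identifiers cause ID collisions, a real deployment may prefer to fail at startup when the host name does not match.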

Finally, why use at most 53 bits rather than a full 64-bit integer? Because most applications are web applications that have to exchange data with JavaScript. JavaScript can represent integers exactly only up to 53 bits; beyond that it loses precision. An integer of at most 53 bits can therefore be read directly by JavaScript, whereas anything larger must be sent as a string for JavaScript to handle it correctly, which adds complexity to the API. This is why the Weibo API returns both id and idstr.
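
The precision loss can be reproduced without JavaScript, because a JavaScript number is an IEEE 754 double, the same format as Java's double. The sketch below is illustrative only; SafeIntegerCheck and its methods are hypothetical names.

```java
class SafeIntegerCheck {
    // 2^53 - 1, the same value as JavaScript's Number.MAX_SAFE_INTEGER.
    static final long MAX_SAFE_INTEGER = (1L << 53) - 1;

    // True if the ID lies inside the range JavaScript can represent exactly.
    static boolean isSafeForJavaScript(long id) {
        return id >= -MAX_SAFE_INTEGER && id <= MAX_SAFE_INTEGER;
    }

    // True if the ID is unchanged after a round trip through a double.
    static boolean survivesDouble(long id) {
        return (long) (double) id == id;
    }
}
```

For example, 2^53 + 1 does not survive the round trip: the nearest double is 2^53, so the last digit is silently lost.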

 

Reprinted from Mr. Liao's blog: https://www.liaoxuefeng.com/article/1280526512029729

Origin www.cnblogs.com/yeqfa/p/11576759.html