Introduction to Snowflake Algorithm Tools


Introduction

The Snowflake algorithm is a distributed unique-ID generation algorithm. It produces globally unique IDs, much as no two snowflakes in nature are alike. Its core idea is to divide a 64-bit integer into four parts:

1-bit sign: the highest bit, always 0, so that the ID is always a positive number.
41-bit timestamp: the number of milliseconds elapsed since a chosen start time (epoch); 2^41 milliseconds lasts about 69 years.
10-bit node ID: identifies the node that generated the ID. Implementations commonly split it into a 5-bit data center ID and a 5-bit machine (worker) ID, giving 32 data centers with 32 machines each, or 1024 nodes in total.
12-bit sequence number: a counter for IDs generated within the same millisecond on the same node, allowing 4096 IDs per millisecond per node.
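
The bit layout can be made concrete with a short sketch that packs the fields into a 64-bit value and extracts them again. It assumes the 5-bit data center ID, 5-bit worker ID, and 12-bit sequence split, and the epoch constant, used by the Java implementation later in this article. The class and method names are hypothetical, chosen only for illustration.

```java
// Sketch: pack and unpack a Snowflake-style ID with a 1 + 41 + 5 + 5 + 12 bit layout.
// SnowflakeLayout and its method names are hypothetical, not part of any library.
final class SnowflakeLayout {
    static final long EPOCH = 1288834974657L; // assumed custom epoch, in milliseconds
    static final int SEQUENCE_BITS = 12;
    static final int WORKER_BITS = 5;
    static final int DATACENTER_BITS = 5;
    static final int WORKER_SHIFT = SEQUENCE_BITS;                           // 12
    static final int DATACENTER_SHIFT = SEQUENCE_BITS + WORKER_BITS;         // 17
    static final int TIMESTAMP_SHIFT = DATACENTER_SHIFT + DATACENTER_BITS;   // 22

    // Combine the four fields into one long; callers must pass values that fit their bit widths.
    static long compose(long timestampMillis, long datacenterId, long workerId, long sequence) {
        return ((timestampMillis - EPOCH) << TIMESTAMP_SHIFT)
                | (datacenterId << DATACENTER_SHIFT)
                | (workerId << WORKER_SHIFT)
                | sequence;
    }

    // Each accessor shifts the field down to bit 0 and masks off the higher fields.
    static long timestampMillis(long id) {
        return (id >>> TIMESTAMP_SHIFT) + EPOCH;
    }

    static long datacenterId(long id) {
        return (id >>> DATACENTER_SHIFT) & ((1L << DATACENTER_BITS) - 1);
    }

    static long workerId(long id) {
        return (id >>> WORKER_SHIFT) & ((1L << WORKER_BITS) - 1);
    }

    static long sequence(long id) {
        return id & ((1L << SEQUENCE_BITS) - 1);
    }

    public static void main(String[] args) {
        long id = compose(1700000000000L, 3, 17, 42);
        System.out.println(id);
        // prints "1700000000000 3 17 42"
        System.out.println(timestampMillis(id) + " " + datacenterId(id) + " "
                + workerId(id) + " " + sequence(id));
    }
}
```

Decoding an ID this way is handy when debugging, because the generation time and the originating node can be read directly from the ID.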

IDs generated by Snowflake are roughly ordered by generation time, and no collisions occur across the distributed system as long as every node has a distinct combination of datacenterId and workerId. Generation is also efficient.

Within the same millisecond, different nodes produce different IDs because their data center and machine IDs differ, and a single node produces different IDs because its sequence number increments. Together these guarantee uniqueness. In addition, because the timestamp occupies the high-order bits of the 64-bit integer, IDs grow over time, which suits scenarios that need chronological ordering.

The advantages of the snowflake algorithm are:

Simple and easy to implement: It depends only on a timestamp, a data center ID, a machine ID, and a per-millisecond sequence counter.
Uniqueness: The generated ID is globally unique, which meets the needs of distributed systems.
Time-ordered: IDs increase with time, which suits scenarios that need chronological ordering. When stored in a database as keys, they also keep indexes efficient.
Scalability: The split of bits between data center ID and machine ID can be adjusted as needed to support more data centers or more machines per data center.
High performance and high availability: ID generation does not depend on a database; IDs are produced entirely in memory.
Large capacity: Each node can generate up to 4096 IDs per millisecond, roughly 4 million incrementing IDs per second.
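
The capacity figures quoted for Snowflake follow directly from the field widths. The sketch below works through that arithmetic for a 41-bit millisecond timestamp, a 10-bit node ID, and a 12-bit sequence; the class name is hypothetical and the year length is approximated as 365 days.

```java
import java.util.Locale;

// Sketch: derive Snowflake capacity figures from the field widths.
// SnowflakeCapacity is a hypothetical name used only for this illustration.
final class SnowflakeCapacity {
    // 10 node bits (data center ID + machine ID) give 2^10 distinct nodes.
    static long maxNodes() {
        return 1L << 10;
    }

    // 12 sequence bits give 2^12 IDs per millisecond on one node.
    static long idsPerMillisecondPerNode() {
        return 1L << 12;
    }

    static long idsPerSecondPerNode() {
        return idsPerMillisecondPerNode() * 1000L;
    }

    // 41 timestamp bits count milliseconds; convert to years of 365 days.
    static double timestampYears() {
        double millisPerYear = 1000.0 * 60 * 60 * 24 * 365;
        return (1L << 41) / millisPerYear;
    }

    public static void main(String[] args) {
        System.out.println("nodes: " + maxNodes());                             // nodes: 1024
        System.out.println("IDs/ms/node: " + idsPerMillisecondPerNode());       // IDs/ms/node: 4096
        System.out.println("IDs/s/node: " + idsPerSecondPerNode());             // IDs/s/node: 4096000
        System.out.println(String.format(Locale.ROOT, "years: %.1f", timestampYears())); // years: 69.7
    }
}
```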

However, the snowflake algorithm also has some disadvantages:

Reliance on the system clock: Duplicate IDs may be generated if the system clock is set back.
Data center IDs and machine IDs must be assigned manually: They have to be configured for each node, which is hard to manage at scale.
The number of machines is limited: The 10-bit node ID supports at most 1024 nodes. Supporting more requires giving more bits to the node ID, at the expense of the timestamp or sequence bits.
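
One common mitigation for clock rollback, sketched here as an assumption rather than as part of the original algorithm, is to wait out small backward jumps and to fail only when the jump exceeds a tolerance. The ClockGuard class, its tolerance parameter, and the injectable time source are all hypothetical names introduced for this illustration.

```java
import java.util.function.LongSupplier;

// Sketch: tolerate a small clock rollback by waiting, reject a large one.
// ClockGuard is a hypothetical helper, not part of any Snowflake library.
final class ClockGuard {
    private final LongSupplier clock;      // injectable time source, which makes the logic testable
    private final long maxBackwardMillis;  // assumed tolerance; tune to the deployment

    ClockGuard(LongSupplier clock, long maxBackwardMillis) {
        this.clock = clock;
        this.maxBackwardMillis = maxBackwardMillis;
    }

    // Returns a timestamp >= lastTimestamp, or throws if the clock is too far behind.
    long currentTimeAtLeast(long lastTimestamp) {
        long now = clock.getAsLong();
        if (now >= lastTimestamp) {
            return now;
        }
        long offset = lastTimestamp - now;
        if (offset > maxBackwardMillis) {
            throw new IllegalStateException("Clock moved backwards by " + offset + " ms");
        }
        // Small rollback: busy-wait until the clock catches up with the last issued timestamp.
        while (now < lastTimestamp) {
            now = clock.getAsLong();
        }
        return now;
    }

    public static void main(String[] args) {
        ClockGuard guard = new ClockGuard(System::currentTimeMillis, 5L);
        long last = System.currentTimeMillis() + 2; // pretend the last ID was issued 2 ms "in the future"
        long safe = guard.currentTimeAtLeast(last);
        System.out.println(safe >= last); // true
    }
}
```

A generator could call currentTimeAtLeast(lastTimestamp) in place of reading the clock directly. The trade-off is that ID generation stalls for the duration of a small rollback instead of failing immediately.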

To sum up, the snowflake algorithm is a distributed ID generation algorithm that is simple and easy to implement, unique and time-ordered, and is suitable for the generation of unique ID identifiers in distributed systems.

Java implementation

public class Snowflake {

    // Start timestamp (custom epoch), typically the project launch time
    private final long twepoch = 1288834974657L;
    // Number of bits used by the machine ID
    private final long workerIdBits = 5L;
    // Number of bits used by the data center ID
    private final long datacenterIdBits = 5L;
    // Maximum supported machine ID; the result is 31
    private final long maxWorkerId = ~(-1L << workerIdBits);
    // Maximum supported data center ID; the result is 31
    private final long maxDatacenterId = ~(-1L << datacenterIdBits);
    // Number of bits used by the sequence number
    private final long sequenceBits = 12L;
    // The machine ID is shifted left by 12 bits
    private final long workerIdShift = sequenceBits;
    // The data center ID is shifted left by 17 bits (12+5)
    private final long datacenterIdShift = sequenceBits + workerIdBits;
    // The timestamp is shifted left by 22 bits (5+5+12)
    private final long timestampLeftShift = sequenceBits + workerIdBits + datacenterIdBits;
    // Mask for the sequence number; here it is 4095
    private final long sequenceMask = ~(-1L << sequenceBits);

    private long workerId; // machine ID
    private long datacenterId; // data center ID
    private long sequence = 0L; // sequence number
    private long lastTimestamp = -1L; // timestamp of the last generated ID

    public Snowflake(long workerId, long datacenterId) {
        if (workerId > maxWorkerId || workerId < 0) {
            throw new IllegalArgumentException(String.format("worker Id can't be greater than %d or less than 0", maxWorkerId));
        }
        if (datacenterId > maxDatacenterId || datacenterId < 0) {
            throw new IllegalArgumentException(String.format("datacenter Id can't be greater than %d or less than 0", maxDatacenterId));
        }
        this.workerId = workerId;
        this.datacenterId = datacenterId;
    }

    public synchronized long nextId() {
        long timestamp = timeGen();

        // Refuse to generate IDs if the clock has moved backwards
        if (timestamp < lastTimestamp) {
            throw new RuntimeException(String.format("Clock moved backwards. Refusing to generate id for %d milliseconds", lastTimestamp - timestamp));
        }

        if (lastTimestamp == timestamp) {
            sequence = (sequence + 1) & sequenceMask; // increment the sequence number
            if (sequence == 0) {
                // Sequence exhausted for this millisecond; wait for the next one
                timestamp = tilNextMillis(lastTimestamp);
            }
        } else {
            sequence = 0L; // new millisecond: reset the sequence number to 0
        }

        lastTimestamp = timestamp;

        // Assemble the ID: timestamp | data center ID | machine ID | sequence
        return ((timestamp - twepoch) << timestampLeftShift) | (datacenterId << datacenterIdShift) | (workerId << workerIdShift) | sequence;
    }

    // Busy-wait until the clock advances past lastTimestamp
    private long tilNextMillis(long lastTimestamp) {
        long timestamp = timeGen();
        while (timestamp <= lastTimestamp) {
            timestamp = timeGen();
        }
        return timestamp;
    }

    // Current time in milliseconds
    private long timeGen() {
        return System.currentTimeMillis();
    }

    public static void main(String[] args) {
        Snowflake idWorker = new Snowflake(0, 0);
        for (int i = 0; i < 1000; i++) {
            long id = idWorker.nextId();
            System.out.println(Long.toBinaryString(id));
            System.out.println(id);
        }
    }
}

Java related tools

Many open source libraries implement Snowflake for Java. The following are some commonly used ones and how to use them:

  1. Twitter's snowflake: the original implementation of the algorithm (written in Scala, which runs on the JVM), designed for high concurrency, low latency, and high availability. Usage, with a Java class like the one shown above:

    Snowflake snowflake = new Snowflake(workerId, datacenterId);
    long id = snowflake.nextId();
    
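  2. Baidu's UidGenerator: an improved generator based on Twitter's snowflake algorithm, designed for high performance, high availability, and high concurrency. Schematic usage (the project normally provides the generator as a Spring bean; see its documentation for the exact setup):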

    UidGenerator uidGenerator = UidGenerator.getUidGenerator();
    long id = uidGenerator.getUID();
    
  3. Meituan's Leaf: a high-performance, lightweight distributed ID generator that supports multiple ID generation modes, including Snowflake. Schematic usage (the exact class and method names depend on the chosen mode and version):

    SegmentIDGenImpl idGen = new SegmentIDGenImpl();
    idGen.init();
    long id = idGen.getId();
    
  4. Alibaba's nacos: a lightweight service registration and discovery tool that includes a Java implementation of the snowflake algorithm. Schematic usage:

    SnowFlake snowFlake = new SnowFlake(dataCenterId, machineId);
    long id = snowFlake.nextId();
    
  5. yitter

  6. Hutool utility class (IdUtil). How to use:

    Snowflake snowflake = IdUtil.getSnowflake(1, 1);
    long id = snowflake.nextId();
    


Origin blog.csdn.net/qq_35764295/article/details/130768031