How to generate database primary keys sensibly: UUID and the snowflake algorithm

Table of contents

1. Disadvantages of using auto-increment primary key

2. Primary key generation algorithm

2.1. UUID

2.1.1. Overview

2.1.2. UUID in Java

2.2. Snowflake Algorithm

2.2.1. Overview

2.2.2. Using the Snowflake Algorithm in Java


1. Disadvantages of using auto-increment primary key

In real-world projects, especially distributed ones, plain auto-increment primary keys such as 1, 2, 3... are often avoided, for the following reasons:

  • Primary key conflicts
  • Performance issues
  • Security issues

Primary key conflicts:

For example, when data is synchronized across databases, or across partitions (shards) in a distributed system, each database counts 1, 2, 3... on its own. The same key value is therefore very likely to be issued more than once, and the rows collide when the data is merged.
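
The conflict can be reproduced in a few lines. The sketch below is only an illustration: Shard is a hypothetical stand-in for one database's auto-increment counter, not a real database API.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.atomic.AtomicLong;

class ShardCollisionDemo {

    // Hypothetical stand-in for one database's auto-increment counter.
    static final class Shard {
        private final AtomicLong counter = new AtomicLong(0);

        long nextId() {
            return counter.incrementAndGet(); // 1, 2, 3, ...
        }
    }

    // Inserts rowsPerShard rows into each of two independent shards and
    // counts how many generated IDs were already issued by the other shard.
    static int countCollisions(int rowsPerShard) {
        Shard a = new Shard();
        Shard b = new Shard();
        Set<Long> merged = new HashSet<>();
        int collisions = 0;
        for (int i = 0; i < rowsPerShard; i++) {
            if (!merged.add(a.nextId())) collisions++;
            if (!merged.add(b.nextId())) collisions++;
        }
        return collisions;
    }

    public static void main(String[] args) {
        // Every ID issued by shard b was already issued by shard a.
        System.out.println("Collisions when merging: " + countCollisions(3));
    }
}
```

With three rows per shard, all three IDs from the second shard clash with IDs from the first one.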

Performance issues:

An auto-increment primary key is generated from a counter maintained by the database. Under high concurrency, many threads may request the next value at the same time and compete for it, because the database must guarantee that each value is handed out exactly once. To guarantee this, the database protects the counter with a lock, and contention for that lock can become a bottleneck.

Security issues:

With auto-increment primary keys it is easy to guess the key of the next record, which can cause security problems, for example when the IDs appear in URLs.

To sum up, a more complex primary key is often necessary. Generating keys that are both complex and unique requires a primary key generation algorithm.

2. Primary key generation algorithm

Common primary key generation algorithms:

  • UUID
  • Snowflake algorithm

2.1. UUID

2.1.1. Overview

UUID stands for Universally Unique Identifier. The core idea of the UUID algorithm is to generate a 128-bit identifier, usually written as 32 hexadecimal digits split into five groups by hyphens (36 characters in total). Each version of the algorithm tries to make the identifier unique.

The UUID standard defines several versions. It is published by the Internet Engineering Task Force (IETF), and jointly by the International Organization for Standardization (ISO) and the International Telecommunication Union (ITU). The classic versions are:
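
As a small illustration of the two text forms, the sketch below prints a UUID with and without hyphens. The helper name compactHex is made up for this example.

```java
import java.util.UUID;

class UuidFormatDemo {

    // Strips the four hyphens from the canonical 8-4-4-4-12 text form.
    static String compactHex(UUID uuid) {
        return uuid.toString().replace("-", "");
    }

    public static void main(String[] args) {
        UUID uuid = UUID.randomUUID();
        String text = uuid.toString();
        System.out.println(text + " -> " + text.length() + " characters");
        System.out.println(compactHex(uuid) + " -> " + compactHex(uuid).length() + " hex digits");
        // The 128 bits are exposed as two 64-bit halves.
        System.out.println(Long.toHexString(uuid.getMostSignificantBits())
                + " / " + Long.toHexString(uuid.getLeastSignificantBits()));
    }
}
```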

  1. UUIDv1: Generated from a timestamp and the machine's MAC address. The MAC address distinguishes machines and the timestamp distinguishes UUIDs created on the same machine. The drawback is that the UUID exposes the MAC address and the creation time.

  2. UUIDv2: Based on the DCE security mechanism; generated from the POSIX UID/GID and the current time. It is rarely used.

  3. UUIDv3: Generated from a namespace and a name string, hashed with MD5. The same input always yields the same UUID.

  4. UUIDv4: Generated from random numbers. Uniqueness is not strictly guaranteed, but with 122 random bits a collision is extremely unlikely.

  5. UUIDv5: Similar to UUIDv3, but uses the SHA-1 hashing algorithm.
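
The difference between a name-based and a random version can be seen directly in Java, whose java.util.UUID class has factory methods for versions 3 and 4 only. This is a minimal sketch: the helper nameBased and the sample name are made up for illustration, and UUID.nameUUIDFromBytes hashes exactly the bytes it is given, so any namespace prefix is up to the caller.

```java
import java.nio.charset.StandardCharsets;
import java.util.UUID;

class UuidVersionDemo {

    // Name-based (version 3, MD5) UUID built from the UTF-8 bytes of a name.
    static UUID nameBased(String name) {
        return UUID.nameUUIDFromBytes(name.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        UUID random = UUID.randomUUID();
        UUID named = nameBased("user:alice@example.com"); // hypothetical name
        System.out.println("version " + random.version() + ": " + random);
        System.out.println("version " + named.version() + ": " + named);
        // Same input, same UUID: name-based versions are deterministic.
        System.out.println(named.equals(nameBased("user:alice@example.com")));
    }
}
```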

In essence, a UUID is just a concept. We could even write a UUID algorithm ourselves, but a home-grown one is unlikely to match the quality of the standard algorithms or to ensure uniqueness as well.

2.1.2. UUID in Java

In practice, the UUID is usually generated in the Java backend and used as the primary key of each row inserted into the database. Generating a UUID in Java is very simple:

import java.util.UUID;

public class UUIDExample {
    public static void main(String[] args) {
        // Generate a random (version 4) UUID
        UUID uuid = UUID.randomUUID();
        System.out.println("Random UUID: " + uuid.toString());
    }
}

The generation process itself is short. This is the JDK's implementation of UUID.randomUUID():

public static UUID randomUUID() {
    SecureRandom ng = Holder.numberGenerator;

    byte[] randomBytes = new byte[16];
    ng.nextBytes(randomBytes);
    randomBytes[6]  &= 0x0f;  /* clear version        */
    randomBytes[6]  |= 0x40;  /* set to version 4     */
    randomBytes[8]  &= 0x3f;  /* clear variant        */
    randomBytes[8]  |= 0x80;  /* set to IETF variant  */
    return new UUID(randomBytes);
}

  1. First, a SecureRandom object is obtained from the static inner class Holder. Holder is a lazily loaded singleton, which ensures that the SecureRandom object is created only once and in a thread-safe way.

  2. Then a byte array of length 16, randomBytes, is created and filled with random bytes by the ng.nextBytes(randomBytes) method.

  3. Next, specific bits in randomBytes are overwritten to conform to the format of UUID version 4:

    • The upper 4 bits of the 7th byte (randomBytes[6]) are cleared and then set to 0100 (4), indicating UUID version 4. The lower 4 bits stay random.
    • The upper 2 bits of the 9th byte (randomBytes[8]) are cleared and the highest bit is then set to 1, so the byte starts with binary 10, indicating a UUID that complies with the IETF standard.
  4. Finally, the modified byte array is passed to the UUID(byte[] data) constructor, which is private to the JDK, to create the UUID object.
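
The effect of the four masking lines can be checked outside the JDK. The sketch below is not the JDK code: since UUID(byte[]) is private, it goes through the public UUID(long, long) constructor instead, and the class and method names are made up for this example.

```java
import java.nio.ByteBuffer;
import java.security.SecureRandom;
import java.util.UUID;

class Version4BitsDemo {

    // Applies the version and variant masks to a copy of 16 bytes and
    // builds a UUID through the public UUID(long, long) constructor.
    static UUID fromBytes(byte[] bytes) {
        byte[] b = bytes.clone();
        b[6] &= 0x0f; // clear the upper 4 bits of byte 6 (version field)
        b[6] |= 0x40; // upper 4 bits become 0100, i.e. version 4
        b[8] &= 0x3f; // clear the upper 2 bits of byte 8 (variant field)
        b[8] |= 0x80; // upper 2 bits become 10, i.e. IETF variant
        ByteBuffer buf = ByteBuffer.wrap(b); // big-endian by default
        return new UUID(buf.getLong(), buf.getLong());
    }

    public static void main(String[] args) {
        byte[] random = new byte[16];
        new SecureRandom().nextBytes(random);
        UUID uuid = fromBytes(random);
        System.out.println(uuid + " version=" + uuid.version() + " variant=" + uuid.variant());

        // With all-zero input only the forced bits remain visible.
        System.out.println(fromBytes(new byte[16])); // 00000000-0000-4000-8000-000000000000
    }
}
```

In the all-zero case the 4 at the start of the third group is the version, and the 8 at the start of the fourth group is the variant bits 10 followed by zeros.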

2.2. Snowflake Algorithm

2.2.1. Overview

The snowflake algorithm is a distributed ID generation algorithm developed by Twitter that generates 64-bit unique IDs. Its core idea is to use a 64-bit long number as a globally unique ID: the high bits hold the timestamp, the middle bits hold the machine ID, and the low bits hold a sequence number generated on that machine.

The 64-bit long number of the snowflake algorithm consists of the following parts:

  • Sign bit (1 bit): The long type is signed and the generated ID must be positive, so the sign bit is fixed at 0.
  • Timestamp part (41 bits): Records the timestamp with millisecond precision. The current time minus a fixed start time gives a relative timestamp, so 41 bits can cover a time range of about 69 years.
  • Machine ID part (10 bits): Up to 1024 machines can be configured. Each machine is assigned a unique ID, which prevents duplicate IDs in a distributed environment.
  • Sequence number part (12 bits): The sequence number of the ID within the same millisecond, implemented as an incrementing counter. It supports up to 4096 IDs per millisecond per machine.
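
The layout above can be turned into code fairly directly. The following is a minimal sketch, not Twitter's implementation: the class name SnowflakeSketch, the start time EPOCH (2020-01-01 UTC), and the choices to busy-wait when a millisecond is exhausted and to throw when the clock moves backwards are all assumptions made for illustration.

```java
class SnowflakeSketch {

    // Assumed fixed start time: 2020-01-01T00:00:00Z in epoch milliseconds.
    static final long EPOCH = 1577836800000L;
    static final int MACHINE_BITS = 10;
    static final int SEQUENCE_BITS = 12;
    static final long MAX_MACHINE_ID = (1L << MACHINE_BITS) - 1; // 1023
    static final long MAX_SEQUENCE = (1L << SEQUENCE_BITS) - 1;  // 4095

    private final long machineId;
    private long lastTimestamp = -1L;
    private long sequence = 0L;

    SnowflakeSketch(long machineId) {
        if (machineId < 0 || machineId > MAX_MACHINE_ID) {
            throw new IllegalArgumentException("machineId must be in 0.." + MAX_MACHINE_ID);
        }
        this.machineId = machineId;
    }

    // Returns the next ID; synchronized so that threads share one sequence.
    synchronized long nextId() {
        long now = System.currentTimeMillis();
        if (now < lastTimestamp) {
            throw new IllegalStateException("Clock moved backwards");
        }
        if (now == lastTimestamp) {
            sequence = (sequence + 1) & MAX_SEQUENCE;
            if (sequence == 0) {
                // All 4096 values used in this millisecond: spin until the next one.
                while ((now = System.currentTimeMillis()) <= lastTimestamp) {
                    // busy-wait
                }
            }
        } else {
            sequence = 0;
        }
        lastTimestamp = now;
        return compose(now, machineId, sequence);
    }

    // Packs the three fields: 41-bit timestamp | 10-bit machine ID | 12-bit sequence.
    static long compose(long timestampMillis, long machineId, long sequence) {
        return ((timestampMillis - EPOCH) << (MACHINE_BITS + SEQUENCE_BITS))
                | (machineId << SEQUENCE_BITS)
                | sequence;
    }

    // Unpacking helpers, useful for debugging an ID.
    static long timestampOf(long id) { return (id >>> (MACHINE_BITS + SEQUENCE_BITS)) + EPOCH; }
    static long machineIdOf(long id) { return (id >>> SEQUENCE_BITS) & MAX_MACHINE_ID; }
    static long sequenceOf(long id)  { return id & MAX_SEQUENCE; }

    public static void main(String[] args) {
        SnowflakeSketch generator = new SnowflakeSketch(1);
        long id = generator.nextId();
        System.out.println(id + " machine=" + machineIdOf(id) + " sequence=" + sequenceOf(id));
    }
}
```

Because the timestamp sits in the highest bits, IDs from one generator increase over time, which tends to suit B-tree primary key indexes better than random UUIDs. The 41 timestamp bits hold 2^41 milliseconds, which is where the figure of about 69 years comes from.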

2.2.2. Using the Snowflake Algorithm in Java

The JDK itself does not provide an implementation of the snowflake algorithm. Some commonly used ORM frameworks can generate snowflake primary key IDs instead: MyBatis-Plus ships a snowflake-based generator, and Hibernate lets you plug in a custom generator class.

Take Hibernate as an example:

The @GenericGenerator and @GeneratedValue annotations are placed on the primary key field of the entity class. @GenericGenerator declares a generator by giving it a name and a generator class (strategy), and @GeneratedValue tells the field to use the generator with that name. In the example below, com.xxx.snowflake.SnowflakeIdGenerator stands for your own generator class, which must implement Hibernate's IdentifierGenerator interface:

@Entity
@Table(name = "user")
public class User {
 
    @Id
    @GenericGenerator(name = "snowflake", strategy = "com.xxx.snowflake.SnowflakeIdGenerator")
    @GeneratedValue(generator = "snowflake")
    @Column(name = "id")
    private Long id;
 
    // ...
}

Take MyBatis-Plus as an example:

Set the global primary key type to ASSIGN_ID. With this type, MyBatis-Plus (3.3.0 and later) assigns IDs through its IdentifierGenerator interface, whose default implementation, DefaultIdentifierGenerator, is based on the snowflake algorithm, so no separate key-generator entry is needed:

mybatis-plus:
  global-config:
    db-config:
      id-type: ASSIGN_ID

Then mark the primary key field so that it uses this ID type:

import com.baomidou.mybatisplus.annotation.IdType;
import com.baomidou.mybatisplus.annotation.TableId;
import lombok.Data;

@Data
public class User {

    @TableId(type = IdType.ASSIGN_ID)
    private Long id;

    private String name;

    private Integer age;

    // Other fields omitted...
}

Origin blog.csdn.net/Joker_ZJN/article/details/130448650