Auto-increment or UUID: how to choose a database primary key type

1. Advantages and disadvantages of auto_increment and UUID

Advantages of auto_increment:
1. The key is much shorter than a UUID and can be a BIGINT or even an INT, which keeps indexes small and speeds up retrieval.
2. On the write side, because the key always increases, new rows are always appended at the end of the index, which greatly improves insert performance.
3. The database assigns the numbers automatically and quickly, in increasing order. Rows are stored in key order, which is very beneficial for retrieval.
4. Being numeric, the key takes up little space, is easy to sort, and is convenient to pass around in application code.

Disadvantages of auto_increment:
1. Because the IDs increase sequentially, a web crawler can easily infer the current business volume of the system.
2. Under high concurrency, contention for the auto-increment lock reduces the throughput of the database.
3. In data migration or sharding scenarios (splitting data across databases and tables), auto-increment is no longer applicable, because independently numbered tables produce conflicting keys.
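
To make the first disadvantage concrete, here is a minimal sketch of how an outside observer could estimate volume from sequential IDs. The function name `estimate_rate` and the sample values are hypothetical, chosen only for illustration.

```python
from datetime import datetime


def estimate_rate(first, second):
    """Estimate rows created per day from two (timestamp, observed_id) samples."""
    (t1, id1), (t2, id2) = first, second
    days = (t2 - t1).total_seconds() / 86400
    return (id2 - id1) / days


# Hypothetical observations: the newest order ID seen on two consecutive days.
sample_1 = (datetime(2024, 1, 1, 12, 0), 100_000)
sample_2 = (datetime(2024, 1, 2, 12, 0), 103_500)

print(estimate_rate(sample_1, sample_2))  # 3500.0
```

With random keys such as UUIDs, the difference between two observed keys carries no such information.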

Advantages of UUID:
1. No conflicts. When data is split across stores or merged back together, the primary key remains globally unique.
2. Can be generated at the application layer, which takes key generation off the database and improves its throughput.
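
The difference in where the key comes from can be sketched as follows. This example uses Python's built-in `sqlite3` module as a stand-in for a real database server, and the table names `orders_auto` and `orders_uuid` are hypothetical; the point is only who assigns the key, not the performance.

```python
import sqlite3
import uuid

# In-memory SQLite database, used here purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders_auto (id INTEGER PRIMARY KEY AUTOINCREMENT, item TEXT)"
)
conn.execute("CREATE TABLE orders_uuid (id TEXT PRIMARY KEY, item TEXT)")

# Auto-increment: the key is only known after the database has assigned it.
cur = conn.execute("INSERT INTO orders_auto (item) VALUES (?)", ("book",))
auto_id = cur.lastrowid

# UUID: the application chooses the key before it talks to the database.
uuid_id = str(uuid.uuid4())
conn.execute("INSERT INTO orders_uuid (id, item) VALUES (?, ?)", (uuid_id, "book"))

print(auto_id)  # key assigned by the database
print(uuid_id)  # key generated by the application
```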

Disadvantages of UUID:
1. Slower inserts and poor disk space utilization. Compared with auto-increment, the biggest flaw is random I/O: a new record may be inserted in the middle of existing records, so those records need to be moved.
2. Stored as a string, a UUID consumes more space than an integer, and operations on it are slower than on an integer.
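
The size difference can be checked with Python's standard `uuid` module. This is only a rough comparison of raw value sizes; the actual on-disk size depends on the column type and character set chosen in the database.

```python
import uuid

u = uuid.uuid4()

text_form = str(u)      # canonical hex form with hyphens, e.g. for a CHAR(36) column
binary_form = u.bytes   # the raw 128-bit value, e.g. for a BINARY(16) column
bigint_form = (2**63 - 1).to_bytes(8, "big")  # largest signed 64-bit value

print(len(text_form))    # 36
print(len(binary_form))  # 16
print(len(bigint_form))  # 8
```

Because secondary indexes in InnoDB also store the primary key value, the larger key is paid for in every index on the table, not just once per row.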

2. Specific choices

The choice of database primary key type should be determined based on specific needs. The following is a comparison of the two primary key types: auto-increment and UUID:

  1. Auto-increment primary key: Each time a new record is inserted, the database automatically assigns a unique auto-increment value to the record. Auto-incrementing primary keys usually use integer types, such as INT or BIGINT. The advantages of auto-incrementing primary keys are simplicity, efficiency, fast insertion speed, and high query efficiency. It is suitable for most situations, especially in scenarios where new records need to be inserted frequently.

  2. UUID primary key: A UUID (Universally Unique Identifier) is a globally unique identifier. It is a 128-bit number, usually represented as a string. The advantage of a UUID primary key is that it stays unique in a distributed system, so there are no conflicts between different databases. It is suitable for scenarios where data needs to be synchronized between multiple databases or where data identification needs to be decoupled from business logic.

When choosing between auto-increment primary key or UUID primary key, you should consider the following factors:

  • Database performance: Auto-increment primary keys are often more efficient than UUID primary keys and are more suitable for large-scale data sets.
  • Data replication and merging: If data needs to be merged or synchronized from different databases, using UUID primary keys can avoid conflicts.
  • Security and privacy: If you need to protect the security or privacy of data, you can consider using UUID primary keys, because auto-incrementing primary keys may expose the insertion order and quantity information of data.
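
The replication and merging point can be illustrated with a small sketch. The two shards below are plain dictionaries standing in for tables, and the `merge` helper is hypothetical; it simply reports which keys collide.

```python
import uuid


def merge(*shards):
    """Merge rows from several shards and return (merged_rows, conflicting_keys)."""
    merged, conflicts = {}, []
    for shard in shards:
        for key, row in shard.items():
            if key in merged:
                conflicts.append(key)
            else:
                merged[key] = row
    return merged, conflicts


# Two shards, each with its own auto-increment counter starting at 1.
auto_a = {i: f"a-row-{i}" for i in range(1, 4)}
auto_b = {i: f"b-row-{i}" for i in range(1, 4)}
merged_auto, auto_conflicts = merge(auto_a, auto_b)

# The same rows keyed by UUIDs generated independently on each shard.
uuid_a = {uuid.uuid4(): f"a-row-{i}" for i in range(1, 4)}
uuid_b = {uuid.uuid4(): f"b-row-{i}" for i in range(1, 4)}
merged_uuid, uuid_conflicts = merge(uuid_a, uuid_b)

print(auto_conflicts)                    # [1, 2, 3]
print(len(merged_uuid), uuid_conflicts)  # 6 []
```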

Unlike a sequential auto-increment ID, a UUID follows no order: the value for a new row is not necessarily greater than the previous primary key value. InnoDB therefore cannot always append new rows at the end of the index. It has to find a suitable position for each new row and allocate space for it. This takes a lot of extra work, and the disordered data distribution leads to the following problems:
①. The target page may already have been flushed to disk and evicted from the cache, or may not yet have been loaded into the cache. InnoDB then has to find the page and read it from disk into memory before inserting, which causes a lot of random I/O.
②. Because writes arrive out of order, InnoDB has to split pages frequently to make room for new rows. Page splits move a large amount of data, and a single insert ends up modifying at least three pages.
③. Frequent page splits leave pages sparse and irregularly filled, which eventually leads to data fragmentation.
④. After loading random or only roughly ordered values (UUIDs and snowflake IDs) into the clustered index (InnoDB's default index type), you sometimes need to run OPTIMIZE TABLE to rebuild the table and optimize page filling, which takes some time.
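
The effect of key order on insert position can be sketched with a sorted Python list standing in for the clustered index. This is a toy model under that assumption, not a simulation of InnoDB's B+tree or its pages; it only counts how many inserts land somewhere other than the end. The helper name `count_mid_inserts` is hypothetical.

```python
import bisect
import uuid


def count_mid_inserts(keys):
    """Insert keys into a sorted list; count inserts that do not land at the end."""
    index = []
    mid_inserts = 0
    for key in keys:
        pos = bisect.bisect_right(index, key)
        if pos != len(index):
            mid_inserts += 1  # existing entries after pos have to be shifted
        index.insert(pos, key)
    return mid_inserts


n = 5_000
sequential_keys = list(range(1, n + 1))             # auto-increment style
random_keys = [uuid.uuid4().int for _ in range(n)]  # UUID style

print(count_mid_inserts(sequential_keys))  # 0: every insert is an append
print(count_mid_inserts(random_keys))      # nearly every insert lands mid-index
```

In a real clustered index, each of those mid-index inserts is a candidate for a page read from disk or a page split.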


        Conclusion: When using InnoDB, insert rows in increasing primary key order as far as possible, and prefer monotonically increasing clustered key values for new rows. In sharding scenarios (split databases and tables), the preferred distributed ID scheme is the snowflake algorithm, which generates globally unique primary keys that are ordered to a certain extent. For most cases, then, an auto-incrementing primary key is the more common and reasonable choice. If the key must be used in a distributed system, or there are specific security or privacy requirements, consider UUID primary keys.
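
For reference, a snowflake-style generator can be sketched as below. The bit layout (41-bit millisecond timestamp, 10-bit worker ID, 12-bit sequence) is the commonly cited one and is an assumption here, as are the class name, the custom epoch, and the handling of clock rollback. Real deployments differ in these details.

```python
import threading
import time


class SnowflakeGenerator:
    """Sketch of a snowflake-style ID: timestamp | worker ID | sequence."""

    EPOCH_MS = 1_600_000_000_000  # arbitrary custom epoch (assumption)
    WORKER_BITS = 10
    SEQUENCE_BITS = 12
    MAX_SEQUENCE = (1 << SEQUENCE_BITS) - 1

    def __init__(self, worker_id):
        if not 0 <= worker_id < (1 << self.WORKER_BITS):
            raise ValueError("worker_id out of range")
        self.worker_id = worker_id
        self.last_ms = -1
        self.sequence = 0
        self.lock = threading.Lock()

    def next_id(self):
        with self.lock:
            now = int(time.time() * 1000)
            if now < self.last_ms:
                # Clock moved backwards: reuse the last timestamp so IDs keep increasing.
                now = self.last_ms
            if now == self.last_ms:
                self.sequence = (self.sequence + 1) & self.MAX_SEQUENCE
                if self.sequence == 0:
                    # Sequence exhausted for this millisecond: wait for the next one.
                    while now <= self.last_ms:
                        now = int(time.time() * 1000)
            else:
                self.sequence = 0
            self.last_ms = now
            timestamp_part = (now - self.EPOCH_MS) << (self.WORKER_BITS + self.SEQUENCE_BITS)
            worker_part = self.worker_id << self.SEQUENCE_BITS
            return timestamp_part | worker_part | self.sequence


gen = SnowflakeGenerator(worker_id=1)
ids = [gen.next_id() for _ in range(5)]
print(ids == sorted(ids))  # True: IDs from one worker are increasing
```

IDs from a single worker increase strictly. Across workers they are ordered only by timestamp, which is why the ordering holds "to a certain extent" rather than strictly.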

Origin blog.csdn.net/weixin_49171365/article/details/133198610