Why is UUID not recommended as the primary key in MySQL?

When designing tables in MySQL, the official guidance is to avoid using a UUID, or a non-contiguous unique ID such as a snowflake ID, as the primary key, and to use a contiguous auto-increment primary key ID instead.

The official recommendation is AUTO_INCREMENT. The main reason is that insert efficiency with a UUID key drops sharply once the amount of data becomes large.

Comparison of the index structure with UUID and with auto-increment ID

Internal structure using an auto-increment ID

The values of an auto-increment primary key are sequential, so InnoDB stores each record immediately after the previous one. When a page reaches its maximum fill factor (15/16 in InnoDB), the next record is written to a new page. Once data is loaded in this order, the primary key pages are filled almost entirely with sequential records, which gives close to the maximum page fill rate and wastes very little page space.

Each newly inserted row goes directly after the row with the current largest key, so MySQL locates the insert position quickly and spends no extra work calculating where the new row belongs.

Page splits and fragmentation are reduced.
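
The sequential fill described above can be sketched with a toy model. This is only an illustration and not InnoDB's real page format: `PAGE_CAPACITY` (16 records per page) and the function name `load_sequential` are assumptions made for the example, chosen so that the 15/16 fill factor maps to exactly 15 records.

```python
# Toy model: append sequential keys to fixed-size pages.
PAGE_CAPACITY = 16                      # hypothetical records per page
FILL_LIMIT = PAGE_CAPACITY * 15 // 16   # 15/16 fill factor -> 15 records


def load_sequential(n_rows):
    """Append keys 1..n_rows; open a new page when the last one hits the limit."""
    pages = [[]]
    for key in range(1, n_rows + 1):
        if len(pages[-1]) >= FILL_LIMIT:
            pages.append([])        # the last page is full: start a new one
        pages[-1].append(key)       # the new row always lands at the very end
    return pages


pages = load_sequential(1500)
fill = sum(len(p) for p in pages) / (len(pages) * PAGE_CAPACITY)
print(len(pages), fill)
```

In this model 1500 rows occupy exactly 100 pages, each filled to 15/16 (0.9375), and no existing page is ever touched again after it fills up.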

Disadvantages
  • Others can easily crawl the database by walking the auto-increment IDs and use them to analyze the data.
  • Under highly concurrent loads, inserting in primary key order can cause noticeable lock contention in InnoDB. All inserts land at the upper end of the primary key, so that end becomes a hot spot, and concurrent inserts may compete for gap locks there.
  • The AUTO_INCREMENT lock mechanism makes inserts compete for the auto-increment lock, which costs some performance.
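
The first point can be illustrated with a small sketch. The helper names and the sample IDs are hypothetical; the only point is that a sequential ID reveals its neighbours and the approximate growth of the table.

```python
def neighbouring_ids(observed_id, span=2):
    """IDs a crawler could try next after seeing a single auto-increment ID."""
    low = max(1, observed_id - span)
    return [i for i in range(low, observed_id + span + 1) if i != observed_id]


def rows_created_between(id_earlier, id_later):
    """Upper bound on rows inserted between two observed IDs (gaps lower it)."""
    return id_later - id_earlier


print(neighbouring_ids(1000))            # [998, 999, 1001, 1002]
print(rows_created_between(1000, 1850))  # 850
```

A random key such as a UUID gives no such hints, which is the trade-off against the insert behaviour discussed in this article.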

Internal index structure using UUID

Because UUIDs follow no order, a new value is not necessarily larger than the previous one, so InnoDB cannot simply append the new row at the end of the index. It has to find a suitable position for each new row and allocate space for it there. This causes the following problems:

  • The target page may already have been flushed to disk and evicted from the cache, or may never have been loaded into the cache. InnoDB then has to find the page and read it from disk into memory before inserting, which leads to a lot of random I/O.
  • Because writes arrive out of order, InnoDB frequently has to split pages to make room for new rows. A page split moves a large amount of data and modifies at least three pages.
  • Because of the frequent page splits, pages become sparse and irregularly filled, and the data eventually becomes fragmented.
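
The page split and fill rate effects in the list above can be reproduced with a rough simulation. This is a sketch under stated assumptions, not InnoDB's algorithm: `PAGE_CAPACITY`, `insert_all` and `fill_rate` are hypothetical names, a full page is split in half, the 15/16 fill factor is ignored, and random 128-bit integers stand in for UUIDs.

```python
import bisect
import random

PAGE_CAPACITY = 16  # hypothetical records per page


def insert_all(keys):
    """Insert keys into sorted leaf pages; return (pages, number_of_splits)."""
    pages, lows, splits = [], [], 0   # lows[i] is the smallest key in pages[i]
    for key in keys:
        if not pages:
            pages.append([key])
            lows.append(key)
            continue
        # Find the page whose key range covers the new key.
        i = max(bisect.bisect_right(lows, key) - 1, 0)
        page = pages[i]
        if len(page) >= PAGE_CAPACITY:
            if i == len(pages) - 1 and key > page[-1]:
                # Append at the right edge: open a new page, no split needed.
                pages.append([key])
                lows.append(key)
                continue
            # Otherwise split the full page: move its upper half to a new page.
            mid = len(page) // 2
            right = page[mid:]
            del page[mid:]
            pages.insert(i + 1, right)
            lows.insert(i + 1, right[0])
            splits += 1
            if key >= right[0]:
                i, page = i + 1, right
        bisect.insort(page, key)
        lows[i] = page[0]
    return pages, splits


def fill_rate(pages):
    """Fraction of the allocated record slots that are actually used."""
    return sum(len(p) for p in pages) / (len(pages) * PAGE_CAPACITY)


rng = random.Random(42)
sequential_keys = list(range(1, 2001))
random_keys = [rng.getrandbits(128) for _ in range(2000)]  # stand-in for UUIDs

seq_pages, seq_splits = insert_all(sequential_keys)
rnd_pages, rnd_splits = insert_all(random_keys)
print("sequential:", seq_splits, "splits, fill", round(fill_rate(seq_pages), 2))
print("random:    ", rnd_splits, "splits, fill", round(fill_rate(rnd_pages), 2))
```

In this model the sequential keys never trigger a split and leave every page full, while the random keys split pages repeatedly and leave them only partly filled. The model does not cover disk reads, so the random I/O cost from the first point is not measured here.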

After loading random values (UUIDs and snowflake IDs) into a clustered index (the structure InnoDB uses to store table rows in primary key order), it is sometimes necessary to run OPTIMIZE TABLE to rebuild the table and optimize the page fill, which takes some time.

Summary:

With InnoDB, insert rows in increasing primary key order as far as possible, and use monotonically increasing clustered key values for new rows wherever you can.

Origin blog.csdn.net/issunmingzhi/article/details/108596275