Snowflake ID, UUID, or auto-increment ID as a MySQL primary key

When designing tables in MySQL, the official recommendation is to use a continuous auto-increment primary key rather than a UUID or a discontinuous, non-repeating snowflake id (a long, unique value). Why is auto_increment recommended, and what is the harm in using a UUID?

MySQL test setup and program example
1. To examine this question, we first create three tables,
namely user_auto_key, user_uuid, and user_random_key, which use an auto-increment primary key, a UUID primary key, and a random-key primary key respectively. Following the controlled-variable method, the tables differ only in their primary-key generation strategy; all other fields are identical. We then test each table's insertion and query speed.

Note: the "random key" here refers to the discontinuous, non-repeating, irregular id produced by the snowflake algorithm: an 18-digit long value.
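The note above can be illustrated with a minimal sketch of a snowflake-style generator (the class and field names here are illustrative, not from the article's project): the id packs a millisecond timestamp, a machine id, and a per-millisecond sequence into one long, so ids from a single instance are strictly increasing but not contiguous.

```java
// Minimal snowflake-style id sketch: 41-bit timestamp | 10-bit machine id | 12-bit sequence.
// Names and the epoch constant are illustrative assumptions, not the article's code.
public class SnowflakeSketch {
    private static final long EPOCH = 1288834974657L; // custom epoch keeps ids shorter
    private final long machineId;                     // 0..1023
    private long lastTimestamp = -1L;
    private long sequence = 0L;

    public SnowflakeSketch(long machineId) {
        this.machineId = machineId & 0x3FF;
    }

    public synchronized long nextId() {
        long now = System.currentTimeMillis();
        if (now == lastTimestamp) {
            sequence = (sequence + 1) & 0xFFF;        // 12-bit sequence within one millisecond
            if (sequence == 0) {                      // sequence exhausted: spin to the next ms
                while (now <= lastTimestamp) {
                    now = System.currentTimeMillis();
                }
            }
        } else {
            sequence = 0;
        }
        lastTimestamp = now;
        return ((now - EPOCH) << 22) | (machineId << 12) | sequence;
    }

    public static void main(String[] args) {
        SnowflakeSketch gen = new SnowflakeSketch(1);
        long a = gen.nextId();
        long b = gen.nextId();
        // Ids are unique and increasing within one instance, but leave gaps.
        System.out.println(a + " " + b + " increasing=" + (b > a));
    }
}
```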

Auto-increment id table: (table definition screenshot omitted)
user_uuid table: (table definition screenshot omitted)
user_random_key table: (table definition screenshot omitted)
2. Use Spring's JdbcTemplate to test inserts and queries
Technical stack:

Spring Boot + JdbcTemplate + JUnit + Hutool

The program connects to a test database and writes the same amount of data under identical conditions, then analyzes the insertion times to compare efficiency. To make the test as realistic as possible, all data is randomly generated: names, emails, and addresses are all random.

package com.wyq.mysqldemo;
import cn.hutool.core.collection.CollectionUtil;
import com.wyq.mysqldemo.databaseobject.UserKeyAuto;
import com.wyq.mysqldemo.databaseobject.UserKeyRandom;
import com.wyq.mysqldemo.databaseobject.UserKeyUUID;
import com.wyq.mysqldemo.diffkeytest.AutoKeyTableService;
import com.wyq.mysqldemo.diffkeytest.RandomKeyTableService;
import com.wyq.mysqldemo.diffkeytest.UUIDKeyTableService;
import com.wyq.mysqldemo.util.JdbcTemplateService;
import org.junit.jupiter.api.Test;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.context.SpringBootTest;
import org.springframework.util.StopWatch;
import java.util.List;
@SpringBootTest
class MysqlDemoApplicationTests {

    @Autowired
    private JdbcTemplateService jdbcTemplateService;

    @Autowired
    private AutoKeyTableService autoKeyTableService;

    @Autowired
    private UUIDKeyTableService uuidKeyTableService;

    @Autowired
    private RandomKeyTableService randomKeyTableService;


    @Test
    void testDBTime() {

        StopWatch stopwatch = new StopWatch("SQL execution time");


        /**
         * auto_increment key task
         */
        final String insertSql = "INSERT INTO user_key_auto(user_id,user_name,sex,address,city,email,state) VALUES(?,?,?,?,?,?,?)";

        List<UserKeyAuto> insertData = autoKeyTableService.getInsertData();
        stopwatch.start("auto-increment key table task");
        long start1 = System.currentTimeMillis();
        if (CollectionUtil.isNotEmpty(insertData)) {
            boolean insertResult = jdbcTemplateService.insert(insertSql, insertData, false);
            System.out.println(insertResult);
        }
        long end1 = System.currentTimeMillis();
        System.out.println("time taken by auto key: " + (end1 - start1));

        stopwatch.stop();


        /**
         * UUID key task
         */
        final String insertSql2 = "INSERT INTO user_uuid(id,user_id,user_name,sex,address,city,email,state) VALUES(?,?,?,?,?,?,?,?)";

        List<UserKeyUUID> insertData2 = uuidKeyTableService.getInsertData();
        stopwatch.start("UUID key table task");
        long begin = System.currentTimeMillis();
        if (CollectionUtil.isNotEmpty(insertData2)) {   // bug fix: originally checked insertData
            boolean insertResult = jdbcTemplateService.insert(insertSql2, insertData2, true);
            System.out.println(insertResult);
        }
        long over = System.currentTimeMillis();
        System.out.println("time taken by UUID key: " + (over - begin));

        stopwatch.stop();


        /**
         * random long key task
         */
        final String insertSql3 = "INSERT INTO user_random_key(id,user_id,user_name,sex,address,city,email,state) VALUES(?,?,?,?,?,?,?,?)";
        List<UserKeyRandom> insertData3 = randomKeyTableService.getInsertData();
        stopwatch.start("random long key table task");
        long start = System.currentTimeMillis();
        if (CollectionUtil.isNotEmpty(insertData3)) {   // bug fix: originally checked insertData
            boolean insertResult = jdbcTemplateService.insert(insertSql3, insertData3, true);
            System.out.println(insertResult);
        }
        long end = System.currentTimeMillis();
        System.out.println("time taken by random key task: " + (end - start));
        stopwatch.stop();


        String result = stopwatch.prettyPrint();
        System.out.println(result);
    }
}

3. Program writing results
user_key_auto writing result: (screenshot omitted)
user_random_key writing result: (screenshot omitted)
user_uuid writing result: (screenshot omitted)
4. Efficiency test results

(benchmark result screenshot omitted)

With about 1.3 million rows already in the table, we insert another 100,000 rows and compare:

(benchmark result screenshot omitted)

At around 1 million rows, UUID insertion already ranks last; after the table grows past 1.3 million rows, its performance plummets again. The overall efficiency ranking is auto_key > random_key > uuid: UUID is the least efficient and degrades sharply as the data volume grows. Why does this happen?

Comparison of index structures using uuid and auto-increment id
1. Internal structure using auto-increment id
(clustered index diagram omitted)

Auto-increment primary-key values are sequential, so InnoDB stores each record immediately after the previous one. When a page reaches its maximum fill factor (by default 15/16 of the 16KB page size, with 1/16 reserved for future modifications):

① the next record is written to a new page. With data loaded in this order, index pages are filled with nearly sequential records, maximizing the page fill rate and wasting no pages;

② a newly inserted row is always the next row after the current maximum, so MySQL locates the insert position quickly, with no extra cost for computing the new row's position;

③ page splits and fragmentation are reduced.

2. Internal structure of an index using UUIDs

(clustered index diagram omitted)

Because UUIDs are irregular compared with sequential auto-increment ids, a new row's key is not necessarily greater than the previous one, so InnoDB cannot always append the new row at the end of the index. Instead it must find a suitable position for the new row and allocate space for it, which requires many extra operations. The randomness of the keys scatters the data, causing the following problems:

① the target page has likely been flushed to disk and evicted from the buffer pool, or was never loaded into it; InnoDB must find the page on disk and read it into memory before inserting, causing heavy random I/O;

② because writes are out of order, InnoDB must frequently split pages to make room for new rows; page splits move large amounts of data, and a single insert can modify at least three pages;

③ frequent page splits leave pages sparse and irregularly filled, eventually producing data fragmentation.
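The contrast above can be shown with a toy simulation (a rough model, not InnoDB itself): pages hold a fixed number of keys, a full page splits in half on a mid-page insert, and an insert past the current maximum key simply opens a fresh page, mimicking InnoDB's rightmost-split optimization for sequential keys. Sequential keys leave pages full; shuffled keys leave them roughly two-thirds full.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Toy model of clustered-index leaf pages; all names and numbers are illustrative.
public class PageSplitDemo {
    static final int CAPACITY = 16; // keys per page in this toy model

    static double averageFill(List<Long> keys) {
        List<List<Long>> pages = new ArrayList<>();
        pages.add(new ArrayList<>());
        for (long k : keys) {
            // locate the page whose key range covers k
            int p = pages.size() - 1;
            while (p > 0 && k < pages.get(p).get(0)) p--;
            List<Long> page = pages.get(p);
            int pos = Collections.binarySearch(page, k);
            if (pos < 0) pos = -pos - 1;
            if (page.size() < CAPACITY) {
                page.add(pos, k);
            } else if (p == pages.size() - 1 && pos == page.size()) {
                // sequential append: open a fresh page, the old page stays 100% full
                List<Long> fresh = new ArrayList<>();
                fresh.add(k);
                pages.add(fresh);
            } else {
                // random insert into a full page: split it down the middle
                List<Long> right = new ArrayList<>(page.subList(CAPACITY / 2, CAPACITY));
                page.subList(CAPACITY / 2, CAPACITY).clear();
                pages.add(p + 1, right);
                List<Long> target = k < right.get(0) ? page : right;
                int q = Collections.binarySearch(target, k);
                if (q < 0) q = -q - 1;
                target.add(q, k);
            }
        }
        long total = 0;
        for (List<Long> pg : pages) total += pg.size();
        return (double) total / (pages.size() * (double) CAPACITY);
    }

    public static void main(String[] args) {
        List<Long> seq = new ArrayList<>();
        for (long i = 0; i < 10_000; i++) seq.add(i);
        List<Long> rnd = new ArrayList<>(seq);
        Collections.shuffle(rnd, new Random(42));
        System.out.printf("sequential fill %.2f, random fill %.2f%n",
                averageFill(seq), averageFill(rnd));
    }
}
```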

After random keys (UUIDs or snowflake ids) have been loaded into the clustered index (InnoDB's default index organization), it is sometimes necessary to run OPTIMIZE TABLE to rebuild the table and repack the pages, which takes a certain amount of time.
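As a concrete example (using a table name from the test above), the rebuild looks like this; for InnoDB tables, OPTIMIZE TABLE is internally mapped to a table rebuild plus an index-statistics update:

```sql
-- Rebuild the UUID-keyed table to repack sparsely filled pages
-- and reclaim space left behind by page splits.
OPTIMIZE TABLE user_uuid;
```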

Conclusion: with InnoDB, insert rows in primary-key order whenever possible, and prefer a monotonically increasing clustered key value for new rows.

3. Disadvantages of auto-increment ids

So are auto-increment ids free of drawbacks? No; they have the following problems:

① if someone crawls your database, they can read your business growth off the auto-increment ids, making it easy to analyze your business situation;

② under a high-concurrency insert load, inserting in primary-key order creates visible lock contention: the upper end of the index becomes a hotspot because all inserts land there, and concurrent inserts compete for gap locks in that region;

③ the Auto_Increment lock mechanism causes contention for the auto-increment lock, which carries some performance cost.

Note:

To reduce Auto_Increment lock contention, tune the innodb_autoinc_lock_mode setting.
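For reference, a minimal configuration sketch: innodb_autoinc_lock_mode accepts 0 (traditional), 1 (consecutive, the default up to MySQL 5.7), and 2 (interleaved, the default in MySQL 8.0). Mode 2 gives the best insert concurrency but is only replication-safe with row-based binary logging:

```ini
# my.cnf (sketch)
[mysqld]
# 0 = traditional, 1 = consecutive, 2 = interleaved
innodb_autoinc_lock_mode = 2
binlog_format = ROW   # required for safe replication with mode 2
```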

Source: blog.csdn.net/weixin_45817985/article/details/131412257