Is it better to use an auto-increment (INT) key or a UUID as the primary key of a database?

When I attended an Internet conference in Changsha, I asked a WeChat DBA whether an auto-increment key or a UUID is the better choice for a database primary key. The DBA replied that auto-increment is better, because it has a small footprint and indexes quickly. But is it always the best choice? I ran into the same dilemma on a recent project. I looked up a lot of material and asked a lot of people about it, and I have summarized what I found here, for reference only.

Discussions of auto-increment versus UUID mostly focus on speed and storage space. Here I also consider security and distributed deployment. The comparison is as follows:

    Advantages of using auto-increment as the primary key:
    1. Small storage footprint
    2. The best performance
    3. Easy to remember
    Disadvantages of using auto-increment as the primary key:
    1. With a very large amount of data, the key may exceed the value range of the column type.
    2. It is difficult (but not impossible) to use with distributed tables, especially when tables need to be merged.
    3. Low security: the values follow a regular sequence, so they are easy to guess and can be used to obtain data illegally.
    Advantages of using a UUID (GUID) as the primary key:
    1. It is unique, with a very small chance of collision.
    2. It suits insert and update operations on large amounts of data, especially in high-concurrency and distributed environments.
    3. Cross-server data merging is very convenient.
    4. High security
    Disadvantages of using a UUID (GUID) as the primary key:
    1. Large storage footprint (16 bytes), so it occupies more disk space.
    2. Lower performance
    3. Difficult to remember
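
To make the storage and security points above concrete, here is a small sketch using only the Python standard library. The 4-byte and 8-byte sizes assume the usual 32-bit INT and 64-bit BIGINT column types; the actual on-disk size of a UUID depends on whether the database stores it as raw binary or as text. The helper `next_sequential_id` is a hypothetical name used only for illustration.

```python
import uuid

# Assumed sizes of the usual fixed-width integer key types.
INT_BYTES = 4       # 32-bit signed integer
BIGINT_BYTES = 8    # 64-bit signed integer

# Largest value each signed type can hold before an auto-increment key overflows.
INT_MAX = 2**31 - 1
BIGINT_MAX = 2**63 - 1

key = uuid.uuid4()                   # random (version 4) UUID
uuid_binary_bytes = len(key.bytes)   # size when stored as raw binary
uuid_text_bytes = len(str(key))      # size when stored as hyphenated hex text


def next_sequential_id(current_id: int) -> int:
    """Hypothetical helper: a neighbouring auto-increment key is found by adding one."""
    return current_id + 1


print("INT bytes:", INT_BYTES, "max value:", INT_MAX)
print("BIGINT bytes:", BIGINT_BYTES, "max value:", BIGINT_MAX)
print("UUID bytes (binary):", uuid_binary_bytes)
print("UUID bytes (text):", uuid_text_bytes)
print("Key after 1000 is guessable:", next_sequential_id(1000))
```

A random UUID gives an attacker no such "next" value to try, which is the sense in which the list above rates it as more secure. It is not a substitute for access control.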

How should we choose in the practical application of each project? As follows:

    1. The project is stand-alone and the amount of data is relatively large (millions of rows): use auto-increment. In this case it is best to consider security and take some protective measures.
    2. The project is stand-alone, the amount of data is not that large, and the speed and storage requirements are not high: use UUID.
    3. The project is distributed: UUID is preferred. Distributed projects of this kind generally do not have strict speed and storage requirements.
    4. The project is distributed, the amount of data reaches tens of millions of rows or more, and there are requirements for speed and storage: auto-increment can be used.

Someone may ask: why not always use the fourth option, since it meets all the requirements of the first three? Because handling auto-increment keys in a distributed system is quite complicated (the specific schemes can be found by searching, for example on Baidu), and when resources are limited the first three, simpler options are enough.
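
The article does not say which distributed scheme it has in mind. One common approach, shown here only as an assumed illustration, gives every node the same step and a different starting offset so that their sequences never overlap (MySQL exposes this idea through its `auto_increment_increment` and `auto_increment_offset` settings). The function name `node_id_sequence` and its parameters are hypothetical.

```python
from itertools import count, islice


def node_id_sequence(node_index: int, node_count: int, start: int = 1):
    """Yield the IDs owned by one node: start + node_index, then step by node_count."""
    if not 0 <= node_index < node_count:
        raise ValueError("node_index must be in the range [0, node_count)")
    return count(start + node_index, node_count)


# Three nodes, each taking every third value.
node_a = list(islice(node_id_sequence(0, 3), 4))
node_b = list(islice(node_id_sequence(1, 3), 4))
node_c = list(islice(node_id_sequence(2, 3), 4))

print(node_a, node_b, node_c)
```

The sketch also hints at why this is harder than it looks: the step is fixed by the node count, so adding a node later means renegotiating the offsets of all existing nodes.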

So must the primary key be either auto-increment or a UUID? Is there a better solution? There is: Snowflake.

In fact, most projects can use Twitter's Snowflake to generate primary keys. A Snowflake ID sits between auto-increment and UUID: it has a small storage footprint (a 64-bit integer), is fast to generate, works in distributed systems, and is roughly ordered by time. It is said that Snowflake can generate 260,000 IDs per second and can be deployed on up to 1024 nodes. The R&D team can provide Snowflake as the underlying primary-key tool for team members to use.
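
Below is a minimal Snowflake-style generator, a sketch rather than Twitter's actual implementation. It assumes the commonly described layout of a 41-bit millisecond timestamp, a 10-bit worker ID (which is where the 1024-node limit comes from) and a 12-bit per-millisecond sequence. The class name `SnowflakeGenerator` and the epoch value are assumptions chosen for illustration; any fixed past timestamp would work as the epoch.

```python
import threading
import time

EPOCH_MS = 1288834974657          # assumed custom epoch, in milliseconds
WORKER_BITS = 10                  # 2**10 = 1024 possible workers
SEQUENCE_BITS = 12                # 2**12 = 4096 IDs per millisecond per worker
MAX_WORKER_ID = (1 << WORKER_BITS) - 1
MAX_SEQUENCE = (1 << SEQUENCE_BITS) - 1


class SnowflakeGenerator:
    """Hypothetical Snowflake-style ID generator for a single worker."""

    def __init__(self, worker_id: int):
        if not 0 <= worker_id <= MAX_WORKER_ID:
            raise ValueError(f"worker_id must be between 0 and {MAX_WORKER_ID}")
        self.worker_id = worker_id
        self._last_ms = -1
        self._sequence = 0
        self._lock = threading.Lock()

    def next_id(self) -> int:
        with self._lock:
            now = int(time.time() * 1000)
            if now < self._last_ms:
                # A real deployment needs a policy for clock rollback; here we just fail.
                raise RuntimeError("clock moved backwards")
            if now == self._last_ms:
                self._sequence = (self._sequence + 1) & MAX_SEQUENCE
                if self._sequence == 0:
                    # Sequence exhausted for this millisecond: spin until the next one.
                    while now <= self._last_ms:
                        now = int(time.time() * 1000)
            else:
                self._sequence = 0
            self._last_ms = now
            timestamp_part = (now - EPOCH_MS) << (WORKER_BITS + SEQUENCE_BITS)
            worker_part = self.worker_id << SEQUENCE_BITS
            return timestamp_part | worker_part | self._sequence


generator = SnowflakeGenerator(worker_id=7)
print(generator.next_id())
```

Each worker must be given a distinct `worker_id`; how that ID is assigned (configuration, a coordination service, and so on) is outside this sketch.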


