Interview question
After sharding your databases and tables, how do you handle primary key IDs?
What the interviewer is looking for
Once you shard, ID generation is the next problem you inevitably face. If the data is split across multiple tables and each table's auto-increment counter starts from 1, the IDs will collide, so you need a globally unique ID. This is something you must deal with in a real production environment.
Analysis of the question
Database auto-increment ID
In this approach, every time your system needs an ID, it inserts a row with no business meaning into a table in one dedicated database and takes the auto-increment ID that the database generates. It then uses that ID when writing to the appropriate sharded database and table.
The advantage is that it is simple and anyone can use it. The drawback is that a single database generates every ID, so under high concurrency it becomes a bottleneck. If you want to improve on it, you can run a dedicated service that fetches the current maximum ID, allocates several consecutive IDs in one go, returns them as a batch, and then updates the stored maximum to the value after that batch. Either way, it still relies on a single database.
Suitable scenario: there are two reasons to shard, either single-database concurrency is too high or single-database data volume is too large. If your concurrency is not high and you are sharding only because of data volume, you can use this approach: peak load is perhaps a few hundred requests per second, so a single dedicated database and table can generate the auto-increment primary keys.
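The batch improvement described above can be sketched roughly as follows. This is a minimal illustration, not a production design: the `id_segment` table, the `SegmentIdAllocator` class, and the `step` parameter are all hypothetical names, and an in-memory SQLite database stands in for the single ID database.

```python
import sqlite3
import threading


class SegmentIdAllocator:
    """Hands out IDs from ranges reserved in one database row (hypothetical schema)."""

    def __init__(self, conn, step=1000):
        self.conn = conn
        self.step = step      # how many IDs to reserve per database round trip
        self.next_value = 0   # next ID to hand out
        self.max_value = -1   # last ID of the currently reserved range (inclusive)
        self.lock = threading.Lock()
        conn.execute(
            "CREATE TABLE IF NOT EXISTS id_segment "
            "(name TEXT PRIMARY KEY, max_id INTEGER NOT NULL)"
        )
        conn.execute("INSERT OR IGNORE INTO id_segment (name, max_id) VALUES ('order', 0)")
        conn.commit()

    def _reserve(self):
        # Advance the stored maximum by `step` in one transaction and claim that range.
        with self.conn:
            self.conn.execute(
                "UPDATE id_segment SET max_id = max_id + ? WHERE name = 'order'",
                (self.step,),
            )
            (new_max,) = self.conn.execute(
                "SELECT max_id FROM id_segment WHERE name = 'order'"
            ).fetchone()
        self.next_value = new_max - self.step + 1
        self.max_value = new_max

    def allocate(self):
        with self.lock:
            if self.next_value > self.max_value:
                self._reserve()  # only every `step`-th call touches the database
            value = self.next_value
            self.next_value += 1
            return value


conn = sqlite3.connect(":memory:")
allocator = SegmentIdAllocator(conn, step=3)
print([allocator.allocate() for _ in range(7)])
```

With `step=3`, seven calls cost three database round trips instead of seven. The trade-off is that IDs reserved but not handed out before a restart are lost, which leaves gaps in the sequence.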
UUID
The benefit is that it is generated locally, without going to a database. The downside is that a UUID is long and takes up a lot of space, so it performs poorly as a primary key. More importantly, UUIDs are not ordered, so inserts cause too many random writes into the B+ tree index and frequently force changes to the tree structure, which degrades performance.
Suitable scenario: if you need randomly generated file names, serial numbers, and the like, a UUID works well, but do not use a UUID as a primary key.
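As a small illustration of the file-name use case, here is a sketch using Python's standard `uuid` module. The `random_file_name` helper is a hypothetical name introduced only for this example.

```python
import uuid


def random_file_name(extension):
    """Return a locally generated file name; no database round trip is needed."""
    return f"{uuid.uuid4().hex}.{extension}"


print(random_file_name("png"))

# A UUID is 128 bits (16 bytes), twice the size of a 64-bit integer key.
print(len(uuid.uuid4().bytes))

# Random (version 4) UUIDs carry no ordering: sorting them does not reflect
# the order in which they were created.
created = [uuid.uuid4() for _ in range(5)]
print(created == sorted(created))
```

The last line will almost always print `False`, which is the property that hurts B+ tree inserts when such values are used as a primary key.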
Current system time
This simply uses the current time as the ID. The problem is that under high concurrency, for example thousands of requests per second, duplicates will occur, so on its own this is clearly unsuitable and is generally not considered.
Suitable scenario: if you do use this approach, you usually concatenate the current time with other business fields to form the ID. If your business finds that acceptable, it is workable: you can combine other business field values with the current time to form a globally unique number.
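A minimal sketch of concatenating the time with business fields might look like this. The `business_number` function, the `prefix` and `user_id` fields, and the chosen widths are all assumptions for illustration; which fields make the result unique depends entirely on your business rules.

```python
from datetime import datetime


def business_number(prefix, user_id, now=None):
    """Build a number from a business prefix, a millisecond timestamp, and a user ID."""
    now = now or datetime.now()
    # yyyyMMddHHmmss plus three digits of milliseconds
    timestamp = now.strftime("%Y%m%d%H%M%S") + f"{now.microsecond // 1000:03d}"
    return f"{prefix}{timestamp}{user_id:08d}"


print(business_number("OD", 42))
```

This is unique only as long as the same user cannot produce two such numbers within the same millisecond, which is exactly the kind of constraint the business has to accept.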
Snowflake algorithm
Snowflake is the distributed ID generation algorithm open-sourced by Twitter. It produces a 64-bit long ID: 1 bit is unused, 41 bits hold a millisecond timestamp, 10 bits hold the worker machine ID, and 12 bits hold a sequence number.
- 1 bit: unused. Why? In binary, a leading bit of 1 means the number is negative, and the IDs we generate are all positive, so the first bit is always 0.
- 41 bits: the timestamp in milliseconds. 41 bits can represent values up to 2^41 - 1, that is, 2^41 - 1 milliseconds, which works out to about 69 years.
- 10 bits: the worker machine ID, meaning the service can be deployed on at most 2^10 = 1024 machines. Of these 10 bits, 5 represent the data center ID and 5 represent the machine ID, so there can be at most 2^5 = 32 data centers, each with at most 2^5 = 32 machines.
- 12 bits: distinguishes IDs generated within the same millisecond. The largest integer 12 bits can represent is 2^12 - 1 = 4095, so within one millisecond the sequence can tell apart 4096 different IDs (0 through 4095).
0 | 0001100 10100010 10111110 10001001 01011100 00 | 10001 | 1 1001 | 0000 00000000
What does this mean? The 41 bits hold the current timestamp in milliseconds. The next 5 bits are the data center ID you pass in (at most 32 values, 0 to 31), and the following 5 bits are the machine ID you pass in (likewise 0 to 31). The remaining 12 bits are the sequence number: if the previous ID was generated within the same millisecond, the sequence is incremented for you, allowing up to 4096 IDs per millisecond.
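To make the layout concrete, the sketch below splits the sample bit string shown above into its five fields and checks the capacity figures. The `decode` and `encode` helpers are hypothetical names used only for this illustration.

```python
LAYOUT = "0 | 0001100 10100010 10111110 10001001 01011100 00 | 10001 | 1 1001 | 0000 00000000"
FIELD_NAMES = ["sign", "timestamp", "datacenter", "worker", "sequence"]
FIELD_WIDTHS = [1, 41, 5, 5, 12]


def decode(layout):
    """Split a '|'-separated bit string into the five Snowflake fields."""
    groups = [part.replace(" ", "") for part in layout.split("|")]
    if [len(g) for g in groups] != FIELD_WIDTHS:
        raise ValueError("layout does not match the 1/41/5/5/12 split")
    return {name: int(bits, 2) for name, bits in zip(FIELD_NAMES, groups)}


def encode(fields):
    """Pack the fields back into one integer: shifts are 12, 12+5 and 12+5+5 bits."""
    return (
        (fields["timestamp"] << 22)
        | (fields["datacenter"] << 17)
        | (fields["worker"] << 12)
        | fields["sequence"]
    )


fields = decode(LAYOUT)
print(fields["datacenter"], fields["worker"], fields["sequence"])  # prints: 17 25 0

# Capacity of each field.
years = (2**41 - 1) / (1000 * 60 * 60 * 24 * 365)
print(f"{years:.1f} years of millisecond timestamps")
print(2**5, "data centers,", 2**5, "machines each,", 2**12, "IDs per millisecond")
```

In this sample the data center ID is 17, the machine ID is 25, and the sequence number is 0, meaning this would be the first ID generated in its millisecond.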
So you can build a service on top of a generator that implements this algorithm: for each machine in each data center, initialize one worker, with that machine's sequence number starting at 0. Each time the service receives a request asking a given machine in a given data center to generate an ID, it finds the corresponding worker and has it generate one.
When you build your own company's service on Snowflake, the fields do not have to be a data center ID and a machine ID. You have 5 bits + 5 bits reserved, and you can replace them with something else that carries business meaning.
The Snowflake algorithm is fairly reliable, so if you really need distributed ID generation under high concurrency, it should perform well. It is generally sufficient for scenarios with tens of thousands of concurrent requests per second.
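The worker described above could be sketched as follows. This is an illustrative sketch under stated assumptions, not Twitter's implementation: the class name `SnowflakeIdWorker`, the custom epoch, and the choice to raise an error when the clock moves backwards are all assumptions made for this example.

```python
import threading
import time


class SnowflakeIdWorker:
    EPOCH_MS = 1577836800000  # 2020-01-01 UTC; an arbitrary custom epoch (assumption)
    MAX_DATACENTER_ID = (1 << 5) - 1  # 31
    MAX_WORKER_ID = (1 << 5) - 1      # 31
    SEQUENCE_MASK = (1 << 12) - 1     # 4095
    WORKER_SHIFT = 12
    DATACENTER_SHIFT = 17
    TIMESTAMP_SHIFT = 22

    def __init__(self, datacenter_id, worker_id):
        if not 0 <= datacenter_id <= self.MAX_DATACENTER_ID:
            raise ValueError("datacenter_id must be between 0 and 31")
        if not 0 <= worker_id <= self.MAX_WORKER_ID:
            raise ValueError("worker_id must be between 0 and 31")
        self.datacenter_id = datacenter_id
        self.worker_id = worker_id
        self.sequence = 0
        self.last_ms = -1
        self.lock = threading.Lock()

    @staticmethod
    def _now_ms():
        return time.time_ns() // 1_000_000

    def next_id(self):
        with self.lock:
            now = self._now_ms()
            if now < self.last_ms:
                # One possible policy (assumption): refuse rather than risk duplicates.
                raise RuntimeError("clock moved backwards; refusing to generate an ID")
            if now == self.last_ms:
                # Same millisecond: bump the sequence, wrapping at 4096.
                self.sequence = (self.sequence + 1) & self.SEQUENCE_MASK
                if self.sequence == 0:
                    # Sequence exhausted: wait for the next millisecond.
                    while now <= self.last_ms:
                        now = self._now_ms()
            else:
                self.sequence = 0
            self.last_ms = now
            return (
                ((now - self.EPOCH_MS) << self.TIMESTAMP_SHIFT)
                | (self.datacenter_id << self.DATACENTER_SHIFT)
                | (self.worker_id << self.WORKER_SHIFT)
                | self.sequence
            )


worker = SnowflakeIdWorker(datacenter_id=17, worker_id=25)
print(worker.next_id())
```

Because the timestamp occupies the high bits, IDs from one worker increase over time, which keeps B+ tree inserts mostly sequential. Uniqueness across machines relies on each worker being given a distinct data center ID and machine ID pair.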
Author: Yang Libin
Source: https://github.com/doocs/advanced-java