- Requirements analysis
- Too many user requests -> distributed servers (spread requests across multiple servers)
- Too much load on a single database server -> a single database runs out of disk space, has limited processing capacity, and hits I/O bottlenecks
- A single table is too large -> CRUD operations become slow, indexes bloat, and queries time out
- Effect
- Multiple servers together hold the complete data set, raising the storage capacity and read/write speed limits of a single machine
- Each server node is called a shard
- Advantage: high throughput
- The higher the throughput, the more data can be read and written in the same amount of time
- A single node may only reach a throughput of 1 TB per day (limited by hardware, such as disk speed)
Vertical Split
- Vertical table split
- Split one table into multiple tables by field (column)
- Vertical database split
- Distribute the tables of one database across multiple databases (nodes)
- Handling in the project
- Vertical split of the user table: user_basic, user_profile
- Vertical split of the article table (the article content is long and only needed on the detail page): article_basic, article_content
- If a vertical database split is done later
- Tables that are queried together need to live in the same database, e.g. user-related tables in database 1 and article-related tables in database 2
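The vertical table split of the article data can be sketched as follows. This is a minimal illustration using an in-memory SQLite database. Only the table names article_basic and article_content come from the notes; the column names and the helper functions `list_articles` and `article_detail` are assumptions made for the example.

```python
import sqlite3

# In-memory database, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE article_basic (          -- small, frequently read columns
    article_id INTEGER PRIMARY KEY,
    title      TEXT NOT NULL,
    user_id    INTEGER NOT NULL
);
CREATE TABLE article_content (        -- long body, read only on the detail page
    article_id INTEGER PRIMARY KEY REFERENCES article_basic(article_id),
    content    TEXT NOT NULL
);
""")
conn.execute("INSERT INTO article_basic VALUES (1, 'Sharding notes', 42)")
conn.execute("INSERT INTO article_content VALUES (1, 'A very long body ...')")


def list_articles(conn):
    """List page: touches only the narrow article_basic table."""
    return conn.execute(
        "SELECT article_id, title FROM article_basic ORDER BY article_id"
    ).fetchall()


def article_detail(conn, article_id):
    """Detail page: joins the long content back in by primary key."""
    return conn.execute(
        "SELECT b.title, c.content "
        "FROM article_basic AS b "
        "JOIN article_content AS c ON c.article_id = b.article_id "
        "WHERE b.article_id = ?",
        (article_id,),
    ).fetchone()
```

The join in `article_detail` only works while both tables sit in the same database, which is why tables that are queried together should stay together if the databases are split later.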
Horizontal Split
- Horizontal table split
- Split 10 million records across two tables
- Ways to split: by time, by id, by region, or by hash modulo
- Horizontal database and table split: after the horizontal table split, distribute the sub-tables across multiple database nodes
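Routing by hash modulo can be sketched as below. This is an illustrative example, not a specific framework's API: the names `SHARD_COUNT`, `shard_for_id`, `shard_for_key`, and `table_name`, and the `base_N` table naming scheme, are all assumptions.

```python
import zlib

SHARD_COUNT = 2  # assumed number of sub-tables (or database nodes)


def shard_for_id(record_id: int, shard_count: int = SHARD_COUNT) -> int:
    """Integer ids can be routed directly with a modulo."""
    return record_id % shard_count


def shard_for_key(key: str, shard_count: int = SHARD_COUNT) -> int:
    """String keys are hashed first. crc32 is stable across processes,
    unlike Python's built-in hash() for strings."""
    return zlib.crc32(key.encode("utf-8")) % shard_count


def table_name(base: str, record_id: int) -> str:
    """Map a record id to the sub-table that stores it."""
    return f"{base}_{shard_for_id(record_id)}"
```

For example, `table_name("article", 7)` resolves to `article_1`. Changing `SHARD_COUNT` remaps most existing ids to a different sub-table, so a modulo scheme like this is hard to resize once data has been written.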
- Distributed ID
- Requirement: after a horizontal table split, ids across the sub-tables must not conflict
- Solutions
- UUID (Universally Unique Identifier). Disadvantages: long, and not trend-increasing (if the primary key is not increasing, index efficiency is relatively low)
- Database auto-increment primary key
- Scheme 1: a separate database is responsible only for generating primary keys. Disadvantage: once it goes down, the whole system is paralyzed
- Scheme 2: set an auto-increment step, with all tables using the same step. Disadvantage: the sharding rule cannot be changed afterwards, so the setup cannot be extended
- Redis
- incr("user_id") returns the incremented value
- There is no contention for the counter, because Redis executes commands on a single thread and can therefore guarantee atomicity
- Disadvantages
- Redis can go down
- Redis can easily lose data
- Snowflake algorithm
- An algorithm from Twitter whose goal is to generate a 64-bit integer
- Disadvantage: clock rollback. Machine clocks can drift, and although time synchronization corrects them, id generation may go wrong when the clock is moved back
- If a rollback occurs (current time < last recorded time), the algorithm throws an exception, and the caller waits a while before retrying
- Disable NTP time synchronization
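A Snowflake-style generator with the clock rollback check described above can be sketched as follows. The layout of 41 timestamp bits, 10 worker bits, and 12 sequence bits follows the commonly described Snowflake format. The epoch value, the class and exception names, and the injectable `clock` parameter are assumptions made for this example.

```python
import threading
import time

EPOCH_MS = 1_600_000_000_000  # arbitrary custom epoch in ms (assumption)
WORKER_BITS = 10              # machine / worker id
SEQUENCE_BITS = 12            # per-millisecond counter
MAX_WORKER_ID = (1 << WORKER_BITS) - 1
MAX_SEQUENCE = (1 << SEQUENCE_BITS) - 1


class ClockRollbackError(RuntimeError):
    """Raised when the current time is earlier than the last recorded time."""


class SnowflakeGenerator:
    def __init__(self, worker_id, clock=None):
        if not 0 <= worker_id <= MAX_WORKER_ID:
            raise ValueError("worker_id out of range")
        self.worker_id = worker_id
        # The clock is injectable so the rollback case can be simulated.
        self._clock = clock or (lambda: int(time.time() * 1000))
        self._last_ms = -1
        self._sequence = 0
        self._lock = threading.Lock()

    def next_id(self):
        with self._lock:
            now = self._clock()
            if now < self._last_ms:
                # Clock moved backwards: refuse to generate, caller retries later.
                raise ClockRollbackError(
                    f"clock moved back by {self._last_ms - now} ms"
                )
            if now == self._last_ms:
                self._sequence = (self._sequence + 1) & MAX_SEQUENCE
                if self._sequence == 0:
                    # Sequence exhausted in this millisecond: wait for the next one.
                    while now <= self._last_ms:
                        now = self._clock()
            else:
                self._sequence = 0
            self._last_ms = now
            return (
                ((now - EPOCH_MS) << (WORKER_BITS + SEQUENCE_BITS))
                | (self.worker_id << SEQUENCE_BITS)
                | self._sequence
            )
```

Because the timestamp occupies the high bits, ids from one worker increase over time, which keeps primary key indexes efficient. Each worker needs a distinct `worker_id`, otherwise two machines can produce the same id in the same millisecond.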
- Application in the project
- Later on, the data volume for user ids, article ids, and comment ids may become large
- Early on, the data volume and the number of requests are small, so no sharding is done