Database - Capacity (repost)

Typical database architecture design and practice

1).  Single-Database Architecture

2).  Grouping Architecture

     What is grouping?

     A : Grouping is the most common database architecture: one master with multiple slaves, master-slave synchronization, and read-write separation:

  • user-service: still a user center service

  • user-db-M (master): the master database, which serves writes

  • user-db-S (slave): the slave database, which serves reads

     The cluster formed by the master and its slaves is called a "group".
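A minimal sketch of how a service might route statements inside one group, assuming a hypothetical `GroupRouter` helper and two slaves named `user-db-S1` and `user-db-S2` (the text names only `user-db-M` and `user-db-S`). Plain strings stand in for real database connections:

```python
import itertools


class GroupRouter:
    """Send writes to the master and spread reads across the slaves."""

    def __init__(self, master, slaves):
        self.master = master
        # Fall back to the master for reads if no slave is configured.
        self._read_cycle = itertools.cycle(slaves or [master])

    def for_write(self):
        # The master is the single write point of the group.
        return self.master

    def for_read(self):
        # Round-robin over the slaves (an assumed policy, not the only one).
        return next(self._read_cycle)


router = GroupRouter("user-db-M", ["user-db-S1", "user-db-S2"])
```

Round-robin is only one possible read policy. Because binlog replication is usually asynchronous, a read sent to a slave right after a write may not see that write yet.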

     What are the characteristics of a group?

     A : Within the database cluster of one group:

  • The master and slaves synchronize data through the binlog

  • All instances have exactly the same database schema

  • All instances store the same data; the slaves are essentially copies of the master

     What exactly does the grouping architecture solve?

     A : Most Internet services read far more than they write, so database reads are often the first performance bottleneck. If you want to:

  • Linearly improve database read performance

  • Improve database write performance by eliminating read-write lock conflicts

  • Achieve "read high availability" through redundant slave databases

      then the grouping architecture can be used. Note that in a grouping architecture the master database is still a single point for writes.

      To sum up in one sentence: grouping is the architecture design used to solve the problem of "high database read and write concurrency".

 3).  Sharding Architecture

      What is sharding?

      A : Sharding is the horizontally split database architecture that people often talk about:

  • user-service: still a user center service

  • user-db1: the first of the 2 horizontal parts

  • user-db2: the second of the 2 horizontal parts

      After sharding, multiple database instances will also form a database cluster.

      For horizontal segmentation, should you split into separate databases or separate tables?

      A : Splitting into separate databases is strongly recommended over splitting into separate tables, because:

  • Split tables still share the same database files, so they still compete for disk IO

  • Separate databases can easily be migrated to different database instances, or even different machines, which scales better

      Which algorithm should be used for horizontal segmentation?

      A : Common horizontal segmentation algorithms include the "range method" and the "hash method":

      The range method: based on uid, the business primary key of the user center, the data is horizontally divided into two database instances:

  • user-db1: stores data with uid from 0 to 10 million

  • user-db2: stores data with uid from 10 million to 20 million
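The range method can be sketched as a lookup over sorted range boundaries. This assumes the two ranges are adjacent and half-open, [0, 10 million) for user-db1 and [10 million, 20 million) for user-db2; the text does not say which shard owns the boundary value, and `shard_by_range` is a hypothetical helper name:

```python
import bisect

# Exclusive upper bound of each shard's uid range (assumed half-open ranges).
UPPER_BOUNDS = [10_000_000, 20_000_000]
SHARDS = ["user-db1", "user-db2"]


def shard_by_range(uid):
    """Return the name of the shard whose uid range contains `uid`."""
    if uid < 0:
        raise ValueError("uid must be non-negative")
    # Count how many upper bounds are <= uid; that count is the shard index.
    index = bisect.bisect_right(UPPER_BOUNDS, uid)
    if index == len(SHARDS):
        raise ValueError("uid is beyond the last configured range")
    return SHARDS[index]
```

With this layout, adding capacity means appending a new boundary and shard name; existing data does not move.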

      The hash method: also based on uid, the business primary key of the user center, the data is horizontally divided into two database instances:

  • user-db1: stores data whose uid modulo 2 is 1

  • user-db2: stores data whose uid modulo 2 is 0

      Both of these methods are used on the Internet, among which hashing is more widely used.
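A minimal sketch of the hash method, taking the uid modulo the number of shards. The mapping of remainder 1 to user-db1 and remainder 0 to user-db2 follows the list above; `shard_by_hash` is a hypothetical helper name:

```python
# Remainder of uid divided by the shard count -> shard name.
SHARDS_BY_REMAINDER = {1: "user-db1", 0: "user-db2"}


def shard_by_hash(uid):
    """Return the shard that owns `uid` under modulo-based sharding."""
    return SHARDS_BY_REMAINDER[uid % len(SHARDS_BY_REMAINDER)]
```

Sequential uids alternate between the two shards, so data and load spread evenly. The trade-off is that changing the number of shards changes the modulus, which moves existing uids to different shards.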

      What are the characteristics of sharding?

      A : Within a database cluster formed by sharding:

  • The instances have no direct connection to one another, unlike a master and slave, which synchronize through the binlog

  • All instances also have exactly the same database schema

  • The data stored by different instances does not overlap, and the union of the data across all instances makes up the global data

      What exactly does the sharding architecture solve?

      A : Most Internet services hold a large amount of data, and the capacity of a single database can easily become a bottleneck. In that case, sharding can:

  • Linearly improve database write performance. Note that the grouping architecture cannot linearly improve database write performance.

  • Reduce the data volume of a single database

      In one sentence: sharding is the architecture design used to solve the problem of "large database data volume".

  4).  Grouping + Sharding Architecture

     If the business has high read and write concurrency and also a large amount of data, it is usually necessary to use a database architecture that combines grouping and sharding:

  • Reduce the data volume of a single database through sharding, and linearly improve the write performance of the database

  • Linearly improve the read performance of the database by grouping and ensure the high availability of the read library
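The combination can be sketched as two routing steps: pick the shard by uid, then pick the master or a slave inside that shard's group. The instance names (`user-db1-M`, `user-db1-S1`, and so on) and the `Group` and `ShardedCluster` classes are hypothetical, and modulo sharding with round-robin reads is an assumed policy:

```python
import itertools


class Group:
    """One shard: a master for writes plus slaves for reads."""

    def __init__(self, master, slaves):
        self.master = master
        # `slaves` must be non-empty in this sketch.
        self._reads = itertools.cycle(slaves)

    def next_slave(self):
        return next(self._reads)


class ShardedCluster:
    """Pick a group by uid modulo, then a node inside that group."""

    def __init__(self, groups_by_remainder):
        self.groups = groups_by_remainder

    def _group_for(self, uid):
        return self.groups[uid % len(self.groups)]

    def for_write(self, uid):
        return self._group_for(uid).master

    def for_read(self, uid):
        return self._group_for(uid).next_slave()


cluster = ShardedCluster({
    1: Group("user-db1-M", ["user-db1-S1", "user-db1-S2"]),
    0: Group("user-db2-M", ["user-db2-S1", "user-db2-S2"]),
})
```

Each shard keeps its own single write point, so total write capacity grows with the number of shards while read capacity grows with the number of slaves per shard.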

 5).  Vertical segmentation

      In addition to horizontal segmentation, vertical segmentation is also a common type of database architecture design, and vertical segmentation is generally closely integrated with business.

      Taking the user center as an example, vertical segmentation can be done like this:

      User(uid, uname, passwd, sex, age, …)

      User_EX(uid, intro, sign, …)

  • Both vertically split tables use uid as the primary key

  • Login name, password, gender, age and similar attributes are placed in one vertical table (database)

  • Self-introduction, personal signature and similar attributes are placed in another vertical table (database)
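A small sketch of this split applied to one in-memory record, using only the columns the text names (the elided columns are left out). `split_user` and `join_user` are hypothetical helpers; in a real system the split would live in the table definitions:

```python
# Columns named in the text; uid appears in both tables as the primary key.
USER_COLUMNS = ("uid", "uname", "passwd", "sex", "age")
USER_EX_COLUMNS = ("uid", "intro", "sign")


def split_user(record):
    """Split one full user record into its User and User_EX parts."""
    user = {c: record[c] for c in USER_COLUMNS if c in record}
    user_ex = {c: record[c] for c in USER_EX_COLUMNS if c in record}
    return user, user_ex


def join_user(user, user_ex):
    """Rebuild the full record; both parts must share the same uid."""
    if user["uid"] != user_ex["uid"]:
        raise ValueError("parts belong to different users")
    return {**user, **user_ex}
```

Reading the full profile now takes two lookups by uid, which is the cost of keeping the frequently accessed table narrow.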

      How to do vertical segmentation?

      A : When vertically segmenting data according to the business, two properties of each attribute are generally considered, its "length" and its "access frequency":

  • Attributes that are short and frequently accessed are put together

  • Attributes that are long and infrequently accessed are put together

      This is because the database loads data into memory (the buffer) row by row. With limited memory, a table that holds only short, frequently accessed attributes fits more rows in the buffer, so the hit rate is higher, disk IO is lower, and database performance improves.
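A back-of-the-envelope calculation makes the effect concrete. The buffer size and row sizes below are made-up illustrative numbers, and page and row overhead are ignored:

```python
def rows_in_buffer(buffer_bytes, row_bytes):
    """How many whole rows of a given size fit in the buffer."""
    return buffer_bytes // row_bytes


BUFFER_BYTES = 1024 ** 3  # assumed 1 GiB of buffer memory

# Hypothetical row sizes: 100 bytes for the short attributes,
# 2100 bytes when the long attributes are stored in the same row.
narrow_rows = rows_in_buffer(BUFFER_BYTES, 100)
wide_rows = rows_in_buffer(BUFFER_BYTES, 2100)

print(narrow_rows, wide_rows)
```

Under these assumptions the narrow table keeps roughly twenty times as many rows in memory, which is where the higher hit rate comes from.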

      What are the characteristics of vertical segmentation?

      A : Vertical and horizontal segmentation have similarities, but they are not the same:

  • The instances have no direct connection to one another, that is, there is no binlog synchronization

  • The instances have different database schemas

  • The data stored by different instances shares at least one column, generally the business primary key. The union of the data across all instances makes up the global data

      What problem does vertical segmentation solve?

      A : Vertical segmentation can reduce the amount of data in a single database and can also reduce disk IO to improve throughput, but it is closely tied to the business, and not every business can be vertically segmented.

 6).  Summary

       The article is long; I hope you remember at least the following points:

  • Use a single database in the early stage of a business

  • When read pressure is high and reads need high availability, use grouping

  • When the data volume is large and writes need to scale linearly, use sharding

  • Put short, frequently accessed attributes together when splitting vertically

 

 

 

The content is reposted from the public account: the road of architects

 
