Several interview questions on database and table sharding

Database and table sharding is a key part of high-concurrency, high-availability systems, and Internet companies often ask about it in interviews.

Why shard databases and tables (when designing a high-concurrency system, how do you design the database layer)?

First, to be clear, splitting databases and splitting tables are two different things and two separate concepts. Both are coping strategies designed to keep the database service from going down when too many requests (inserts, deletes, updates, and queries) arrive at the same time.

Why split databases

As a rule of thumb, a single database supports at most about 2,000 concurrent requests, and it is best kept at around 1,000. If you need to handle 20,000 concurrent requests, you have to scale out: split the data of one database across multiple databases and, at access time, route each request to a single database according to some condition. This relieves the performance pressure on any one database.
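The arithmetic behind that estimate can be sketched as follows. This is only a back-of-the-envelope calculation: the function name `databases_needed` and its parameters are made up for illustration, and the per-database figures are the rule-of-thumb numbers from the paragraph above, not measured limits.

```python
import math


def databases_needed(peak_concurrency: int, per_db_concurrency: int = 1000) -> int:
    """Return how many databases are needed so that each one stays at or
    below per_db_concurrency concurrent requests (hypothetical helper)."""
    if peak_concurrency <= 0 or per_db_concurrency <= 0:
        raise ValueError("both values must be positive")
    return math.ceil(peak_concurrency / per_db_concurrency)


print(databases_needed(20_000))         # each database kept at about 1,000
print(databases_needed(20_000, 2_000))  # each database pushed to about 2,000
```

With these figures, 20,000 concurrent requests call for 20 databases at the comfortable level, or 10 if every database is pushed to its ceiling.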

Why split tables

Splitting tables works the same way: if a single table holds too much data, the execution performance of SQL statements suffers. Splitting tables means dividing one table's data across multiple tables according to some strategy. Queries then use the same strategy to go to the table that holds the relevant data, so each query covers a narrower range. For example, split by user id so that all of one user's data lives in one table: find the table from the user id first, then perform the CRUD operation on that table. This keeps the amount of data in each table within a certain range and improves SQL execution performance.
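A minimal sketch of that lookup, assuming integer user ids and a fixed number of tables. The table name `user_order` and the function `table_for_user` are hypothetical names chosen for this example, and modulo is just one possible strategy.

```python
def table_for_user(user_id: int, table_count: int = 4,
                   base_name: str = "user_order") -> str:
    """Pick the physical table that holds all rows of one user.

    Assumption: user_id is a non-negative integer and the tables are
    named <base_name>_0 .. <base_name>_<table_count - 1>.
    """
    if table_count <= 0:
        raise ValueError("table_count must be positive")
    return f"{base_name}_{user_id % table_count}"


# Every CRUD statement for user 10 goes to the same table.
table = table_for_user(10)
print(f"SELECT * FROM {table} WHERE user_id = 10")
```

Because the same function is used for writes and reads, a query for one user touches only one of the smaller tables.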

Which middleware is used for database and table sharding? What are the advantages and disadvantages of the different options?

Common sharding middleware includes cobar, TDDL, atlas, sharding-jdbc, mycat, and so on.

cobar

cobar was developed and open-sourced by Alibaba's B2B team. It is a proxy-layer solution that sits between the application servers and the database servers. Applications access the cobar cluster through a JDBC driver; cobar breaks the SQL down according to the statement and the sharding rules, then distributes it to different database instances in the MySQL cluster for execution. cobar does not support read/write splitting, stored procedures, cross-database joins, or pagination. It was usable in its early years, but it has not been updated recently, almost nobody uses it anymore, and it can be considered obsolete.

TDDL

TDDL was developed by the Taobao team and is a client-layer solution. It supports basic CRUD syntax and read/write splitting, but not joins or multi-table queries. It is not widely used at present, because it also depends on Taobao's diamond configuration management system.

atlas

atlas was open-sourced by 360 and is a proxy-layer solution. Some companies used it in the past, but the community last maintained it about five years ago, and almost no company uses it now.

sharding-jdbc

sharding-jdbc was open-sourced by Dangdang and is a client-layer solution. This middleware supports a fairly broad range of SQL syntax without many restrictions. Since version 2.0 it supports database and table sharding, read/write splitting, distributed id generation, and flexible transactions (best-effort delivery transactions and TCC transactions). The community is still developing and maintaining it and is fairly active, so it remains a viable choice today.

mycat

mycat is a rework of cobar and is a proxy-layer solution. Its feature set is quite complete, and it is currently a very popular database middleware. The community is very active and updates are frequent. Compared with sharding-jdbc, though, it is younger and less battle-tested.

Summary

In summary, sharding-jdbc and mycat are the two options worth considering.

The advantage of a client-layer solution such as sharding-jdbc is that nothing extra has to be deployed, so operation and maintenance costs are relatively low. Because requests are not forwarded through an additional proxy layer, performance is also high. However, when an upgrade is needed, every system has to upgrade its version and be released again, because each system is coupled to the sharding-jdbc dependency.

The drawback of a proxy-layer solution such as mycat is that the proxy has to be deployed, so operation and maintenance costs are higher. The advantage is that it is transparent to each project (decoupled): when an upgrade is needed, only the middleware has to be dealt with.

Generally speaking, either solution is a valid choice. Small and medium-sized companies are better off with sharding-jdbc, because a client-layer solution is lightweight and cheap to maintain. Large companies are better off with mycat, because a proxy-layer solution can serve many systems and projects under heavy use. The maintenance cost is relatively higher, but medium and large companies are not short of the staff for it.

Specifically, how is a database split vertically or horizontally?

The concept of horizontal splitting

Horizontal splitting means splitting the data of one table across multiple tables in multiple databases. The table structure is the same in every database, but each one stores different rows; the data of all the databases together makes up the full data set. The point of horizontal splitting is to spread the table's data evenly across the databases, so that multiple databases carry higher concurrency and their combined storage capacity allows for expansion.

Vertical splitting

Vertical splitting means splitting a table with many fields into multiple tables, possibly placed on multiple databases. Each resulting table has a different structure and contains only some of the fields. In general, the small number of frequently accessed fields go into one table, and the larger number of rarely accessed fields go into another. Because the database has a cache, the fewer fields each frequently accessed row has, the more rows fit in the cache and the better the performance. This kind of split is mostly done at the table level.
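As an illustration, the sketch below splits one wide row into a narrow "hot" row and a "cold" row that share the primary key. The field names, the `HOT_FIELDS` set, and the `split_row` helper are all invented for this example; a real schema would choose the hot fields from its own access patterns.

```python
# Assumption: these are the frequently accessed fields of a wide user table.
HOT_FIELDS = {"id", "name", "status"}


def split_row(row: dict, hot_fields: set = HOT_FIELDS, key: str = "id"):
    """Split one wide row into (hot_row, cold_row).

    Both halves keep the primary key so they can be joined back together.
    """
    hot = {k: v for k, v in row.items() if k in hot_fields}
    cold = {k: v for k, v in row.items() if k not in hot_fields or k == key}
    return hot, cold


wide_row = {
    "id": 1,
    "name": "alice",
    "status": "active",
    "bio": "a long text field that is rarely read",
    "address": "rarely read as well",
}
hot_row, cold_row = split_row(wide_row)
print(hot_row)   # goes to the narrow, frequently read table
print(cold_row)  # goes to the wide, rarely read table
```

The hot row is much smaller than the original, which is what lets the cache hold more of the rows that are actually read often.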

Scenarios for horizontal and vertical splitting

Splitting at the table level simply means dividing a table: one table is split into N tables so that the amount of data in each stays within a certain range, which keeps SQL performance up. Otherwise, the more data a single table holds, the worse SQL performs. Around 2 million rows per table is typical; do not go much beyond that. If your SQL is more complex, keep the row count of a single table even lower.

Whether you split databases or tables, all the mainstream database middleware supports it. Once you have sharded, the middleware automatically routes each request to the corresponding database and table based on the value of a field you specify. You then only have to decide how your project should be sharded. In general, vertical splitting is done at the table level, that is, by splitting off some of the fields of a particularly wide table. Horizontal splitting is used when a single database or table can no longer carry the concurrency or the data volume, so the data is distributed to different databases and tables by some field.
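The routing step can be pictured with the sketch below. It is not the algorithm of any particular middleware, only one common way to map a sharding field to a database and a table; the names `route`, `db_`, and `order_` are assumptions made for this example.

```python
def route(shard_key: int, db_count: int, tables_per_db: int,
          table_base: str = "order"):
    """Map a sharding field value to a (database, table) pair.

    Assumption: the database index comes from the key modulo db_count, and
    the table index comes from the quotient, so keys that land in the same
    database are still spread over all of its tables.
    """
    db_index = shard_key % db_count
    table_index = (shard_key // db_count) % tables_per_db
    return f"db_{db_index}", f"{table_base}_{table_index}"


db, table = route(37, db_count=4, tables_per_db=8)
print(db, table)
```

Using the quotient for the table index avoids the case where the same modulo is applied twice and only a few tables in each database ever receive data.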

Two ways to shard databases and tables

Here are two sharding schemes and their advantages and disadvantages.

1. Split by range. For example, split databases and tables by time, so that each database or table stores data for one continuous time range. This approach is rarely used, because it easily produces hot spots: most of the traffic lands on the newest data. Its advantage is that expansion is very simple. Just prepare a database for each month in advance, and when the new month arrives, data is written to the new database automatically. Its disadvantage is that if most requests access the newest data, the sharding only expands capacity and does nothing for high concurrency.

2. Distribute by hash. Data is spread evenly according to the hash value of some field; this is the more commonly used approach. Its advantage is that data volume and request pressure are shared evenly across the databases and tables. Its drawback is that expansion is troublesome, because it involves data migration: the hash of the existing data has to be recomputed and the data reassigned to different databases and tables.
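The two schemes can be compared with a small sketch. The shard naming, the use of plain modulo on integer ids as the "hash", and the helper names `range_shard`, `hash_shard`, and `moved_fraction` are simplifying assumptions made for illustration.

```python
from datetime import date


def range_shard(created: date, base: str = "orders") -> str:
    """Range scheme: one shard per month, chosen from the record's date."""
    return f"{base}_{created.year}{created.month:02d}"


def hash_shard(key: int, shard_count: int) -> int:
    """Hash scheme, simplified to modulo on an integer id."""
    return key % shard_count


def moved_fraction(keys, old_count: int, new_count: int) -> float:
    """Fraction of keys whose shard changes when the shard count changes."""
    keys = list(keys)
    moved = sum(1 for k in keys
                if hash_shard(k, old_count) != hash_shard(k, new_count))
    return moved / len(keys)


# Range: every record created this month hits the same shard (the hot spot),
# but adding next month's shard moves no existing data.
print(range_shard(date(2019, 7, 19)))

# Hash: load is spread evenly, but growing the cluster relocates existing rows.
print(moved_fraction(range(1000), 4, 8))  # doubling the shard count
print(moved_fraction(range(1000), 4, 5))  # adding a single shard
```

In this simplified model, doubling from 4 to 8 shards relocates half of the keys, while going from 4 to 5 relocates four fifths of them, which is why hash-based expansion is usually planned in multiples of the current shard count.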


Origin www.cnblogs.com/yanggb/p/11214339.html