The evolution of database architecture

I have recently read many articles on how companies evolved their architecture, and found that the basic ideas and the evolution paths are very similar. Here I summarize the evolution of database architecture and the ideas behind it.

Single host

In the beginning, a website typically grows out of the classic LAMP architecture: a Linux host, an Apache server, a PHP execution environment and a MySQL server. These usually all run on one virtual host, which is called single host mode here.

Disadvantages of single host mode:

1 The web server and the MySQL server share the same host and its hardware resources. Either one may consume too many resources and become a bottleneck for the entire application.

2 When the business grows, there is no way to scale horizontally.

3 Fault tolerance is poor: once the host has a problem, the entire application is unavailable.

Dedicated host

As the business grows, the MySQL server and the web server can be moved to separate hosts and deployed independently. This is the dedicated host mode.

In dedicated host mode, the web server and MySQL no longer share hardware resources. Not putting all your eggs in one basket increases fault tolerance: if only the MySQL server fails, the parts of the web application that do not access the database are not affected. Moreover, the web server can be scaled horizontally. If one web server is not fast enough, more web servers can be added behind a load balancer to spread the load.

Disadvantages of dedicated host mode:

1 Scalability: Although the web server can scale horizontally, the MySQL server cannot.

2 Availability: The MySQL server is a single point of failure. Once it goes down, the business is severely affected.

3 Performance: The load that a single MySQL server can support is limited.

Read-write separation

As the business keeps growing, the pressure on the database increases, and a single database gradually fails to meet the demand. Websites that do not need strictly real-time data tend to move to a read-write separation mode: ordinary query requests are sent to a read database (or standby database), and modification requests are handled by the main database. Because every read database holds the same copy of the data, read databases can be scaled horizontally. The write database, however, can only be a single host.

This model has limitations and should be evaluated against the type of business. The data in the main database is always up to date, but synchronizing it to the read databases takes time, so the application must be able to tolerate short-term inconsistency. The model is not suitable for scenarios with very high consistency requirements.
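The routing rule described above can be sketched as follows. This is a minimal illustration, not a real driver or middleware: the class name `ReadWriteRouter`, the server names, the `require_fresh` flag and the rule of detecting reads by statement prefix are all assumptions made for this example.

```python
import itertools


class ReadWriteRouter:
    """Route statements to the main database or to a read database (hypothetical names)."""

    # Statement prefixes treated as read-only; an assumption for this sketch.
    READ_PREFIXES = ("SELECT", "SHOW")

    def __init__(self, primary, replicas):
        self.primary = primary
        # Round-robin over the read databases; fall back to the main one if there are none.
        self._replicas = itertools.cycle(replicas or [primary])

    def route(self, sql, require_fresh=False):
        """Return the name of the server that should run this statement."""
        is_read = sql.lstrip().upper().startswith(self.READ_PREFIXES)
        if is_read and not require_fresh:
            return next(self._replicas)
        # Writes, and reads that cannot tolerate replication delay, go to the main database.
        return self.primary


router = ReadWriteRouter("primary", ["replica-1", "replica-2"])
print(router.route("SELECT * FROM orders WHERE id = 1"))
print(router.route("UPDATE orders SET status = 'paid' WHERE id = 1"))
print(router.route("SELECT balance FROM accounts WHERE id = 7", require_fresh=True))
```

The `require_fresh` flag is one possible way to handle the consistency limitation: a query that cannot tolerate replication delay is sent to the main database, at the cost of adding read load to it.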

Problems with this pattern:

1 Scalability: Although the read databases can scale horizontally, the write database cannot, so write capacity is still limited to one host.

2 Availability: The write database is a single point of failure. Once it fails, all write operations are affected.

Business vertical split

As the business grows, a single write database clearly cannot handle high concurrency. But the write database is stateful and cannot simply be scaled horizontally: if there were two write databases and each update went to one of them at random, the other would be left with stale data, and having two different versions of the same data is clearly unacceptable. For the write database, you can instead consider a vertical split by business, giving each business its own database. This article is about database architecture, but the web layer can be split vertically by business in the same way.

After a vertical split by business, the performance of the system improves greatly. All that is required is that the business can be divided vertically. The finer the division, the stronger the overall scalability of the system.

In this mode, there are the following problems:
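A vertical split by business amounts to a fixed lookup from business domain to database. The sketch below assumes three hypothetical domains and database names; none of them come from the original article.

```python
# Hypothetical mapping from business domain to the database that owns its tables.
DATABASE_BY_BUSINESS = {
    "user": "user_db",
    "product": "product_db",
    "transaction": "transaction_db",
}


def database_for(business):
    """Return the database that holds all tables of the given business domain."""
    try:
        return DATABASE_BY_BUSINESS[business]
    except KeyError:
        raise ValueError(f"no database configured for business {business!r}") from None


print(database_for("transaction"))
```

Because the mapping is static, each database can be sized and tuned for its own business, but a process that spans several domains now depends on several databases.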

1 Availability: Assume that a complete business process P accesses data that has been split into five databases, A, B, C, D and E. If the availability of each write database is 99%, the availability of business process P is 99% × 99% × 99% × 99% × 99% ≈ 95%. The more databases the data is split into, the greater the challenge to the overall availability of the system.

2 Performance: The load on each vertical business database may differ. If the transaction database carries a high load, a single transaction write database may not be able to meet the demand, and the transaction database becomes the bottleneck of the entire system.

3 Scalability: The scalability of a single node is not improved, and the transaction database cannot be scaled on its own.
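The availability arithmetic in the first point can be checked with a few lines of code. This sketch assumes that database failures are independent, so that the individual availabilities multiply; the function name is made up for the example.

```python
import math


def process_availability(database_availabilities):
    """Availability of a process that needs every listed database to be up.

    Assumes failures are independent, so the individual availabilities multiply.
    """
    return math.prod(database_availabilities)


# Business process P touches five databases (A to E), each 99% available.
print(f"{process_availability([0.99] * 5):.4f}")  # 0.9510
```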

Horizontal and vertical splitting of a single business database

In the previous case, the transaction database was assumed to be the bottleneck of the whole system, so it needs to be scaled on its own. The transaction database can be split horizontally or vertically, and both methods can be applied at the same time.

Horizontal splitting generally partitions the data by a business-related key. It scales well horizontally, but makes querying considerably harder.

Vertical splitting generally partitions the data by business. It is relatively friendly to queries, but may lead to unevenly distributed data and is less flexible.

Taking the transaction database as an example, you can first split it vertically by transaction type, and then split each part horizontally by order number.

Assuming the data is divided into M×N databases, the failure of a single database affects only 1/(M×N) of the transactions. However, if the availability of each database is 99% and failures are independent, the probability that all of them are available at the same time is (99%) to the power of M×N. The more the database is split, the higher the probability that some single database has failed.
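The two-level split can be sketched as a small routing function. The transaction types, the number of databases per type, the naming scheme and the use of a modulo on the order number are all assumptions for illustration; the article does not name a specific partition function.

```python
TRANSACTION_TYPES = ("payment", "refund", "transfer")  # M = 3, hypothetical
SHARDS_PER_TYPE = 4  # N = 4, hypothetical


def shard_for(transaction_type, order_number):
    """Pick a database: vertical split by type, then horizontal split by order number."""
    if transaction_type not in TRANSACTION_TYPES:
        raise ValueError(f"unknown transaction type {transaction_type!r}")
    # Horizontal split by modulo; an assumption, not the article's stated algorithm.
    index = order_number % SHARDS_PER_TYPE
    return f"trade_{transaction_type}_{index}"


print(shard_for("payment", 1234567))
```

A query that knows both the transaction type and the order number touches exactly one database. A query that knows neither has to visit all M×N databases, which is why querying becomes harder after the split.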
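To make the trade-off concrete, the sketch below computes both quantities for given M and N. It assumes that transactions are spread evenly over the M×N databases and that failures are independent; the function name and the sample values M = 3, N = 4 are made up for the example.

```python
def split_metrics(m, n, per_db_availability=0.99):
    """Return (share of transactions hit by one failure, chance that all databases are up)."""
    databases = m * n
    # Even distribution assumed: each database holds 1/(M*N) of the transactions.
    blast_radius = 1 / databases
    # Independent failures assumed: availabilities multiply.
    all_up = per_db_availability ** databases
    return blast_radius, all_up


blast_radius, all_up = split_metrics(3, 4)
print(f"one failure affects {blast_radius:.1%} of transactions")
print(f"all 12 databases up: {all_up:.1%}")
```

With these sample values a single failure touches about one twelfth of the transactions, while the chance that nothing is broken at a given moment drops below 90%.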

Problems with this approach:

1 Although a single node failure affects few users, overall availability is reduced.

2 Database management becomes complex. If the table structure of the transaction database changes, the change script has to be executed M×N times.

3 Because the probability that some database has failed is high, the DBAs will have a hard time and will probably be firefighting often.

4 Development and testing become harder and more expensive, and queries become very complex.

5 If a single node fails, there is no failure detection and switching mechanism.

6 The sub-databases cannot be extended indefinitely in the horizontal direction. The algorithm allocates M databases in advance, and adding a database afterwards is basically infeasible.
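The sixth point can be illustrated by counting how many rows would have to move when one database is added. The sketch assumes the rows are assigned by taking the key modulo the number of databases, which is a common choice but is not stated in the article.

```python
def moved_fraction(keys, old_count, new_count):
    """Fraction of keys whose database changes when the database count changes."""
    keys = list(keys)
    # A key moves if modulo assignment gives a different database after the change.
    moved = sum(1 for key in keys if key % old_count != key % new_count)
    return moved / len(keys)


# Going from 4 to 5 databases with modulo assignment.
print(moved_fraction(range(100_000), 4, 5))  # 0.8
```

Under this assumption, adding a fifth database forces 80% of the existing rows to be migrated, which is why the number of databases is in practice fixed in advance.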

Random sub-database

For the sixth problem, unlimited expansion in the horizontal direction, one possible mechanism is the following: when inserting data, apply for a database number, and then either save the database number as a separate field or embed it in an existing field.

For example, suppose we apply for a database for an insert and get database number 1000. We can then construct the order number 1000_tradeno, with the sub-database number in front and the actual tradeno behind it. This solves the problem of unlimited horizontal expansion, and is the random sub-database mode. However, this method has great limitations.
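The order number scheme can be sketched in a few lines. How a database number is "applied for" is not specified in the article; the random choice from a list of currently available databases below is an assumption, as are the function names.

```python
import random


def build_order_number(trade_no, database_numbers, rng=random):
    """Pick a database for a new row and embed its number in the order number."""
    # Assumption: "applying for a database number" is a random pick from the known databases.
    database_number = rng.choice(database_numbers)
    return f"{database_number}_{trade_no}"


def database_number_of(order_number):
    """Recover the database number from the prefix of an order number."""
    prefix, _, _ = order_number.partition("_")
    return int(prefix)


order_number = build_order_number("tradeno", [1000, 1001, 1002])
print(order_number, "->", database_number_of(order_number))
```

Adding a database only means adding its number to the list used for new inserts. Existing rows stay where they are, because each order number already records which database holds it.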

Disadvantages of random sub-database:

1 The sub-database algorithm is coupled to the business, so it only suits specific scenarios and has a narrow scope of application.

2 Inserts are relatively easy, but an update must know the sub-database number. In other words, data can only be updated through the specific fields that carry that number.

3 It is not suitable for batch queries, and query capabilities are relatively limited. This is a problem that any sub-database scheme brings.

Single database backup and failover

If a single database fails, the business is affected, so the question is whether traffic can be switched to a backup when the failure occurs. This can be achieved, but it comes with its own problems and has to be analyzed for each specific scenario. The topic is complicated enough to deserve a separate article.

The above is a summary of the evolution of database architecture. This evolution requires the support of many basic capabilities, including:

1 Powerful distributed database middleware, which mainly hides the underlying database routing and data management from the application.

2 A strong data operations team and a monitoring system that can detect the database status of every node.

3 A strong database administration team that can maintain such a database cluster.

4 Strong business architecture and technical architecture capabilities, in order to keep such complex business scenarios under control.

Reposted from: http://www.cnblogs.com/aigongsi/archive/2012/11/23/2784773.html