Analysis of the principles behind horizontal database partitioning: splitting databases and tables

Chapter 1 Introduction
With the widespread adoption of Internet applications, storing and accessing massive amounts of data has become a bottleneck in system design. For a large-scale Internet application, billions of page views (PVs) per day put a very high load on the database, which causes serious problems for the stability and scalability of the system. Improving site performance by partitioning data, that is, scaling the data layer horizontally, has become the preferred approach for architects and developers. Common techniques for scaling the data layer include:
  • Horizontal database partitioning: reduces the load on each machine and limits the damage caused when a single machine goes down
  • Load balancing: reduces the access load on each machine and lowers the likelihood of downtime
  • Clustering: solves the single point of failure where the database becomes unreachable because one node is down
  • Read-write separation: maximizes read speed and read concurrency in the application

Chapter 2 Basic Principles and Concepts
What is Data Sharding?
The English word "shard" means a fragment. As a database term, it appears to have first been used in MMORPGs (massively multiplayer online role-playing games). Sharding is not a feature of any particular database product; it is an abstraction above the specific technical details. It is a horizontal scaling (scale-out) solution whose main purpose is to get past the I/O capacity limit of a single-node database server and so solve the database scalability problem. Data is distributed horizontally across different databases or tables according to a set of partitioning rules, and a query finds the specific database or table it needs through the corresponding DB routing or table routing rules. "Sharding" usually means horizontal partitioning, which is the focus of this article.

Take a simple example: the article table of a blog application, in which every row carries the user_id of the user it belongs to.


Faced with such a table, how do we partition it? How do we distribute its data across tables in different databases? One way is to put all articles with user_id 1~10000 into the article table in DB1, all articles with user_id 10001~20000 into the article table in DB2, and so on up to DBn. The article data is then spread naturally across the databases, which achieves the goal of data partitioning.
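The range rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the range width of 10000 comes from the example, while the function names and the article_id and title fields are hypothetical, since only user_id matters for the split.

```python
from collections import defaultdict

# Range width taken from the example: 1~10000 -> DB1, 10001~20000 -> DB2, ...
USERS_PER_DB = 10_000


def db_index(user_id: int) -> int:
    """Return the 1-based number of the database that owns this user_id."""
    if user_id < 1:
        raise ValueError("user_id must be a positive integer")
    return (user_id - 1) // USERS_PER_DB + 1


def partition_articles(articles):
    """Group article rows by the name of the database that should store them."""
    shards = defaultdict(list)
    for row in articles:
        shards[f"DB{db_index(row['user_id'])}"].append(row)
    return dict(shards)


# Hypothetical rows; article_id and title are made up for illustration.
articles = [
    {"article_id": 1, "user_id": 234, "title": "first post"},
    {"article_id": 2, "user_id": 10000, "title": "last user of DB1"},
    {"article_id": 3, "user_id": 10001, "title": "first user of DB2"},
    {"article_id": 4, "user_id": 12343, "title": "another post"},
]
placement = partition_articles(articles)
```

Subtracting 1 before the integer division keeps the boundary values on the right side: user_id 10000 still lands in DB1 and user_id 10001 is the first one in DB2.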

The next problem is how to find the specific database. The answer is simple: since user_id was the distinguishing field when the data was split across databases, database routing naturally depends on user_id as well. That is, once we know the user_id of a blog, we apply the partitioning rule in reverse to locate the specific database. For example, user_id 234 maps to DB1 under the rule above, and user_id 12343 maps to DB2. This process of using the partitioning rule to route back to a specific DB is called "DB routing".
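As a rough sketch of DB routing, the class below maps a user_id to a database name and sends reads and writes to that database only. It assumes the 10000-per-database rule from the example and uses in-memory SQLite connections as stand-ins for DB1 to DBn. The ShardRouter class, its method names, and the title column are hypothetical and not part of the original article.

```python
import sqlite3

# Range width from the example: user_id 1~10000 -> DB1, 10001~20000 -> DB2, ...
USERS_PER_DB = 10_000


class ShardRouter:
    """Routes each user_id to one of several databases (SQLite stand-ins)."""

    def __init__(self, db_count: int):
        # One independent in-memory database per shard; a real deployment
        # would hold connections to separate database servers instead.
        self.connections = {}
        for n in range(1, db_count + 1):
            conn = sqlite3.connect(":memory:")
            conn.execute("CREATE TABLE article (user_id INTEGER, title TEXT)")
            self.connections[f"DB{n}"] = conn

    def route(self, user_id: int) -> str:
        """DB routing: apply the partitioning rule to name the owning database."""
        name = f"DB{(user_id - 1) // USERS_PER_DB + 1}"
        if user_id < 1 or name not in self.connections:
            raise KeyError(f"no database configured for user_id {user_id}")
        return name

    def add_article(self, user_id: int, title: str) -> None:
        conn = self.connections[self.route(user_id)]
        conn.execute(
            "INSERT INTO article (user_id, title) VALUES (?, ?)", (user_id, title)
        )

    def articles_of(self, user_id: int):
        conn = self.connections[self.route(user_id)]
        rows = conn.execute(
            "SELECT title FROM article WHERE user_id = ?", (user_id,)
        )
        return [title for (title,) in rows]


router = ShardRouter(db_count=2)
router.add_article(234, "hello")    # stored in DB1
router.add_article(12343, "world")  # stored in DB2
```

Because every read and write goes through route(), the calling code never needs to know how many databases exist; only the router has to change if the partitioning rule does.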

Normally we design databases deliberately according to the normal forms. A design that takes data partitioning into account will violate those usual rules and constraints. To partition, the tables must carry redundant fields that serve as distinguishing fields, also called shard tag fields, such as user_id in the article example above. (That example does not illustrate the redundancy well, because user_id would be in the table even without partitioning, so we got it for free.) Redundant fields are not unique to partitioning; many large-scale applications need redundancy as well. That is a matter of designing efficient databases, which this article does not go into.

Reference: http://www.cnblogs.com/zhongxinWang/p/4262650.html

