Small research - design and implementation of MySQL partition table (1)

With the rapid development of information technology, data volumes keep growing, and queries against massive tables consume a great deal of time, which has become the main factor limiting database access performance. To improve query efficiency and the user experience of database operations, an optimized partition table algorithm is proposed for the relational database management system MySQL, based on range partitioning and Merge storage. Experiments show that the optimized algorithm significantly improves the efficiency of query operations on tables with large amounts of data.

Table of contents

1 The purpose of partitioning and table partitioning and the partitioning method of MySQL

1.1 The purpose of partition table

1.2 MySQL partitioning methods and advantages

2 Merge storage engine

2.1 Operation of Merge storage engine

2.2 Advantages of the Merge storage engine

3 Use range partition to implement MySQL partition design

3.1 Creation of employee table and record addition

3.2 Modify the partition statement


With the rapid development of information technology, the tables in databases keep growing, and big data applications are becoming the mainstream of software applications. If tables are not split, the overhead of queries and other data operations keeps increasing, and the data volume and processing capacity that the database can carry hit a bottleneck, which eventually leads to a steady decline in key indicators such as system response time and throughput.

MySQL is one of the most widely used relational database management systems. It offers high performance, easy deployment, ease of use, and convenient data storage. It can handle large databases with tens of millions of records and is widely used in application systems. MySQL's table splitting techniques can split a large table, optimizing database performance and improving query efficiency through splitting and logical segmentation. If table queries are frequent, the design of the split tables directly affects the application performance of the system and the service quality of the network. In this paper, an optimized partition table algorithm based on range partitioning and Merge storage in MySQL is proposed to improve efficiency when users query massive data, so that users get a good experience.

1 The purpose of partitioning and table partitioning and the partitioning method of MySQL

1.1 The purpose of partition table

When a database table holds too much data, it faces the following problems: data operations slow down, because operations such as select, join, update, and delete are performed on the entire table; and storage becomes inconvenient, because a single disk may not have enough space to hold the table. Partitioning the table reduces the size of the data files and improves disk read and write performance, which solves these problems to a certain extent.

When designing the database for a system, if a table exceeds several million rows, the time spent on a single query increases; a join query may even bring the database down. The purpose of splitting tables is to reduce the load on the database and shorten query time.

1.2 MySQL partitioning methods and advantages

MySQL provides a variety of partition methods, the common ones are:

① range partition: assigns rows to partitions based on given contiguous ranges of values, for example partitioning by product number. When creating a table, use the partition by range clause to set this method;

② list partition: assigns rows to partitions according to discrete lists of values, for example partitioning by order status. When creating a table, use the partition by list clause to set this method;

③ hash partition: distributes rows evenly across multiple partitions according to a hash of the data, which can improve query efficiency and load balancing. When creating a table, use the partition by hash clause to set this method;

④ combined partition (subpartitioning): further divides each partition of a range-partitioned or list-partitioned table. In MySQL the subpartitions themselves must use hash or key, so a typical design partitions by date range first and then hashes within each range. When creating a table, use the partition by range or partition by list clause followed by a subpartition by hash or subpartition by key clause to set the combined method.
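As a rough sketch of how the list, hash, and combined methods look in a create table statement (range partitioning is worked through in section 3). The table and column names below are hypothetical and chosen only for illustration; they do not come from the article.

```sql
-- Hypothetical tables, for illustration only.

-- list partition: each partition holds an explicit set of values
create table orders_by_status (
    order_id int not null,
    status   int not null   -- e.g. 0 = new, 1 = paid, 2 = shipped, 3 = closed
)
partition by list (status) (
    partition p_open   values in (0, 1),
    partition p_closed values in (2, 3)
);

-- hash partition: rows are spread over 4 partitions by order_id
create table orders_by_hash (
    order_id int not null,
    amount   decimal(10,2)
)
partition by hash (order_id)
partitions 4;

-- combined partition: range on the year, then hash subpartitions
create table orders_by_year (
    order_id   int  not null,
    order_date date not null
)
partition by range (year(order_date))
subpartition by hash (to_days(order_date))
subpartitions 2 (
    partition p2022 values less than (2023),
    partition p2023 values less than (2024),
    partition pmax  values less than maxvalue
);
```

None of these sketch tables declares a primary key; in MySQL, every column used in the partitioning expression has to be part of each unique key on the table, which is worth keeping in mind when adapting them.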

The main advantages of using MySQL partitions are: a partitioned table can store more data than fits on a single disk or file system partition; when the where clause contains the partitioning condition, only the one or more partitions that are actually needed have to be scanned, which improves query efficiency; queries involving aggregate functions such as sum() and count() lend themselves to being processed partition by partition, with only the partial results needing to be combined at the end; data that has expired or no longer needs to be kept can be removed quickly by dropping the partitions that hold it; and spreading data queries across multiple disks achieves greater query throughput.
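Two of these advantages, scanning only the necessary partitions and quickly deleting expired data, can be sketched as follows. The sales_log table is a hypothetical example, not part of the article, and the statements assume a reasonably recent MySQL version in which explain reports a partitions column (older versions used explain partitions for this).

```sql
-- Hypothetical table partitioned by year, for illustration only.
create table sales_log (
    id      int  not null,
    sold_on date not null,
    amount  decimal(10,2)
)
partition by range (year(sold_on)) (
    partition p2021 values less than (2022),
    partition p2022 values less than (2023),
    partition p2023 values less than (2024)
);

-- The where clause contains the partitioning column, so the optimizer
-- can skip partitions that cannot hold matching rows; the partitions
-- column of the explain output lists the ones it intends to read.
explain
select sum(amount)
from sales_log
where sold_on between '2023-01-01' and '2023-12-31';

-- Expired data: dropping the partition removes all of its rows at once,
-- without a row-by-row delete.
alter table sales_log drop partition p2021;
```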

2 Merge storage engine

MySQL ships with many different storage engines so that various kinds of data can be handled flexibly; for example, when working with large amounts of temporary data, an in-memory storage engine can hold all of the table data. The Merge storage engine combines a number of MyISAM tables into a single logical table, which is very useful for large-scale data storage.

2.1 Operation of Merge storage engine

The MyISAM tables underlying a Merge table must have exactly the same structure, with indexes defined in the same order and in the same way. The Merge table itself holds no data; inserting into, updating, deleting from, and querying a Merge table actually operates on the underlying MyISAM tables. Dropping the Merge table removes only the Merge table definition and has no effect on the underlying MyISAM tables.

Inserts into a Merge table are controlled by the insert_method clause, which can take three values: first directs inserts to the first underlying table; last directs inserts to the last underlying table; no, or omitting the clause altogether, means that rows cannot be inserted through the Merge table.
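The behaviour described above can be sketched with two monthly log tables. The table names are hypothetical, and the sketch assumes a server on which the MyISAM and Merge engines are available.

```sql
-- Hypothetical monthly log tables, for illustration only.
create table log_2023_01 (
    id      int not null,
    message varchar(100),
    index (id)
) engine = myisam;

-- like copies the columns, the index, and the MyISAM engine
create table log_2023_02 like log_2023_01;

-- The Merge table repeats the same column and index definitions,
-- names the underlying tables in union, and sends inserts to the
-- last table in that list.
create table log_all (
    id      int not null,
    message varchar(100),
    index (id)
) engine = merge union = (log_2023_01, log_2023_02) insert_method = last;

insert into log_all values (1, 'written through the Merge table');

select * from log_all;       -- reads from both underlying tables
select * from log_2023_02;   -- the inserted row is stored here

drop table log_all;          -- removes only the Merge definition
```

With insert_method = first the row would go to log_2023_01 instead, and with insert_method = no the insert through log_all would be rejected.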

2.2 Advantages of the Merge storage engine

The advantages of the Merge storage engine are mainly reflected in the following aspects: queries can be faster than against one large table; multiple tables can be referenced without issuing multiple queries, since all of the data can be found by querying the Merge table; it suits log data, where the data for different months is stored in different tables, the older tables are compressed with the myisampack tool to save space, and queries through the Merge table still work normally; maintaining and repairing a single small table is easier than repairing one large table; and mapping multiple underlying tables to one overall table is very fast, because the Merge table itself does not store or maintain any indexes (each underlying table stores and maintains its own), so creating and remapping a Merge table takes very little time.

3 Use range partition to implement MySQL partition design

In system design, as the amount of data gradually increases, the efficiency of querying data decreases. Adopting the range partition algorithm speeds up data access, so that the required data can be found quickly. A range-partitioned table divides the data into different areas using the values less than operator. A query then does not need to read the whole table but only a certain area, which greatly narrows the search range and improves query efficiency, and the strengthened data processing capacity relieves the bottleneck of querying massive data. The following uses an employee query as an example to illustrate the implementation of the range partition algorithm.

3.1 Creation of employee table and record addition

First, create the employee table.

        -- employee table, range-partitioned on store_id
        create table employees_new (
            id int not null,
            fname varchar(30),
            lname varchar(30),
            hired date not null default '1973-01-01',
            separated date not null default '9999-12-31',
            job_code int not null default 0,
            store_id int not null default 0
        )
        partition by range (store_id) (
            partition p0 values less than (6),    -- store_id below 6 (stores 1 to 5)
            partition p1 values less than (11),   -- store_id 6 to 10
            partition p2 values less than (16),   -- store_id 11 to 15
            partition p3 values less than (21)    -- store_id 16 to 20
        );

Next, insert 7 records into employees_new.

        insert into employees_new (id, fname, lname, hired, store_id) values (1, '张三丰', '张', '2020-06-04', 1);
        insert into employees_new (id, fname, lname, hired, store_id) values (2, '李思思', '李', '2019-07-01', 5);
        insert into employees_new (id, fname, lname, hired, store_id) values (3, '王墨海', '王', '2018-12-14', 10);
        insert into employees_new (id, fname, lname, hired, store_id) values (4, '赵家琪', '赵', '2021-06-06', 15);
        insert into employees_new (id, fname, lname, hired, store_id) values (5, '田草草', '田', '2022-01-20', 20);
        insert into employees_new (id, fname, lname, hired, store_id) values (6, '范小宣', '范', '2023-03-06', 9);
        insert into employees_new (id, fname, lname, hired, store_id) values (7, '刘振国', '刘', '2022-03-20', 20);

After adding, the query result is shown in Figure 1.

3.2 Modify the partition statement

Under this range partition scheme, all rows for employees with store_id 1 to 5 are stored in partition p0, those with store_id 6 to 10 in p1, and so on. Note that the partitions are defined in order, from lowest to highest, as the partition by range syntax requires. Adding a row with store_id ≥ 21 causes an error, because no partition rule covers such a row and the server does not know where to store it.
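The failure can be reproduced with a single insert. This is a sketch that assumes employees_new still has only the partitions p0 to p3 defined in section 3.1; the exact wording of the error may differ between MySQL versions.

```sql
-- store_id 25 is not below any of the defined upper bounds (6, 11, 16, 21)
insert into employees_new (id, fname, lname, hired, store_id)
values (8, '岳晴', '岳', '2023-02-10', 25);
-- expected to be rejected with an error along the lines of:
--   ERROR 1526 (HY000): Table has no partition for value 25
```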

To handle store_id ≥ 21, use a values less than maxvalue clause when defining the partitions; it catches all values greater than the highest value specified explicitly. maxvalue represents a value that is always greater than the largest possible integer value. Therefore, adding a p4 partition to hold all rows with store_id ≥ 21 and then executing the insert statement solves the problem. The SQL statements are as follows.

        -- add a catch-all partition for every store_id of 21 or more
        alter table employees_new add partition (partition p4 values less than maxvalue);
        insert into employees_new (id, fname, lname, hired, store_id) values (8, '岳晴', '岳', '2023-02-10', 25);

Now you can see the employee query result after adding a record, as shown in Figure 2.
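Besides querying the table itself, the partition layout can be inspected through the information_schema.partitions view. The following is a minimal sketch; it assumes employees_new lives in the currently selected database, and the table_rows figure may be only an estimate, depending on the storage engine.

```sql
select partition_name,
       partition_ordinal_position,
       partition_description,   -- the values less than bound of each partition
       table_rows               -- row count, possibly approximate
from information_schema.partitions
where table_schema = database()
  and table_name = 'employees_new'
order by partition_ordinal_position;
```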

3.3 Query the records in the second partition

To query which records are in the second partition (p1, which holds store_id 6 to 10), the SQL statement is as follows.

        select * from employees_new where store_id between 6 and 10;

The query result is shown in Figure 3.
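As an alternative sketch, MySQL 5.6 and later also allow a partition to be named directly in the from clause, and explain can be used to check which partitions the range condition touches. Both statements assume the employees_new table and sample rows from section 3.1.

```sql
-- read partition p1 directly, without a where clause
select * from employees_new partition (p1);

-- check which partitions the range condition needs;
-- see the partitions column of the output
explain select * from employees_new where store_id between 6 and 10;
```

With the sample rows inserted earlier, both select forms should return the employees with id 3 and id 6, whose store_id values are 10 and 9.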

Origin blog.csdn.net/Dream_Weave/article/details/132134558