MySQL tuning (4): partitioned tables


Preface

        This article describes how to use partitioned tables to optimize MySQL.


One, the principle of partitioned tables


        1. Partitioned table composition:
        A partitioned table is implemented by multiple related underlying tables, and each partition can be accessed directly. The storage engine manages each underlying table of a partition the same way it manages an ordinary table (all the underlying tables must use the same storage engine). An index on a partitioned table is simply an identical index added to each of the underlying tables.
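
        The composition described above can be observed with a minimal sketch like the following. The table part_demo, its columns, and the index name are assumptions made up for illustration; they are not from the original article. The index is declared once on the table, and information_schema.PARTITIONS lists one row per underlying partition.

```sql
-- Hypothetical demo table; all names are assumptions for illustration.
CREATE TABLE part_demo (
    id INT NOT NULL,
    user_id INT NOT NULL,
    KEY idx_user (user_id)   -- declared once, built inside every partition
) ENGINE = InnoDB
PARTITION BY HASH (id)
PARTITIONS 4;

-- One row is returned per partition of the table.
-- TABLE_ROWS is only an estimate for InnoDB.
SELECT PARTITION_NAME, PARTITION_METHOD, TABLE_ROWS
FROM information_schema.PARTITIONS
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME = 'part_demo';
```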

        2. Partitioned table operation logic:
        SELECT: When querying a partitioned table, the partition layer opens and locks all the underlying tables. The optimizer first determines whether some partitions can be filtered out, and then calls the corresponding storage engine interface to access the data in each remaining partition.
        INSERT: When writing a record, the partition layer opens and locks all the underlying tables, determines which partition should receive the record, and then writes the record to the corresponding underlying table.
        DELETE: When deleting a record, the partition layer opens and locks all the underlying tables, determines which partition holds the data, and then deletes the record from the corresponding underlying table.
        UPDATE: When updating a record, the partition layer opens and locks all the underlying tables. It first determines which partition holds the record to be updated, then fetches and updates the data, then determines which partition the updated data belongs in, and finally writes the record to that underlying table and deletes the original record from the underlying table where it was stored.

        Note: For all of these operations, if the WHERE condition matches the partition expression (that is, the partitioning condition), all partitions that cannot contain the record are filtered out, and no further operations are performed on them.
        If the storage engine is InnoDB, the corresponding table lock is released at the partition layer and replaced with row locks (because InnoDB implements row-level locking).
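
        The partition filtering mentioned in the note can be checked with EXPLAIN. The sketch below assumes a hypothetical table prune_demo (names are illustrative only). In MySQL 5.7 and later, plain EXPLAIN output includes a partitions column; older versions need EXPLAIN PARTITIONS instead.

```sql
-- Hypothetical demo table; names are assumptions for illustration.
CREATE TABLE prune_demo (
    id INT NOT NULL,
    user_id INT NOT NULL
) ENGINE = InnoDB
PARTITION BY RANGE (user_id) (
    PARTITION p0 VALUES LESS THAN (6),
    PARTITION p1 VALUES LESS THAN (11),
    PARTITION p2 VALUES LESS THAN (16),
    PARTITION p3 VALUES LESS THAN MAXVALUE
);

-- The WHERE condition is on the partitioning column, so the optimizer
-- can skip every partition that cannot contain user_id = 7.
EXPLAIN SELECT * FROM prune_demo WHERE user_id = 7;

-- No condition on user_id: no partition can be filtered out.
EXPLAIN SELECT * FROM prune_demo WHERE id = 7;
```

        With this layout, the first statement would be expected to list only p1 in the partitions column, while the second lists all four partitions.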

Two, partitioned table types

1. Range partition

        Rows are assigned to partitions according to which given range a column value falls into. Each partition contains the rows for which the partition expression lies within that partition's range. The ranges must be contiguous and must not overlap. They are defined with the VALUES LESS THAN operator.

CREATE TABLE useremp (
    id INT NOT NULL,
    name VARCHAR(30),
    creat_date DATE NOT NULL DEFAULT '1970-01-01',
    code INT NOT NULL,
    user_id INT NOT NULL
)
-- Create four partitions based on user_id. MAXVALUE represents a value
-- that is always greater than the largest possible integer value.
PARTITION BY RANGE (user_id) (
    PARTITION p0 VALUES LESS THAN (6),
    PARTITION p1 VALUES LESS THAN (11),
    PARTITION p2 VALUES LESS THAN (16),
    PARTITION p3 VALUES LESS THAN MAXVALUE
);
-- Alternative: partition by creat_date (year precision)
ALTER TABLE useremp
PARTITION BY RANGE ( YEAR(creat_date) ) (
    PARTITION p0 VALUES LESS THAN (1991),
    PARTITION p1 VALUES LESS THAN (1996),
    PARTITION p2 VALUES LESS THAN (2001),
    PARTITION p3 VALUES LESS THAN MAXVALUE
);
-- Alternative: partition by creat_date (day precision)
ALTER TABLE useremp
PARTITION BY RANGE COLUMNS(creat_date) (
    PARTITION p0 VALUES LESS THAN ('1960-01-01'),
    PARTITION p1 VALUES LESS THAN ('1970-01-01'),
    PARTITION p2 VALUES LESS THAN ('1980-01-01'),
    PARTITION p3 VALUES LESS THAN ('1990-01-01'),
    PARTITION p4 VALUES LESS THAN (MAXVALUE)
);

2. List partition

        List partitioning is similar to range partitioning, except that the partition is selected by matching a column value against a set of discrete values.

-- Repartition the useremp table above by an explicit list of user_id values
ALTER TABLE useremp
PARTITION BY LIST(user_id) (
    PARTITION p1 VALUES IN (3,5,6,9,17),
    PARTITION p2 VALUES IN (1,2,10,11,19,20),
    PARTITION p3 VALUES IN (4,12,13,14,18),
    PARTITION p4 VALUES IN (7,8,15,16)
);

3. Column partition

        Starting with MySQL 5.5, COLUMNS partitioning is supported. It can be regarded as an enhanced version of range partitioning and list partitioning. COLUMNS partitioning accepts only plain columns, not expressions.

-- RANGE COLUMNS compares the column tuple (col1, col3) against each bound
CREATE TABLE test_list (
    col1 int(11) DEFAULT NULL,
    col2 int(11) DEFAULT NULL,
    col3 char(20) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1
PARTITION BY RANGE COLUMNS(col1,col3)
(PARTITION p0 VALUES LESS THAN (10,'aaa') ENGINE = InnoDB,
 PARTITION p1 VALUES LESS THAN (20,'bbb') ENGINE = InnoDB);

4. Hash partition

        The partition is selected based on the return value of a user-defined expression, which is computed from the column values of the row being inserted into the table.

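CREATE TABLE useremp2 (
    id INT NOT NULL,
    name VARCHAR(30),
    code INT NOT NULL,
    creat_date DATE NOT NULL DEFAULT '1970-01-01',
    user_id INT NOT NULL
)
-- Take user_id modulo 4 to spread rows across four partitions
PARTITION BY HASH(user_id)
PARTITIONS 4;
-- Alternative: hash the year of creat_date into four partitions.
-- LINEAR HASH uses a powers-of-two algorithm instead of a plain modulus.
ALTER TABLE useremp2
PARTITION BY LINEAR HASH(YEAR(creat_date))
PARTITIONS 4;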

5. Key partition

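        Key partitioning is similar to hash partitioning. The difference is that key partitioning takes only a list of one or more columns rather than an expression, and the MySQL server supplies its own hash function. The columns do not have to contain integer values, because that hash function returns an integer regardless of the column data type.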
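
        The original gives no example for key partitioning, so the following is a minimal sketch with hypothetical tables key_demo and key_demo2 (all names are assumptions for illustration). It shows one non-integer column and then two columns used as the partitioning key.

```sql
-- Hypothetical demo tables; names are assumptions for illustration.
-- A VARCHAR column is acceptable because the server hashes the value itself.
CREATE TABLE key_demo (
    id INT NOT NULL,
    name VARCHAR(30) NOT NULL
) ENGINE = InnoDB
PARTITION BY KEY (name)
PARTITIONS 4;

-- Several columns can be listed as the partitioning key.
CREATE TABLE key_demo2 (
    id INT NOT NULL,
    name VARCHAR(30) NOT NULL
) ENGINE = InnoDB
PARTITION BY KEY (id, name)
PARTITIONS 4;
```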

6. Subpartition

        Subpartitioning divides each partition of an already partitioned table again, so the rows of a partition are stored in several subpartitions.

CREATE TABLE test_table
(
  id INT AUTO_INCREMENT,
  name VARCHAR(10) NOT NULL,
  status INT(2) NOT NULL,
  PRIMARY KEY (id, status)
)  ENGINE = INNODB
PARTITION BY RANGE(id)
SUBPARTITION BY HASH(status) SUBPARTITIONS 2
(
PARTITION p0 VALUES LESS THAN(5),
PARTITION p1 VALUES LESS THAN(10),
PARTITION p2 VALUES LESS THAN(15)
);

Three, partitioned table restrictions

        1. A table can have at most 1024 partitions; starting with MySQL 5.6.7 the limit is 8192 partitions.
        2. In older versions, the partition expression must be an integer or an expression that returns an integer. Starting with 5.5, columns can be used directly for partitioning (COLUMNS partitioning).
        3. If the table has a primary key or unique indexes, every column used in the partition expression must be part of the primary key and of every unique index. This can be solved by using a composite primary key or by changing the primary key into an ordinary index.
        4. Partitioned tables cannot use foreign key constraints.
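
        Restriction 3 and its two workarounds can be sketched as follows. The tables pk_demo_bad, pk_demo, and pk_demo_plain are hypothetical names introduced only for illustration. The rejected statement is left as a comment so that the block can run as a whole.

```sql
-- Would be rejected: user_id is used for partitioning
-- but is not part of the primary key.
-- CREATE TABLE pk_demo_bad (
--     id INT NOT NULL,
--     user_id INT NOT NULL,
--     PRIMARY KEY (id)
-- ) PARTITION BY HASH (user_id) PARTITIONS 4;

-- Accepted: the composite primary key includes the partitioning column.
CREATE TABLE pk_demo (
    id INT NOT NULL,
    user_id INT NOT NULL,
    PRIMARY KEY (id, user_id)
) ENGINE = InnoDB
PARTITION BY HASH (user_id)
PARTITIONS 4;

-- Also accepted: no unique key at all, only an ordinary index on id.
CREATE TABLE pk_demo_plain (
    id INT NOT NULL,
    user_id INT NOT NULL,
    KEY idx_id (id)
) ENGINE = InnoDB
PARTITION BY HASH (user_id)
PARTITIONS 4;
```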

Four, application scenarios and advantages of partitioned tables

        Advantages:
        1. The data of a partitioned table can be distributed across different physical devices, making efficient use of multiple hardware devices.
        2. The data of a partitioned table is easier to maintain: to delete a large amount of data in one batch, you can clear an entire partition, and you can also optimize, check, and repair a single partition independently.
        3. Individual partitions can be backed up and restored independently.
        4. Partitioned tables can be used to avoid certain bottlenecks, such as mutually exclusive access to a single InnoDB index.
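
        The maintenance operations from advantage 2 map onto ALTER TABLE partition clauses. A minimal sketch, assuming a hypothetical table maint_demo (names and year boundaries are made up for illustration); how much work the per-partition OPTIMIZE, CHECK, and REPAIR statements actually do depends on the storage engine.

```sql
-- Hypothetical demo table; names and boundaries are assumptions.
CREATE TABLE maint_demo (
    id INT NOT NULL,
    created_year INT NOT NULL
) ENGINE = InnoDB
PARTITION BY RANGE (created_year) (
    PARTITION p2019 VALUES LESS THAN (2020),
    PARTITION p2020 VALUES LESS THAN (2021),
    PARTITION pmax  VALUES LESS THAN MAXVALUE
);

-- Remove all rows of one partition without a row-by-row DELETE.
ALTER TABLE maint_demo TRUNCATE PARTITION p2019;

-- Drop the partition definition together with its data.
ALTER TABLE maint_demo DROP PARTITION p2019;

-- Maintain a single partition instead of the whole table.
ALTER TABLE maint_demo CHECK PARTITION p2020;
ALTER TABLE maint_demo OPTIMIZE PARTITION p2020;
ALTER TABLE maint_demo REPAIR PARTITION p2020;
```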


        Application scenarios:
        1. The table is so large that it cannot fit entirely in memory, or only the last part of the table holds hot data and everything else is historical data.
        When the amount of data in the table is huge, scanning the whole table for every query is clearly not feasible. Indexes are also of limited use here because they cost space and maintenance: even when an index is used, a large amount of fragmentation and random I/O is produced, so at very large data volumes an index is no longer clearly effective. This is when a partitioned table can help.
        Store the table with a simple partitioning scheme and no indexes, locate the required data roughly through the partitioning rules, and use the WHERE condition to limit the required data to a small number of partitions. This approach suits workloads that access large amounts of data in a normal way.
        If the data has hotspots and the rest of the data is rarely accessed, the hot data can be placed in a separate partition so that this partition has a chance of being cached in memory. Queries then only need to access a small partition and can use indexes and the cache effectively.
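
        The hot-data layout can be sketched as follows, assuming a hypothetical table access_log_demo. The table name, columns, and date boundaries are assumptions for illustration only. Historical rows sit in yearly partitions, recent rows in a small p_hot partition that is split again once it grows old.

```sql
-- Hypothetical demo table; names and dates are assumptions.
CREATE TABLE access_log_demo (
    id BIGINT NOT NULL,
    created DATE NOT NULL,
    payload VARCHAR(100)
) ENGINE = InnoDB
PARTITION BY RANGE COLUMNS (created) (
    PARTITION p_hist_2022 VALUES LESS THAN ('2023-01-01'),
    PARTITION p_hist_2023 VALUES LESS THAN ('2024-01-01'),
    PARTITION p_hot       VALUES LESS THAN (MAXVALUE)
);

-- Restricting on the partitioning column keeps the query inside p_hot.
SELECT COUNT(*) FROM access_log_demo WHERE created >= '2024-01-01';

-- When the hot range ages, split it so recent rows stay in a small partition.
ALTER TABLE access_log_demo REORGANIZE PARTITION p_hot INTO (
    PARTITION p_hist_2024 VALUES LESS THAN ('2025-01-01'),
    PARTITION p_hot       VALUES LESS THAN (MAXVALUE)
);
```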

Five, points to note

        1. NULL values can defeat partition filtering, so make sure the column used in the partitioning condition contains no NULL values.
        2. If the partitioning column and the indexed column do not match, queries that filter only on the indexed column cannot perform partition filtering.
        3. Opening and locking all the underlying tables can be expensive, and the cost of maintaining partitions also needs to be considered.
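
        Point 2 can be illustrated with a minimal sketch, assuming a hypothetical table mismatch_demo that is partitioned on user_id but indexed on code (all names are made up for illustration).

```sql
-- Hypothetical demo table; names are assumptions for illustration.
CREATE TABLE mismatch_demo (
    id INT NOT NULL,
    user_id INT NOT NULL,
    code INT NOT NULL,
    KEY idx_code (code)
) ENGINE = InnoDB
PARTITION BY HASH (user_id)
PARTITIONS 4;

-- Filters only on the indexed column: idx_code has to be probed
-- in every partition, so no partition is filtered out.
EXPLAIN SELECT * FROM mismatch_demo WHERE code = 42;

-- Adding a condition on the partitioning column lets the optimizer
-- filter partitions before the index is used.
EXPLAIN SELECT * FROM mismatch_demo WHERE code = 42 AND user_id = 7;
```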

Summary

        Use partitioned tables in appropriate scenarios, and choose the partitioning columns according to the specific business scenario. Because of maintenance costs and other issues, do not use partitioned tables when they are not necessary.

Origin blog.csdn.net/weixin_49442658/article/details/112576004