The difference between partitioning and sub-tables in MySQL

1. The difference between partitions and sub-tables

Partitioning and table splitting (sub-tables) are two techniques for handling large volumes of data. Both aim to improve performance and make data easier to manage, but they differ in how they are implemented and in the scenarios they fit.

1. Partition

Partitioning divides one large table into multiple smaller physical pieces, each called a partition, while the application still sees a single logical table. MySQL assigns rows to partitions by range, list, hash, or key of a partitioning expression. Partitioning can improve query performance when a query touches only some of the partitions, keeps the indexes of each partition smaller, and allows partitions to be maintained individually.

Partitioning suits large tables that are queried frequently, especially when queries filter on a time range, as with log tables and transaction tables. It can also simplify data maintenance and backup operations, because individual partitions can be handled separately.

2. Sub-table

Table splitting divides a large table into multiple independent tables that share the same structure, each storing part of the data. Because every table is smaller, queries and maintenance on a single table are more efficient. The split follows a rule on the data, for example by region or by category.

Table splitting suits scenarios where the data volume is huge and horizontal scaling is needed. It reduces the load on any single table and speeds up queries against it. Note, however, that the application must route each request to the right table, and queries that span several tables have to merge the results themselves.
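As a minimal sketch of the idea above, the following splits a hypothetical orders table into two sub-tables by user_id modulo 2. The table names (orders_0, orders_1), the columns, and the routing rule are assumptions for illustration; MySQL does not apply the rule for you, so the application has to pick the table.

```sql
-- Two independent tables with the same structure (hypothetical names).
CREATE TABLE orders_0 (
    order_id   BIGINT NOT NULL PRIMARY KEY,
    user_id    BIGINT NOT NULL,
    amount     DECIMAL(10,2) NOT NULL,
    created_at DATETIME NOT NULL
);
CREATE TABLE orders_1 LIKE orders_0;

-- Assumed routing rule: table suffix = user_id % 2.
-- user_id 7 gives 7 % 2 = 1, so the application writes to orders_1.
INSERT INTO orders_1 (order_id, user_id, amount, created_at)
VALUES (1001, 7, 19.90, '2024-01-15 10:00:00');

-- A single-user lookup touches only one sub-table.
SELECT order_id, amount FROM orders_1 WHERE user_id = 7;

-- A query over all users must merge the sub-tables explicitly.
SELECT COUNT(*) AS order_count, SUM(amount) AS total_amount
FROM (
    SELECT amount FROM orders_0
    UNION ALL
    SELECT amount FROM orders_1
) AS all_orders;
```

The last query shows the cost mentioned above: every cross-table question becomes a UNION ALL (or several queries merged in application code), and the list of sub-tables has to be kept in sync with the routing rule.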

The following table compares partitions and sub-tables:

Aspect | Partitioning | Sub-tables
Definition | Splits one large table into multiple partitions that still form a single logical table | Splits one large table into multiple independent tables
Data storage | Rows are stored in different partitions according to the partitioning rule | Rows are distributed into different tables according to the splitting rule
Data management | Operate on the table as a whole; MySQL handles the partition details | Operate on individual tables; cross-table queries and result merging are handled by the application
Query performance | Improves when a query reads only the relevant partitions | Relatively high, because each table is small
Index size | Each partition has its own indexes, so each index is relatively small | Each table has its own indexes, which are also small, but no index spans all tables
Data maintenance | Relatively simple; partitions can be maintained and optimized individually | Cross-table operations are required and the complexity is high
Applicable scenarios | Large amount of data, frequent queries, queries based on a time range | Huge amount of data and a need for horizontal scaling

2. Partition syntax and cases in MySQL

MySQL supports several partitioning methods (RANGE, LIST, HASH, and KEY). The following uses range partitioning as the main example to introduce the partition syntax in MySQL, followed by a specific case:

1. Partition syntax

  • Syntax for creating a partitioned table:
     CREATE TABLE table_name (
         column1 data_type,
         column2 data_type,
         ...
     )
     PARTITION BY RANGE(column_name) (
         PARTITION partition_name1 VALUES LESS THAN (value1),
         PARTITION partition_name2 VALUES LESS THAN (value2),
         ...
     );
  • Syntax for partitioning an already created table:
	ALTER TABLE table_name
	PARTITION BY RANGE(column_name) (
	    PARTITION partition_name1 VALUES LESS THAN (value1),
	    PARTITION partition_name2 VALUES LESS THAN (value2),
	    ...
	);
  • Partition by day of the month (1 to 31), assuming created_time is a DATE or DATETIME column:
	ALTER TABLE table_name
	PARTITION BY RANGE(DAY(created_time)) (
	    PARTITION p1 VALUES LESS THAN (11),
	    PARTITION p2 VALUES LESS THAN (21),
	    PARTITION p3 VALUES LESS THAN (32)
	);
  • Partition by hash of the ID (each row goes to the partition given by id modulo the number of partitions):
	ALTER TABLE table_name 
	PARTITION BY HASH(id) PARTITIONS 4;
  • Syntax for adding a partition:
     ALTER TABLE table_name
     ADD PARTITION (
         PARTITION partition_name VALUES LESS THAN (value)
     );
  • Syntax for dropping a partition (this also deletes the rows stored in it):
     ALTER TABLE table_name
     DROP PARTITION partition_name;
  • Syntax for removing partitioning from a table (the data is kept):
	ALTER TABLE table_name
	REMOVE PARTITIONING;
  • Verify that the partition was successfully created:
    SHOW CREATE TABLE table_name;
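The list above covers RANGE and HASH; list partitioning, mentioned in the first section, follows the same pattern with VALUES IN. A minimal sketch, where the table name store_orders, the region_id codes, and the partition names are all hypothetical:

```sql
-- Each partition holds the rows whose region_id is in its value list.
CREATE TABLE store_orders (
    order_id  INT NOT NULL,
    region_id INT NOT NULL,
    amount    DECIMAL(10,2)
)
PARTITION BY LIST (region_id) (
    PARTITION p_north VALUES IN (1, 2, 3),
    PARTITION p_south VALUES IN (4, 5, 6),
    PARTITION p_other VALUES IN (7, 8, 9)
);

-- region_id 5 is listed under p_south, so this row is stored there.
INSERT INTO store_orders (order_id, region_id, amount)
VALUES (1, 5, 42.00);
```

Unlike RANGE with MAXVALUE, a LIST layout has no catch-all partition: an insert with a region_id that appears in none of the lists is rejected, so new codes need a new or reorganized partition first.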

2. Partition case

Suppose there is a table named sales that stores sales data. We can partition the table by year.

  • Statement to create the partitioned table:
     CREATE TABLE sales (
         sale_id INT,
         product_name VARCHAR(50),
         sale_date DATE
     )
     PARTITION BY RANGE(YEAR(sale_date)) (
         PARTITION p0 VALUES LESS THAN (2015),
         PARTITION p1 VALUES LESS THAN (2020),
         PARTITION p2 VALUES LESS THAN (MAXVALUE)
     );
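  • Statement to add a partition (p2 is bounded by MAXVALUE, so ADD PARTITION would fail here; split p2 with REORGANIZE PARTITION instead):
     ALTER TABLE sales
     REORGANIZE PARTITION p2 INTO (
         PARTITION p2 VALUES LESS THAN (2025),
         PARTITION p3 VALUES LESS THAN (MAXVALUE)
     );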
  • Statement to drop a partition (all rows stored in p2 are deleted with it):
     ALTER TABLE sales
     DROP PARTITION p2;

With the syntax and the case above, tables can be partitioned flexibly to improve database performance and make the data easier to manage.
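To check whether partitioning actually helps a given query, it is useful to see which partitions MySQL reads. The sketch below assumes the sales table from the case above exists in the current database and that the server is MySQL 5.7 or later, where EXPLAIN output includes a partitions column; the date literal is only an example.

```sql
-- The "partitions" column of the EXPLAIN output lists the partitions
-- the query would read. A filter on sale_date lets the optimizer skip
-- partitions whose year range cannot match.
EXPLAIN
SELECT sale_id, product_name
FROM sales
WHERE sale_date = '2017-06-01';

-- Per-partition metadata for the table.
-- TABLE_ROWS is only an estimate for InnoDB tables.
SELECT PARTITION_NAME, PARTITION_DESCRIPTION, TABLE_ROWS
FROM information_schema.PARTITIONS
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME = 'sales';
```

If the partitions column lists every partition, the WHERE clause is probably not filtering on sale_date in a form the optimizer can use, and the query gains nothing from the partitioning.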

Common problems

  1. A PRIMARY KEY must include all columns in the table's partitioning function
    The reason: in a partitioned table, every unique key, including the primary key, must contain all columns used in the partitioning expression. MySQL maintains indexes separately for each partition, so it can check uniqueness only within a single partition. When the partitioning columns are part of the key, all rows with the same key value fall into the same partition, and that per-partition check is enough to guarantee uniqueness across the whole table.
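A minimal sketch of the problem and one common fix, using a hypothetical table sales_with_pk modeled on the sales case. The failing definition is shown as a comment so the block can be run as written.

```sql
-- Rejected by MySQL with the error quoted above: the primary key
-- (sale_id) does not contain sale_date, the column used in the
-- partitioning expression.
--
-- CREATE TABLE sales_with_pk (
--     sale_id   INT NOT NULL,
--     sale_date DATE NOT NULL,
--     PRIMARY KEY (sale_id)
-- )
-- PARTITION BY RANGE (YEAR(sale_date)) ( ... );

-- Accepted: the partitioning column is part of the primary key.
CREATE TABLE sales_with_pk (
    sale_id      INT NOT NULL,
    product_name VARCHAR(50),
    sale_date    DATE NOT NULL,
    PRIMARY KEY (sale_id, sale_date)
)
PARTITION BY RANGE (YEAR(sale_date)) (
    PARTITION p0 VALUES LESS THAN (2015),
    PARTITION p1 VALUES LESS THAN (2020),
    PARTITION p2 VALUES LESS THAN (MAXVALUE)
);
```

The trade-off is that the composite key only guarantees that the pair (sale_id, sale_date) is unique. If sale_id must be unique on its own, that has to be ensured elsewhere, for example by generating the IDs from a single sequence.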

Summary:

Partitioning and sub-tables are both common techniques for handling large volumes of data in MySQL, and both aim to improve system performance and data management efficiency. Partitioning divides a large table into multiple partitions that remain one logical table, while table splitting divides it into multiple independent tables. In practice, choose between them according to the characteristics of the data and the requirements of the business and the system.


Origin blog.csdn.net/weixin_45626288/article/details/132725349