Understanding MySQL partitioning in one article

Author: GrimMjx
https://www.cnblogs.com/GrimMjx/p/10526821.html

I. InnoDB logical storage structure

First, let's introduce the InnoDB logical storage structure and the concept of an extent. All InnoDB data is logically stored in a tablespace, and a tablespace is made up of segments, extents, and pages.

Segment

A tablespace is made up of segments. Common segments include data segments, index segments, and rollback segments. In the InnoDB storage engine, segments are managed by the engine itself.

Extent

An extent is a space made up of contiguous pages. For page sizes up to the default 16KB, the extent size is always 1MB, whatever the page size.

To keep the pages within an extent contiguous, the InnoDB storage engine requests 4 to 5 extents from disk at a time. With the default 16KB page size, one extent holds 64 contiguous pages (1MB / 16KB = 64).
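The page count per extent follows directly from the division above. A minimal Python sketch of that arithmetic, assuming a 1MB extent and page sizes of 4KB, 8KB, and 16KB (the function name is made up for illustration):

```python
# Pages per extent, assuming a 1MB extent as described in the text.
EXTENT_BYTES = 1024 * 1024


def pages_per_extent(page_size_bytes: int) -> int:
    """Number of contiguous pages that fit in one 1MB extent."""
    return EXTENT_BYTES // page_size_bytes


for kb in (4, 8, 16):
    print(f"{kb}KB page: {pages_per_extent(kb * 1024)} pages per extent")
```

With the default 16KB page this gives the 64 pages mentioned above; smaller pages simply mean more pages in the same 1MB.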

When a segment is first created, it stores data in 32 fragment pages. Only after those pages are used up does it start requesting extents of 64 contiguous pages. The purpose is to let small tables, or segments such as undo segments, start with a smaller allocation and save disk space.

Page

A page, also called a block, is the smallest unit of disk management in InnoDB. The default size is 16KB, and it can be changed with the innodb_page_size parameter.

Common page types are: data pages, undo pages, system pages, transaction data pages, insert buffer bitmap pages, insert buffer free list pages, uncompressed binary large object (BLOB) pages, compressed BLOB pages, and so on.

II. Overview of Partitioning

Partition

Note that the "partition" discussed here is not the "extent" described above. Partitioning means assigning different rows of the same table to different physical files: a table with several partitions has several .ibd files. It has nothing to do with the extent we just covered.

MySQL added support for horizontal partitioning in version 5.1. Partitioning decomposes a table or index into smaller, more manageable parts.

Each partition is independent and can be processed on its own or as part of the larger object. This is a feature supported by MySQL itself, so business code does not need to change.

Keep in mind that MySQL is an OLTP-oriented database, unlike TiDB and similar databases. So be very careful with partitioning: if you do not know how to use it, partitioning may hurt performance.

Partitioning in MySQL uses locally partitioned indexes: each partition stores both its data and its indexes. In other words, the clustered index and the secondary indexes of each partition live in that partition's own physical file. MySQL does not currently support global partitioned indexes.

Whatever the partition type, if the table has a primary key or a unique index, the partitioning column must be part of every such key.

III. Partition types

MySQL currently supports the following partition types: RANGE, LIST, HASH, and KEY.

In practice, RANGE partitioning is the one used in the vast majority of cases.

RANGE partitioning

RANGE partitioning is the most commonly used partition type in practice. A row is placed in a partition based on a column value that falls within a given continuous range.

Remember, though, that inserting a row whose value does not fall in any defined partition raises an error.

RANGE partitioning is mainly used on date columns, for tables such as transaction tables or sales tables, so that data can be stored by date.

If the partitioning column in the unique index is a date type, note that the optimizer can only prune partitions for the functions YEAR(), TO_DAYS(), TO_SECONDS(), and UNIX_TIMESTAMP(). In practice you can use an INT column that stores just yyyyMM, and then you do not need to care about functions at all.

CREATE TABLE `m_test_db`.`Order` (
  `id` INT NOT NULL AUTO_INCREMENT,
  `partition_key` INT NOT NULL,
  `amt` DECIMAL(5) NULL,
  PRIMARY KEY (`id`, `partition_key`)
)
PARTITION BY RANGE (partition_key)
PARTITIONS 5 (
  PARTITION part0 VALUES LESS THAN (201901),
  PARTITION part1 VALUES LESS THAN (201902),
  PARTITION part2 VALUES LESS THAN (201903),
  PARTITION part3 VALUES LESS THAN (201904),
  PARTITION part4 VALUES LESS THAN (201905)
);

Now let's insert some data:

INSERT INTO `m_test_db`.`Order` (`id`, `partition_key`, `amt`) VALUES (1, 201901, 1000);
INSERT INTO `m_test_db`.`Order` (`id`, `partition_key`, `amt`) VALUES (2, 201902, 800);
INSERT INTO `m_test_db`.`Order` (`id`, `partition_key`, `amt`) VALUES (3, 201903, 1200);
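To see where each of these rows lands, here is a small Python sketch of RANGE routing. It only mimics the VALUES LESS THAN rule with the bounds from the CREATE TABLE above; the function name is made up for illustration and this is not how MySQL implements it internally.

```python
from bisect import bisect_right

# Upper bounds (VALUES LESS THAN) of part0..part4 from the table definition.
UPPER_BOUNDS = [201901, 201902, 201903, 201904, 201905]


def range_partition(partition_key: int) -> str:
    """Return the name of the partition whose range contains partition_key."""
    # Index of the first bound that is strictly greater than the key.
    idx = bisect_right(UPPER_BOUNDS, partition_key)
    if idx == len(UPPER_BOUNDS):
        # No partition accepts the value, so the insert would be rejected.
        raise ValueError(f"no partition for value {partition_key}")
    return f"part{idx}"


for key in (201901, 201902, 201903):
    print(key, "->", range_partition(key))
```

Because each bound is exclusive, 201901 goes to part1 rather than part0, and a value of 201905 or above has no partition to go to.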

Now let's run a query. The EXPLAIN PARTITIONS command shows that the SQL optimizer searches only the matching partition and does not search all partitions.

If the SQL statement is written badly, it will scan every partition, which is very dangerous. So once a table is partitioned, SELECT statements must filter on the partitioning key.

The following three types are less common, so they are covered only briefly.
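The difference between a query that filters on the partitioning key and one that does not can be sketched in Python. This is a simplified model of partition pruning under the bounds of the example table, with hypothetical names; the real optimizer handles many more predicate shapes.

```python
from bisect import bisect_right
from typing import List, Optional

# Upper bounds (VALUES LESS THAN) of part0..part4 from the example table.
UPPER_BOUNDS = [201901, 201902, 201903, 201904, 201905]
ALL_PARTITIONS = [f"part{i}" for i in range(len(UPPER_BOUNDS))]


def partitions_to_scan(key_low: Optional[int] = None,
                       key_high: Optional[int] = None) -> List[str]:
    """Partitions to read for key_low <= partition_key <= key_high.

    With no bound on the partitioning key, every partition must be read.
    """
    if key_low is None and key_high is None:
        return ALL_PARTITIONS[:]
    last_index = len(UPPER_BOUNDS) - 1
    first = 0 if key_low is None else bisect_right(UPPER_BOUNDS, key_low)
    last = last_index if key_high is None else min(
        bisect_right(UPPER_BOUNDS, key_high), last_index)
    return ALL_PARTITIONS[first:last + 1]


print(partitions_to_scan(201902, 201902))  # predicate on the partitioning key
print(partitions_to_scan())                # no predicate on the partitioning key
```

The first call touches a single partition, while the second has to read all five, which is the situation the text warns about.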

LIST partitioning

LIST partitioning is very similar to RANGE partitioning, except that the values of the partitioning column are discrete rather than continuous. LIST partitions are defined with VALUES IN. Because each partition's values are discrete, only the values that have been defined can be inserted.
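As a rough illustration of the VALUES IN rule, the following Python sketch routes a value to the partition that lists it. The partition names and value sets are hypothetical, not taken from the article.

```python
# Hypothetical LIST layout: each partition owns an explicit set of values.
LIST_PARTITIONS = {
    "p_odd": {1, 3, 5},
    "p_even": {2, 4, 6},
}


def list_partition(value: int) -> str:
    """Return the partition whose VALUES IN set contains value."""
    for name, values in LIST_PARTITIONS.items():
        if value in values:
            return name
    # A value that no partition lists cannot be stored.
    raise ValueError(f"no partition lists value {value}")


print(list_partition(3))
print(list_partition(4))
```

Calling list_partition(7) raises an error here, mirroring the rule that only defined values are accepted.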

HASH partitioning

As the name suggests, the goal of HASH partitioning is to distribute data evenly across a predefined number of partitions, so that each partition holds roughly the same number of rows.

KEY partitioning

KEY partitioning is similar to HASH partitioning, except that HASH partitioning uses a user-defined expression, while KEY partitioning uses a hashing function provided by the database.
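A minimal Python sketch of the idea, assuming the partition number is the modulus of a non-negative integer expression by the number of partitions (the function name and the choice of 4 partitions are illustrative assumptions):

```python
from collections import Counter


def hash_partition(value: int, num_partitions: int) -> int:
    """Partition number for a non-negative integer value, using a modulus."""
    return value % num_partitions


# Spread 1000 sequential ids over 4 hypothetical partitions and count rows.
counts = Counter(hash_partition(i, 4) for i in range(1, 1001))
for partition in sorted(counts):
    print(f"p{partition}: {counts[partition]} rows")
```

Sequential ids spread evenly, with 250 rows in each of the 4 partitions. Skewed input values would not distribute this neatly.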

IV. Partitioning and performance

A technology does not bring benefits just by being used. For example, explicit locks are more powerful than built-in locks, but if you use them badly the result can be much worse.

Partitioning is the same: enabling it does not automatically make the database faster. Partitioning may improve performance for some SQL statements, but it is mainly used for high availability and manageability of the database. Database applications fall into two types: OLTP (online transaction processing) and OLAP (online analytical processing).

For OLAP applications, partitioning can indeed improve query performance, because analytical queries generally need to return large amounts of data. If the table is partitioned by time, for example one month of user behavior data, only the relevant partition needs to be scanned. In OLTP applications, be more careful with partitioning: a query usually does not fetch 10% of a large table, and most queries return just a few rows through an index.

For example, take a table with 10 million rows where a SELECT statement uses a secondary index but does not filter on the partitioning key. The result is awkward. If the B+ tree for 10 million rows has a height of 3 and there are 10 partitions, doesn't that take (3 + 3) * 10 = 60 logical IOs? (3 for the clustered index and 3 for the secondary index, in each of the 10 partitions.) So please be careful when using partitioned tables in OLTP applications.

In day-to-day development, to see which partitions a SQL query touches, run EXPLAIN PARTITIONS followed by the SELECT statement. The partitions column shows which partitions were accessed. (In MySQL 5.7 and later, plain EXPLAIN already includes the partitions column, and the PARTITIONS keyword was removed in 8.0.)
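The arithmetic in this example can be written out as a short Python sketch. It uses the same simplifying assumption as the text, that every partition's B+ tree has the same height as the full table's, and the function name is made up for illustration.

```python
def logical_ios(tree_height: int, partitions_scanned: int) -> int:
    """Logical IOs for one secondary-index lookup in each scanned partition."""
    secondary_index_ios = tree_height  # walk the secondary index
    clustered_index_ios = tree_height  # then walk the clustered index
    return (secondary_index_ios + clustered_index_ios) * partitions_scanned


print(logical_ios(3, 10))  # query without the partitioning key: 60
print(logical_ios(3, 1))   # same lookup confined to one partition: 6
```

Under these assumptions, a lookup that cannot be pruned costs ten times as many logical IOs as one confined to a single partition.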

mysql> explain partitions select * from TxnList where startTime>'2016-08-25 00:00:00' and startTime<'2016-08-25 23:59:00';    
+----+-------------+-------------------+------------+------+---------------+------+---------+------+-------+-------------+    
| id | select_type | table             | partitions | type | possible_keys | key  | key_len | ref  | rows  | Extra       |    
+----+-------------+-------------------+------------+------+---------------+------+---------+------+-------+-------------+    
|  1 | SIMPLE      | ClientActionTrack | p20160825  | ALL  | NULL          | NULL | NULL    | NULL | 33868 | Using where |    
+----+-------------+-------------------+------------+------+---------------+------+---------+------+-------+-------------+    
1 row in set (0.00 sec)

Reference: "MySQL Technology Insider"
