MySQL sub-tables and partitioned tables

Original link: http://www.cnblogs.com/mjorcen/p/3976829.html

Reposted from: http://blog.51yip.com/mysql/949.html

 

One, what are MySQL sub-tables and partitions

  What is a sub-table? Taken literally, it means splitting one table into N smaller tables; for details see "three methods for splitting MySQL tables".

  What is a partition? Partitioning splits one table's data into N blocks, which can sit on the same disk or on different disks.

 

First, why split tables

  Once a table reaches a few million rows, each query takes much longer, and a query with joins may well hang there. The purpose of splitting tables is to reduce the load on the database and shorten query time.

In my experience, MySQL executes an SQL statement roughly as follows:

  1. Receive the SQL;
  2. Put the SQL into the execution queue;
  3. Execute the SQL;
  4. Return the result.

  Where does most of the time go in this process?

    1. First, the time spent waiting in the queue.
    2. Second, the time spent executing the SQL. In fact these two are the same thing: while one statement waits, another must be executing. So what we have to shorten is the SQL execution time.

   MySQL has table locks and row locks, and they exist to guarantee data integrity. For example, if two SQL statements need to modify the same data in the same table, can they both modify it at the same time? Clearly not: MySQL handles this by locking the table (MyISAM storage engine) or locking the row (InnoDB storage engine). A table lock means nobody else can operate on the table until I have finished with it. A row lock is similar: other statements must wait until I have finished with those rows before they can operate on them. If there is too much data, each execution takes too long and the waiting gets longer, and that is why we want to split tables.

 

Second, ways to split tables

  1, build a MySQL cluster, for example with mysql cluster, mysql proxy, mysql replication, DRBD, etc.

  Some may ask what a MySQL cluster has to do with splitting tables. Strictly speaking it is not splitting tables, but it has the same effect. What is the point of a cluster? It takes load off a single database; put plainly, it reduces the number of SQL statements waiting in each queue. For example, if 10 SQL requests queue on one database server, the last ones wait a long time; if those 10 requests are spread over the queues of 5 database servers, each server's queue holds only 2, so isn't the waiting time much shorter? That much is obvious. So I count clustering as a form of splitting tables. I have written up some MySQL cluster setups:

mysql proxy on Linux: installation, configuration, and read/write splitting

mysql replication: installing and configuring two servers as master and slave of each other, and data synchronization

Advantages: scales well, and avoids the complex application-side handling (PHP code) that multiple sub-tables require

Disadvantages: the amount of data in a single table does not change, so a single operation still takes just as long, and the hardware cost is high.
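
The queue arithmetic from the example above (10 requests on one server versus 10 requests spread over 5 servers) can be sketched as follows. This is only a rough model, not how any particular proxy schedules work: the function name longest_wait, the even round-robin distribution, and the 0.5 seconds per statement are all assumptions made for illustration.

```php
<?php
// Hypothetical model: requests are dealt evenly to the servers, and each
// server runs its queue one statement at a time, each taking $secondsPerQuery.
function longest_wait(int $requests, int $servers, float $secondsPerQuery): float
{
    // The busiest server receives ceil(requests / servers) statements.
    $queueLength = (int) ceil($requests / $servers);
    // The last statement in that queue waits for every statement ahead of it.
    return max(0, $queueLength - 1) * $secondsPerQuery;
}

echo longest_wait(10, 1, 0.5), "\n"; // 4.5 (one server, queue of 10)
echo longest_wait(10, 5, 0.5), "\n"; // 0.5 (five servers, queue of 2 each)
```

Note that the per-statement time stays the same in both calls, which matches the disadvantage listed above: a cluster shortens the queue, not the execution time of a single statement.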

 

  2, estimate in advance which tables will be large and frequently accessed, and split each of them into several tables

  Such estimates are usually close enough. Take a forum's posts table: over time it is bound to get very large, with hundreds of thousands or millions of rows quite likely. Or a chat message table: with dozens of people chatting every night, the table becomes very large over time. There are many cases like this. So for tables we can predict will be large, we split them into N tables in advance, with N chosen according to the actual situation. Take the chat message table as an example:

  I create 100 such tables up front: message_00, message_01, message_02 ... message_98, message_99. Then the user's ID determines which table that user's chat messages go into. You can use a hash, or take a remainder; there are many ways, and everyone can come up with their own. The following function uses a hash to get the table name:

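<?php
// Map a user ID to one of the pre-created tables, e.g. message_10.
function get_hash_table($table, $userid) {
    $str = crc32($userid);
    if ($str < 0) {
        // crc32() can be negative on 32-bit PHP: use "0" plus the first digit
        $hash = "0" . substr(abs($str), 0, 1);
    } else {
        // otherwise use the first two digits of the checksum
        $hash = substr($str, 0, 2);
    }

    return $table . "_" . $hash;
}

echo get_hash_table('message', 'user18991');     // result is message_10
echo get_hash_table('message', 'user34523');     // result is message_13
?>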

  To explain the function above: messages from user18991 are stored in the table message_10 and messages from user34523 in the table message_13, so when reading you only need to read from that user's own table.
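
The text also mentions taking a remainder instead of a hash. A minimal sketch of that variant is below. It assumes the user ID is a non-negative integer (the example IDs above are strings such as user18991, so here only their numeric part is used), and the function name get_mod_table is made up for illustration.

```php
<?php
// Hypothetical helper: route a numeric user ID to one of $tableCount tables
// by remainder. Assumes $userId is a non-negative integer.
function get_mod_table(string $table, int $userId, int $tableCount = 100): string
{
    // %02d pads the remainder to two digits so the names match
    // message_00 ... message_99.
    return sprintf('%s_%02d', $table, $userId % $tableCount);
}

echo get_mod_table('message', 18991), "\n"; // message_91
echo get_mod_table('message', 34523), "\n"; // message_23
```

With sequential IDs the remainder spreads users evenly over all 100 tables, but it carries the same drawback as the hash: changing the number of tables changes where existing users are looked up.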

 

Advantages: avoids a single table with millions of rows and shortens the execution time of each SQL statement

Disadvantages: once the rule is fixed, breaking it is very troublesome. The example above hashes with crc32; if I later want to switch to md5, the same user's messages would be stored in different tables and the data would become a mess. Scalability is poor.
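
To get a feel for this disadvantage, the sketch below counts how many sample users would be looked up in a different table if the rule switched from crc32 to md5. It is an illustration only: both functions are hypothetical, they pick the table by remainder rather than by the leading digits used in get_hash_table, and the user IDs user1 ... user1000 are invented sample data.

```php
<?php
// Hypothetical routing rules: same 100 tables, different hash functions.
function table_by_crc32(string $table, string $userId, int $tableCount = 100): string
{
    // Mask off the sign bit so the value is non-negative on 32-bit PHP too.
    return sprintf('%s_%02d', $table, (crc32($userId) & 0x7FFFFFFF) % $tableCount);
}

function table_by_md5(string $table, string $userId, int $tableCount = 100): string
{
    // The first 7 hex digits of the digest always fit in a 32-bit integer.
    return sprintf('%s_%02d', $table, hexdec(substr(md5($userId), 0, 7)) % $tableCount);
}

// Count the sample users whose table changes when the rule changes.
$moved = 0;
for ($i = 1; $i <= 1000; $i++) {
    $userId = 'user' . $i;
    if (table_by_crc32('message', $userId) !== table_by_md5('message', $userId)) {
        $moved++;
    }
}
echo "$moved of 1000 sample users map to a different table\n";
```

With 100 tables, two unrelated hashes should agree for only about one user in a hundred, so nearly every user's existing rows would have to be moved before the new rule could be used.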

 

3, use the merge storage engine to split tables

  I think this method suits cases where nobody planned ahead and queries have already become slow. Splitting an existing large table at that point is painful, and the most painful part is changing the code, because the SQL statements in the program are already written. If one table becomes dozens or even hundreds of tables, don't all those SQL statements have to be rewritten? Let me give an example.

mysql> show engines; in the output you will find that MRG_MYISAM is in fact merge. Note that the session below uses the older TYPE=MERGE syntax, which is probably why that CREATE TABLE reports a warning; MySQL 5.5 and later accept only ENGINE=MERGE.

mysql> CREATE TABLE IF NOT EXISTS `user1` (  
 ->   `id` int(11) NOT NULL AUTO_INCREMENT,  
 ->   `name` varchar(50) DEFAULT NULL,  
 ->   `sex` int(1) NOT NULL DEFAULT '0',  
 ->   PRIMARY KEY (`id`)  
 -> ) ENGINE=MyISAM  DEFAULT CHARSET=utf8 AUTO_INCREMENT=1 ;  
Query OK, 0 rows affected (0.05 sec)  
  
mysql> CREATE TABLE IF NOT EXISTS `user2` (  
 ->   `id` int(11) NOT NULL AUTO_INCREMENT,  
 ->   `name` varchar(50) DEFAULT NULL,  
 ->   `sex` int(1) NOT NULL DEFAULT '0',  
 ->   PRIMARY KEY (`id`)  
 -> ) ENGINE=MyISAM  DEFAULT CHARSET=utf8 AUTO_INCREMENT=1 ;  
Query OK, 0 rows affected (0.01 sec)  
  
mysql> INSERT INTO `user1` (`name`, `sex`) VALUES('张映', 0);  
Query OK, 1 row affected (0.00 sec)  
  
mysql> INSERT INTO `user2` (`name`, `sex`) VALUES('tank', 1);  
Query OK, 1 row affected (0.00 sec)  
  
mysql> CREATE TABLE IF NOT EXISTS `alluser` (  
 ->   `id` int(11) NOT NULL AUTO_INCREMENT,  
 ->   `name` varchar(50) DEFAULT NULL,  
 ->   `sex` int(1) NOT NULL DEFAULT '0',  
 ->   INDEX(id)  
 -> ) TYPE=MERGE UNION=(user1,user2) INSERT_METHOD=LAST AUTO_INCREMENT=1 ;  
Query OK, 0 rows affected, 1 warning (0.00 sec)  
  
mysql> select id,name,sex from alluser;  
+----+--------+-----+  
| id | name   | sex |  
+----+--------+-----+  
|  1 | 张映 |   0 |  
|  1 | tank   |   1 |  
+----+--------+-----+  
2 rows in set (0.00 sec)  
  
mysql> INSERT INTO `alluser` (`name`, `sex`) VALUES('tank2', 0);  
Query OK, 1 row affected (0.00 sec)  
  
mysql> select id,name,sex from user2  
 -> ;  
+----+-------+-----+  
| id | name  | sex |  
+----+-------+-----+  
|  1 | tank  |   1 |  
|  2 | tank2 |   0 |  
+----+-------+-----+  
2 rows in set (0.00 sec)  

  Did you notice anything in the session above? Suppose I have a user table named user with 500,000 rows and want to split it into two tables, user1 and user2, with 250,000 rows each:

INSERT INTO user1 (id, name, sex) SELECT id, name, sex FROM user WHERE id <= 250000;

INSERT INTO user2 (id, name, sex) SELECT id, name, sex FROM user WHERE id > 250000;

  That splits the user table into two tables. Now there is a problem: what about the SQL statements in the code? They used to target one table and now there are two, so changing the code means a lot of work for the programmer. Is there a good way around this? Yes: back up the original user table and drop it, then rename the alluser table created in the session above to user.

However, not every MySQL operation works on a merge table:

  • If you use ALTER TABLE to change a merge table to another table type, the mapping to the underlying tables is lost. Instead, the rows from the underlying MyISAM tables are copied into the altered table, which is then assigned the new type.
  • I have seen people online say that REPLACE does not work; I tried it and it did. Go figure. (Note that the test below uses the REPLACE() string function inside an UPDATE, not the REPLACE statement.)
mysql> UPDATE alluser SET sex=REPLACE(sex, 0, 1) where id=2;  
Query OK, 1 row affected (0.00 sec)  
Rows matched: 1  Changed: 1  Warnings: 0  
  
mysql> select * from alluser;  
+----+--------+-----+  
| id | name   | sex |  
+----+--------+-----+  
|  1 | 张映 |   0 |  
|  1 | tank   |   1 |  
|  2 | tank2  |   1 |  
+----+--------+-----+  
3 rows in set (0.00 sec)  

 

  • A merge table cannot enforce unique constraints across the whole set of tables. When you perform an INSERT, the data goes into the first or last MyISAM table (depending on the value of the INSERT_METHOD option). MySQL ensures that unique key values remain unique within that MyISAM table, but not across all the tables in the collection.
  • When you create a merge table, there is no check that the underlying tables exist and have the same structure. When the merge table is used, MySQL checks that the record length of all mapped tables is equal, but this is not foolproof. If you create a merge table from dissimilar MyISAM tables, you are very likely to run into strange problems.

Advantages: scales well, and the program code needs few changes

Disadvantages: this method performs somewhat worse than the second one

 

Third, summary

  Of the three methods above, I have actually used two, the first and the second. I have not used the third in practice, so I described it in a bit more detail. Haha. Everything has a limit, and past that limit things get worse. Do not blindly build database server clusters, because hardware costs money, and do not blindly split tables either. If you split into 1000 tables, remember that MySQL ultimately stores tables as files on disk, three files per table, so 1000 sub-tables means 3000 files, and retrieval becomes slow as well. My suggestions are:

Combine methods 1 and 2 to split tables

Combine methods 1 and 3 to split tables

These two suggestions suit different cases, so choose according to your own situation. I think many people will choose to combine methods 1 and 3.

 

 

Two, what is the difference between MySQL sub-tables and partitions

1, implementation

  • With sub-tables the split is real: after one table is divided into many, each small table is a complete table in its own right and has its own three files: a .MYD data file, a .MYI index file, and a .frm table structure file.
[root@BlackGhost test]# ls |grep user   
alluser.MRG   
alluser.frm   
user1.MYD   
user1.MYI   
user1.frm   
user2.MYD   
user2.MYI   
user2.frm  

 


  To explain briefly: the tables above were split with the merge storage engine (one way of splitting tables). alluser is the summary table, and under it are two sub-tables, user1 and user2. Those two are independent tables, and when fetching data we can go through the summary table. Note that the summary table has no .MYD or .MYI file; in other words it is not a real table and holds no data, because the data lives in the sub-tables. Let's see what the .MRG file actually contains:

[root@BlackGhost test]# cat alluser.MRG |more   
user1   
user2   
#INSERT_METHOD=LAST  

 


As shown above, alluser.MRG stores the list of sub-tables and how inserted data is handled. You can think of the summary table as a shell, or as a connection pool.

  • Partitioning is different: after a large table is partitioned it is still one table, not two, but it has more blocks storing its data.
[root@BlackGhost test]# ls |grep aa
aa#P#p1.MYD
aa#P#p1.MYI
aa#P#p3.MYD
aa#P#p3.MYI
aa.frm
aa.par

 


  As shown above, the table aa is divided into two partitions, p1 and p3; there were originally three, but I deleted one. We know that a table corresponds to three files: .MYD, .MYI, and .frm. Partitioning splits the data file and the index file according to certain rules and adds a .par file; if you open the .par file you can see that it records the table's partition information, a bit like the .MRG file for sub-tables. After partitioning there is still one table, not several.

 

2, data processing

  • After splitting into sub-tables, the data is stored in the sub-tables; the summary table is just a shell, and data access happens in the individual sub-tables. Consider this example: select * from alluser where id='12'; on the surface it operates on alluser, but it does not really; it operates on the sub-tables inside alluser.
  • With partitioning there is no concept of sub-tables. Partitioning only divides the files that store the data into many small blocks, and the partitioned table is still a single table. The data processing is still done by the table itself.

3, improving performance

  • After splitting into sub-tables, the concurrency a single table can handle improves, and so does disk I/O performance. Why does concurrency improve? Because each lookup takes less time, and under high concurrency the summary table can spread the load across different small tables depending on the query. How does disk I/O improve? What used to be one very large .MYD file is now divided among the .MYD files of the small tables.
  • MySQL introduced partitioning, I think, to break through the disk I/O bottleneck: to improve disk read/write capability and thereby MySQL's performance.
  • On this point sub-tables and partitions have different emphases: sub-tables focus on how to improve MySQL's concurrency when accessing data, while partitioning focuses on how to break through the disk's read/write limits, so as to improve MySQL's performance.

4, difficulty of implementation

  • There are many ways to split tables, and using merge is the simplest. It is about as difficult as partitioning, and it can be made transparent to the program code. Other ways of splitting tables are more trouble than partitioning.
  • Partitioning is relatively simple to implement: creating a partitioned table is no different from creating an ordinary table, and it is transparent to the application code.

Three, what do MySQL sub-tables and partitions have in common

  1. Both improve MySQL's performance, and both behave well under high concurrency.
  2. Sub-tables and partitions do not conflict and can be used together. For tables with heavy traffic and a lot of data, we can combine sub-tables with partitions (if the merge way of splitting tables cannot be combined with partitions, try another way of splitting tables); for tables with little traffic but a lot of data, we can use partitioning alone, and so on.

 
