MySQL in Practice: 45 Lectures, study notes: Why can temporary tables share the same name? (Lecture 36)

First, introduction

Today is New Year's Eve. Before we start today's lesson, let me wish you a Happy New Year!

In the previous article, we used a temporary table when optimizing a join query. At the time, we used it like this:

create temporary table temp_t like t1;
alter table temp_t add index(b);
insert into temp_t select * from t2 where b>=1 and b<=2000;
select * from t1 join temp_t on (t1.b=temp_t.b);

You may wonder: why use a temporary table here? Wouldn't an ordinary table work just as well?

Today we start from this question: what are the characteristics of temporary tables, and why do they suit this scenario?

First, I need to clear up a common misunderstanding: some people think a temporary table is the same thing as a memory table. The two concepts are completely different.

A memory table is a table that uses the Memory engine, created with the syntax create table ... engine=memory. Its data is kept in memory and is cleared when the system restarts, but the table structure remains. Apart from these two somewhat "strange" features, it behaves like a normal table.

A temporary table can use any of several engine types. If it uses the InnoDB or MyISAM engine, its data is written to disk. Of course, a temporary table can also use the Memory engine.

Now that the difference between memory tables and temporary tables is clear, let us look at the characteristics of temporary tables.

Second, characteristics of temporary tables

For ease of understanding, let us look at the following sequence of operations:

Figure 1: Characteristics of temporary tables

As you can see, temporary tables have the following characteristics in use:

1. The table creation syntax is create temporary table ....
2. A temporary table can only be accessed by the session that created it and is invisible to other threads. So the temporary table t created by session A in the figure is invisible to session B.
3. A temporary table can have the same name as an ordinary table.
4. When session A has both a temporary table and an ordinary table with the same name, show create statements and insert, delete, update, and select statements access the temporary table.
5. The show tables command does not display temporary tables.

Because a temporary table can only be accessed by the session that created it, it is automatically dropped when that session ends. It is precisely this feature that makes temporary tables particularly suitable for the join optimization scenario at the beginning of this article. Why is that?

There are two main reasons:

1. Temporary tables in different sessions can share the same name, so if multiple sessions run the join optimization at the same time, there is no need to worry about table creation failing because of duplicate table names.
2. There is no need to worry about cleaning up data. With an ordinary table, if the client disconnects abnormally during the process or the database restarts abnormally, the data tables generated along the way have to be cleaned up separately. Because temporary tables are reclaimed automatically, this extra step is unnecessary.

Third, applications of temporary tables

1, overview of sharding across databases and tables

Because there is no need to worry about name conflicts between threads, temporary tables are often used when optimizing complex queries. Cross-database queries in a sharded system (one split into sub-databases and sub-tables) are a typical use case.

In a typical sharding scenario, one logical large table is spread across different database instances. For example, a large table ht is split by field f into 1024 sub-tables, which are then distributed across 32 database instances.

As shown below:

Figure 2: Schematic of sharding across databases and tables

In general, such a sharded system has a proxy middle layer. However, some solutions let the client connect to the databases directly, with no proxy layer.

In this architecture, the partition key is chosen on the basis of "reducing cross-database and cross-table queries". If most statements contain an equality condition on f, then f should be the partition key. That way, after the proxy layer parses a SQL statement, it can determine which sub-table to route the statement to.

For example, the following statement:

select v from ht where f=N;

In this case, the sharding rule (for example, N % 1024) tells us which sub-table holds the required data. Such a statement only needs to access one sub-table, and it is the statement form that sharding schemes welcome most.
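The routing step can be sketched in a few lines of Python. This is only a minimal sketch: the text gives the N % 1024 rule and the 1024/32 split, while the even, contiguous placement of sub-tables on instances and the names route, NUM_TABLES, and NUM_INSTANCES are assumptions made for illustration.

```python
NUM_TABLES = 1024      # sub-tables of ht, from the text
NUM_INSTANCES = 32     # database instances, from the text
# Assumption: sub-tables are spread evenly, in contiguous blocks of 32.
TABLES_PER_INSTANCE = NUM_TABLES // NUM_INSTANCES


def route(f_value: int) -> tuple[int, str]:
    """Return (instance index, sub-table name) for a given value of f."""
    table_index = f_value % NUM_TABLES                    # rule from the text
    instance_index = table_index // TABLES_PER_INSTANCE   # assumed layout
    return instance_index, f"ht_{table_index}"


if __name__ == "__main__":
    # 5000 % 1024 = 904, and 904 // 32 = 28
    print(route(5000))
```

A statement with an equality condition on f touches exactly one (instance, sub-table) pair, which is why a proxy can forward it without any merging work.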

However, suppose the table also has an index on k, and the query looks like this:

select v from ht where k >= M order by t_modified desc limit 100;

Now, because the query condition does not use the partition field f, the only option is to find all rows that satisfy the condition in every partition and then perform the order by in one place. There are two common approaches to this.

2, the first approach: sorting in the proxy layer's process code

The first approach is to implement the sorting in the proxy layer's process code.

The advantage of this approach is speed: once the data is fetched from the sub-databases, the computation happens directly in memory. However, its drawbacks are also obvious:

1. It requires a relatively large development effort. The example statement is fairly simple; if more complex operations such as group by or even join are involved, the demands on the middle layer's development capability are high.
2. It puts considerable pressure on the proxy side, which can easily run short of memory or hit CPU bottlenecks.
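For the example query, the proxy-side sort could look roughly like the sketch below. It assumes each sub-database has already returned its rows as (v, k, t_modified) tuples sorted by t_modified in descending order; the function name merge_top_n and the row layout are hypothetical, and a real proxy would also have to deal with connections, errors, and memory limits.

```python
import heapq
from itertools import islice


def merge_top_n(shard_results, n=100):
    """Merge per-shard row lists and return the v values of the top n rows.

    Each list in shard_results must already be sorted by t_modified
    (index 2 of each row) from largest to smallest.
    """
    merged = heapq.merge(*shard_results, key=lambda row: row[2], reverse=True)
    return [row[0] for row in islice(merged, n)]


if __name__ == "__main__":
    shard_a = [("a3", 9, 30), ("a1", 7, 10)]
    shard_b = [("b2", 8, 20), ("b0", 6, 5)]
    print(merge_top_n([shard_a, shard_b], n=3))
```

Even this simple case keeps rows from every sub-database in the proxy's memory, which hints at why group by or join would be much more work to implement there.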

3, the other approach: gather the data from each sub-database into a table on one MySQL instance, then run the logic on that summary instance

The other approach is to gather the data obtained from each sub-database into a table on one MySQL instance, and then perform the logical operations on that summary instance.

For a statement such as the one above,

 select v from ht where k >= M order by t_modified desc limit 100;

the execution flow might look like this:

1. Create a temporary table temp_ht on the summary database, with three fields: v, k, and t_modified.

2. On each sub-database, execute

select v,k,t_modified from ht_x where k >= M order by t_modified desc limit 100;

3. Insert the results from the sub-databases into temp_ht.

4. Execute

select v from temp_ht order by t_modified desc limit 100; 

to get the result.
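The flow above can be tried out end to end with the sketch below. To stay self-contained it uses Python's built-in sqlite3 in-memory databases as stand-ins for the MySQL sub-databases and the summary database, so it illustrates only the sequence of steps, not MySQL behavior. The column types and the helper names make_shard and cross_shard_query are assumptions.

```python
import sqlite3


def make_shard(table, rows):
    """Create an in-memory stand-in for one sub-database holding one sub-table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(f"create table {table} (v text, k int, t_modified int)")
    conn.executemany(f"insert into {table} values (?, ?, ?)", rows)
    return conn, table


def cross_shard_query(shards, summary, m, limit=100):
    """Run the four steps from the text and return the list of v values."""
    # Step 1: temporary table on the summary database.
    summary.execute(
        "create temporary table temp_ht (v text, k int, t_modified int)")
    for conn, table in shards:
        # Step 2: per-shard query, already ordered and limited.
        rows = conn.execute(
            f"select v, k, t_modified from {table} "
            "where k >= ? order by t_modified desc limit ?",
            (m, limit)).fetchall()
        # Step 3: collect the partial results.
        summary.executemany("insert into temp_ht values (?, ?, ?)", rows)
    # Step 4: final ordering on the summary database.
    result = summary.execute(
        "select v from temp_ht order by t_modified desc limit ?",
        (limit,)).fetchall()
    summary.execute("drop table temp_ht")
    return [v for (v,) in result]


if __name__ == "__main__":
    shards = [make_shard("ht_0", [("a", 5, 100), ("b", 1, 300)]),
              make_shard("ht_1", [("c", 7, 200)])]
    print(cross_shard_query(shards, sqlite3.connect(":memory:"), m=2))
```

Each sub-database only has to ship at most limit rows, because a row outside its local top 100 can never be in the global top 100.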

The flowchart for this process is as follows:

Figure 3: Flow of a cross-database query

In practice, we often find that the sub-databases are not saturated with computation, so the temporary table temp_ht is placed directly on one of the 32 sub-databases. The query logic is then similar to Figure 3; you can work through the specific flow yourself.

Fourth, why can temporary tables share the same name?

1, how can different threads create temporary tables with the same name?

You may ask: how is it possible for different threads to create temporary tables with the same name?

Next, let us look at this question.

When we execute the statement

create temporary table temp_t(id int primary key)engine=innodb;

MySQL creates an frm file for this InnoDB table to store the table structure definition, and it also needs somewhere to store the table data.

This frm file is placed in the temporary file directory. Its suffix is .frm and its prefix is "#sql{process id}_{thread id}_sequence number". You can use the select @@tmpdir command to display the instance's temporary file directory.

As for how the table data is stored, different MySQL versions handle it differently:

  • In version 5.6 and earlier, MySQL creates a file with the same prefix and the suffix .ibd in the temporary file directory to store the data;
  • Starting from version 5.7, MySQL introduced a temporary tablespace dedicated to storing temporary table data, so there is no need to create an ibd file.

From the file name prefix rule, we can see that when we create an InnoDB temporary table called t1, MySQL treats its name at the storage level as different from that of the ordinary table t1. So even if the same database already contains an ordinary table t1, a temporary table t1 can still be created.

2, verifying that different threads can create temporary tables with the same name

To make the later discussion easier, let me give an example first.

Figure 4: Table names of temporary tables

The process ID of this process is 1234, the thread id of session A is 4, and the thread id of session B is 5. So, as you can see, the temporary tables created by session A and session B do not have the same file names on disk.

To maintain tables, MySQL needs not only the physical files but also an in-memory mechanism to tell tables apart: each table corresponds to a table_def_key.

  • The table_def_key of an ordinary table is derived from "database name + table name", so if you try to create two ordinary tables with the same name in the same database, the creation of the second table will find that the table_def_key already exists.
  • For a temporary table, the table_def_key adds "server_id + thread_id" on top of "database name + table name".

In other words, the two temporary tables t1 created by session A and session B have different table_def_key values and different file names on disk, so they can coexist.
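A small sketch makes the two naming rules concrete. It models table_def_key as a Python tuple and builds the file prefix with plain decimal numbers. The tuple representation, the decimal formatting, the function names, and the server_id value 1 are all assumptions, since the text gives only the components, not MySQL's actual encoding.

```python
def table_def_key(db, table, server_id=None, thread_id=None):
    """Ordinary table: db + table. Temporary table: also server_id + thread_id."""
    if server_id is None or thread_id is None:
        return (db, table)
    return (db, table, server_id, thread_id)


def temp_file_prefix(process_id, thread_id, seq):
    """File name prefix in the form #sql{process id}_{thread id}_{sequence}."""
    return f"#sql{process_id}_{thread_id}_{seq}"


if __name__ == "__main__":
    # Values from the example: process 1234, session A thread 4, session B thread 5.
    ordinary = table_def_key("test", "t1")
    key_a = table_def_key("test", "t1", server_id=1, thread_id=4)
    key_b = table_def_key("test", "t1", server_id=1, thread_id=5)
    print(len({ordinary, key_a, key_b}))  # three distinct keys
    print(temp_file_prefix(1234, 4, 0), temp_file_prefix(1234, 5, 0))
```

Because the thread id enters both the in-memory key and the file name, neither the ordinary table t1 nor another session's temporary t1 can collide with it.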

In the implementation, each thread maintains its own linked list of temporary tables. Each time a table is operated on within a session, the list is traversed first to check for a temporary table with that name; if one exists, the temporary table takes priority, and otherwise the ordinary table is used. When the session ends, "DROP TEMPORARY TABLE + table name" is executed for each temporary table in the list.

At this point you will notice that the DROP TEMPORARY TABLE command is also recorded in the binlog. You must be wondering: a temporary table can only be accessed within its own thread, so why does it need to be written to the binlog?

This brings us to primary-standby replication.

Fifth, temporary tables and primary-standby replication

1, if operations on temporary tables were not logged, the standby database would fail with "table temp_t does not exist" when executing insert into t_normal

If something is written to the binlog, that means the standby database needs it.

Imagine executing the following sequence of statements on the primary database:

create table t_normal(id int primary key, c int)engine=innodb;/*Q1*/
create temporary table temp_t like t_normal;/*Q2*/
insert into temp_t values(1,1);/*Q3*/
insert into t_normal select * from temp_t;/*Q4*/

If operations on the temporary table were not logged, the standby database would only have the binlog entries for two statements, create table t_normal and insert into t_normal select * from temp_t. When the standby executed the insert into t_normal statement, it would fail with "table temp_t does not exist".

You might say: wouldn't setting the binlog format to row solve this? With the row format, the binlog entry for insert into t_normal records the data of the operation; that is, the write_row event records the logic "insert a row (1,1)".

2, wouldn't setting the binlog format to row solve this?

Indeed it would. If binlog_format=row, statements involving temporary tables are not recorded in the binlog. In other words, operations on temporary tables are recorded in the binlog only when binlog_format=statement/mixed.

In that case, the statement that creates the temporary table is passed to the standby database and executed there, so the standby's replication thread creates the temporary table. When the thread on the primary exits, the temporary table is dropped automatically, but the standby's replication thread keeps running. So we need to write a DROP TEMPORARY TABLE on the primary and pass it to the standby to execute.

3, when recording drop table t_normal, MySQL rewrites it in the binlog into a standard format. Why?

Someone once asked me an interesting question: when MySQL records the binlog, create table and alter table statements are recorded verbatim, with even the whitespace unchanged. But if you execute drop table t_normal, the binlog records:

DROP TABLE `t_normal` /* generated by server */

That is, the statement is rewritten into a standard format. Why?

Now you know the reason: the drop table command can drop multiple tables at once. In the example above, with binlog_format=row, if "drop table t_normal, temp_t" is executed on the primary database, the binlog can only record:

DROP TABLE `t_normal` /* generated by server */

Because the standby database has no table temp_t, the command must be rewritten before it is passed to the standby; only then does it not cause the standby's replication thread to stop.

So when drop table is recorded in the binlog, the statement has to be rewritten. "/* generated by server */" indicates that this is a command rewritten by the server.

4, creating temporary tables with the same name in different threads is fine on the primary database, but how does the standby database handle them?

Speaking of primary-standby replication, there is one more problem to solve: creating temporary tables with the same name in different threads on the primary is fine, but how are they handled when replayed on the standby?

Here is an example. In the following sequence, instance S is the standby of M.

Figure 5: Temporary table operations on primary and standby

Two sessions on the primary M create temporary tables with the same name t1, and both create temporary table t1 statements are passed to the standby S.

However, the standby's log-applying thread is shared, which means the create statement is executed twice in succession within the applier thread. (Even with multi-threaded replication enabled, both statements may be assigned to the same worker on the standby.) So, will this cause the replication thread to report an error?

Obviously not, otherwise temporary tables would be a bug. In other words, when the standby thread executes these statements, it must treat the two t1 tables as two different temporary tables. How is this achieved?

When MySQL records the binlog, it writes the id of the primary thread that executed the statement into the binlog. The applier thread on the standby therefore knows the primary thread id for each statement, and uses this thread id to construct the temporary table's table_def_key:

1. For session A's temporary table t1, the table_def_key on the standby is: database name + t1 + "server_id of M" + "thread_id of session A";
2. For session B's temporary table t1, the table_def_key on the standby is: database name + t1 + "server_id of M" + "thread_id of session B".

Because the table_def_key values differ, the two tables do not conflict in the standby's applier thread.
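The standby-side bookkeeping can be sketched as a single applier object that keys its temporary tables by the primary's server_id and thread id. The class StandbyApplier, its method names, the tuple-shaped key, and the example ids are hypothetical; only the four key components come from the text.

```python
class StandbyApplier:
    """Toy model of one shared applier thread replaying events from the primary."""

    def __init__(self):
        self.temp_tables = {}  # table_def_key -> rows

    def apply_create_temporary(self, db, table, master_server_id, master_thread_id):
        # The key uses the PRIMARY's ids carried in the binlog event,
        # not the id of the applier thread itself.
        key = (db, table, master_server_id, master_thread_id)
        if key in self.temp_tables:
            raise RuntimeError(f"temporary table {table} already exists")
        self.temp_tables[key] = []
        return key

    def apply_drop_temporary(self, db, table, master_server_id, master_thread_id):
        del self.temp_tables[(db, table, master_server_id, master_thread_id)]


if __name__ == "__main__":
    applier = StandbyApplier()
    # Two sessions on primary M (assumed server_id 1) with thread ids 4 and 5.
    applier.apply_create_temporary("test", "t1", 1, 4)
    applier.apply_create_temporary("test", "t1", 1, 5)
    print(len(applier.temp_tables))  # both tables coexist in one applier
```

If the key were built from the applier's own thread id instead, the second create would hit the "already exists" branch.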

Sixth, summary

In today's article, I introduced the usage and characteristics of temporary tables.

In practice, temporary tables are generally used for more complex computation logic. Because a temporary table is visible only to its own thread, there is no need to worry about name clashes when multiple threads run the same processing logic. When the thread exits, its temporary tables are dropped automatically, which saves cleanup and exception-handling work.

When binlog_format='row', operations on temporary tables are not recorded in the binlog, which also saves a lot of trouble. This can be one consideration when you choose binlog_format.

Note that the temporary tables discussed above are created by users themselves and can also be called user temporary tables. Their counterpart is internal temporary tables, which I introduced in Lecture 17.

Finally, I leave you with a question to think about.

The following sequence of statements creates a temporary table and renames it:

Figure 6: A question about renaming a temporary table

As we can see, the alter table syntax can be used to change a temporary table's name, but the rename syntax cannot. Do you know why?

You can write your analysis in the comments, and I will discuss this question with you at the end of the next article. Thank you for listening, and feel free to share this article with more friends.

Seventh, the previous lecture's question

The question from the previous lecture was about the following three-table join statement:

select * from t1 join t2 on(t1.a=t2.a) join t3 on (t2.b=t3.b) where t1.c>=X and t2.c>=Y and t3.c>=Z;

If it is rewritten with straight_join, how should the join order be specified, and how should indexes be created on the three tables?

The first principle is to use the BKA algorithm wherever possible. Note that with the BKA algorithm, the server does not "compute the join result of two tables first and then join it with the third table"; it executes the joins as a direct nested query.

The specific approach is: among the three conditions t1.c>=X, t2.c>=Y, and t3.c>=Z, pick the table that has the least data left after filtering and use it as the first driving table. Two situations can then arise.

In the first situation, if the chosen table is t1 or t3, the rest is fixed.

1. If the driving table is t1, the join order is t1->t2->t3, and indexes must be created on the join fields of the driven tables, that is, on t2.a and t3.b;

2. If the driving table is t3, the join order is t3->t2->t1, and indexes are needed on t2.b and t1.a. In both cases, we also need to create an index on field c of the first driving table.

In the second situation, if the first driving table chosen is t2, the filtering effect of the other two conditions needs to be evaluated.

In short, the overall idea is to make the data set of the driving table as small as possible each time it takes part in a join.

Comment section honor board:

@ Tao Tao libraries ran experiments to verify this;
@poppy made a very good analysis;
@dzkk introduced the hash join supported by MariaDB in a comment, which is worth a look;
@ Yang comrades asked a good question: if a statement uses index a but the result must also be sorted by a, MRR optimization should not be used, because an extra sort would be needed after the table lookups, and the cost outweighs the benefit.

Origin www.cnblogs.com/luoahong/p/11756090.html