MySQL in Practice, 45 Lectures, study notes: everyone says InnoDB is good, so should you still use the Memory engine? (Lecture 38)

First, the topic of this section

At the end of the previous article I left you this question: both group by statements use order by null, so why does the value 0 appear in the last row of the result when the statement uses an in-memory temporary table, but in the first row when it uses an on-disk temporary table?

Today, let's look at the cause of this behavior.

Second, how memory table data is organized

1, the position of 0 in the two query results

To make the analysis easier, let me simplify the problem. Suppose we have the following two tables, t1 and t2, where table t1 uses the Memory engine and table t2 uses the InnoDB engine.

create table t1(id int primary key, c int) engine=Memory;
create table t2(id int primary key, c int) engine=innodb;
insert into t1 values(1,1),(2,2),(3,3),(4,4),(5,5),(6,6),(7,7),(8,8),(9,9),(0,0);
insert into t2 values(1,1),(2,2),(3,3),(4,4),(5,5),(6,6),(7,7),(8,8),(9,9),(0,0);

Then I run select * from t1 and select * from t2 separately.

Figure 1: Results of the two queries, showing the position of 0

As you can see, in the result returned by memory table t1 the 0 is in the last row, while in the result returned by InnoDB table t2 the 0 is in the first row.

To explain this difference, we need to start with how the two engines organize their primary key indexes.

2, data organization of table t2

Table t2 uses the InnoDB engine, and you are already familiar with how its primary key index id is organized: an InnoDB table keeps its data on the primary key index tree, and the primary key index is a B+ tree. So the data of table t2 is organized as follows.

Figure 2: Data organization of table t2

Values on the primary key index are stored in order. When select * is executed, the leaf nodes are scanned from left to right, so 0 appears in the first row of the result.

3, data organization of table t1

Unlike the InnoDB engine, the Memory engine keeps data and indexes separate. Let's look at how the data in table t1 is laid out.

Figure 3: Data organization of table t1

As you can see, the data part of a memory table is stored separately as an array, and the primary key index on id stores the position of each row. The primary key index on id is a hash index, so the keys on the index are not ordered.

In memory table t1, select * does a full table scan, that is, it scans this array sequentially. As a result, 0 is the last row to be read and put into the result set.

4, InnoDB and Memory organize data differently

So the InnoDB and Memory engines organize data in different ways:

The InnoDB engine puts the data on the primary key index, and the other indexes store the primary key id. We call this an index-organized table (Index Organized Table).

The Memory engine stores the data separately and keeps the position of the data on the indexes. We call this a heap-organized table (Heap Organized Table).

From this we can see some typical differences between the two engines:

1. Data in an InnoDB table is always stored in order, while data in a memory table is stored in the order it was written;
2. When the data file has holes, an InnoDB table can only write a new value at a fixed position when inserting, in order to keep the data ordered, while a memory table can insert the new value wherever it finds a free slot;
3. When the position of a row changes, an InnoDB table only needs to modify the primary key index, while a memory table needs to modify all indexes;
4. An InnoDB table needs one index lookup when querying by the primary key index and two index lookups when querying by an ordinary index, while a memory table has no such difference, because every index stores the position of the data;
5. InnoDB supports variable-length data types, so different records may have different lengths; memory tables do not support Blob and Text fields, and even a column defined as varchar(N) is actually stored as char(N), that is, as a fixed-length string, so every row of a memory table has the same length.
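The last point in the list can be checked by hand. The following is a minimal sketch; the table names t_text and t_varchar are hypothetical and used only for this illustration, and the exact error text and output depend on your MySQL version.

```sql
-- Hypothetical table names, not part of the original example.
-- The Memory engine is expected to reject this definition,
-- because it does not support Blob and Text fields.
create table t_text(id int primary key, doc text) engine=Memory;

-- This definition is accepted; the varchar(10) column is stored at a fixed length.
create table t_varchar(id int primary key, name varchar(10)) engine=Memory;

-- The Row_format column of the output should report a fixed-length row format.
show table status like 't_varchar';
```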

Because of these characteristics of memory tables, once a row is deleted, the slot it frees up can be reused by data inserted afterwards. For example, if you run the following on table t1:

delete from t1 where id=5;
insert into t1 values(10,10);
select * from t1;

In the returned result you will see that the row with id=10 appears right after id=4, that is, in the position originally occupied by the row with id=5.

Note that the primary key index of table t1 is a hash index, so if you run a range query such as

select * from t1 where id<5;

it cannot use the primary key index and has to do a full table scan. You can take this opportunity to review the content of article 4. So what should you do if you want a memory table to support range scans?
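Before answering that, you can check with explain whether a statement is able to use the hash primary key. This is a sketch against table t1 as defined above, before any other index is added; the output columns vary by MySQL version, so none are reproduced here.

```sql
-- Equality lookup: a hash index can serve this, so the key column
-- is expected to show PRIMARY.
explain select * from t1 where id=5;

-- Range condition: the hash index cannot be used, so a full table scan
-- (type ALL) is expected.
explain select * from t1 where id<5;
```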

Third, hash indexes and B-Tree indexes

1, data organization of t1 with a B-Tree index added

In fact, memory tables also support B-Tree indexes. To create a B-Tree index on the id column, you can write the SQL statement like this:

alter table t1 add index a_btree_index using btree (id);

At this point, the data organization of table t1 becomes:

Figure 4: Data organization of table t1, with a B-Tree index added

This new B-Tree index should look familiar: it is organized much like InnoDB's B+ tree index.
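To confirm that table t1 now carries both kinds of index, you can list its indexes. This is a small sketch that assumes the alter table statement above has already been run.

```sql
-- The Index_type column is expected to show HASH for PRIMARY
-- and BTREE for a_btree_index.
show index from t1;
```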

2, comparing query results with the B-Tree index and the hash index

For comparison, look at the output of the following two statements:

Figure 5: Query results using the B-Tree index versus the hash index

As you can see, when select * from t1 where id<5 is executed, the optimizer chooses the B-Tree index, so the result is returned in order from 0 to 4. When force index is used to force the primary key index on id, the row with id=0 ends up at the very end of the result set.
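Since the figure is not reproduced in these notes, here is a sketch of the two statements being compared. The force index(primary) form is my reconstruction of the second statement from the description above, so treat it as an assumption rather than the exact text of the figure.

```sql
-- The optimizer picks a_btree_index, so rows come back in index order, 0 to 4.
select * from t1 where id<5;

-- Forcing the hash primary key falls back to scanning the data array,
-- so the row with id=0 comes back last.
select * from t1 force index(primary) where id<5;
```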

In fact, our general impression is that the advantage of memory tables is speed, and one reason is that the Memory engine supports hash indexes. Of course, the more important reason is that all data in a memory table is kept in memory, and memory reads and writes are always faster than disk.

However, next I want to explain why I do not recommend using memory tables in a production environment. There are two main reasons:

  • 1. The lock granularity problem;
  • 2. Data persistence problem.

Fourth, locks on memory tables

Let's start with the lock granularity problem of memory tables.

Memory tables do not support row locks, only table locks. So as soon as one update is running on a table, it blocks all other reads and writes on that table.

Note that the table lock here is different from the MDL lock we introduced earlier, although both are table-level locks. Next, I will use the following scenario to simulate the table-level lock of a memory table.

Figure 6: Table lock on a memory table, steps to reproduce

In this sequence, the update statement in session A takes 50 seconds to run, and while it is running the query in session B enters a lock wait state. The show processlist output in session C is as follows:

Figure 7: Table lock on a memory table, result

Compared with row locks, table locks do not support concurrent access well. So the lock granularity of memory tables means that their performance will not be very good when handling concurrent transactions.
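The reproduction steps in the figure are not included in these notes, so the following is a sketch of one way to set up the same situation. The exact statements are my assumption: sleep(50) is used only to make the update slow, and the three sessions must be opened as separate connections.

```sql
-- session A: hold the table lock on t1 for about 50 seconds.
-- sleep(50) returns 0 when it finishes, so column c of this row ends up as 0.
update t1 set c=sleep(50) where id=1;

-- session B: run while session A is still executing; it is expected to wait.
select * from t1 where id=2;

-- session C: inspect the running threads; session B should be shown
-- in a lock wait state until session A finishes.
show processlist;
```

If you want table t1 back in its original state afterwards, set c back to 1 for the row with id=1.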

Fifth, the data persistence problem

Next, let's look at the data persistence problem.

Keeping data in memory is the advantage of memory tables, but it is also a disadvantage, because when the database restarts, all memory tables are emptied.

You might say that if the database restarts abnormally and the memory tables are emptied, so be it, that is not really a problem. However, in a high-availability architecture, this characteristic of memory tables can practically be regarded as a bug. Why do I say that?

1, basic M-S architecture

Let's first look at the problems of using memory tables in an M-S (primary-standby) architecture.

Figure 8: Basic M-S architecture

Consider the following sequence:

1. The business accesses the primary database normally;
2. The standby database gets a hardware upgrade and is restarted, so the contents of memory table t1 on the standby are emptied;
3. After the standby restarts, a client sends an update statement that modifies a row in table t1, and the apply thread on the standby reports the error "cannot find the row to be updated".

This causes primary-standby replication to stop. And of course, if a primary-standby switchover happens at this point, the client will see that the data in table t1 is "lost".

In an architecture with a proxy like the one in Figure 8, it is generally assumed that the switchover logic is maintained by the database system itself. From the client's point of view, what happens is "the network connection drops, and after reconnecting, the data in the memory table is gone".

You might say that is still acceptable: after all, when a switchover happens the connection is dropped, so the business side can notice that something abnormal happened.
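The stopped replication in step 3 of the sequence above can be observed on the standby. This is a sketch for the mysql command-line client; newer MySQL versions also accept show replica status, and the error text depends on the version and the binlog format, so it is not quoted here.

```sql
-- Run on the standby. Slave_SQL_Running: No together with a non-empty
-- Last_SQL_Error indicates that the apply thread has stopped.
show slave status\G
```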

2, dual-M structure

However, the next consequence of this characteristic makes the behavior look even more "strange". MySQL knows that memory table data is lost after a restart. So, to avoid primary-standby inconsistency after the primary restarts, MySQL does the following in its implementation: after the database restarts, it writes a DELETE FROM t1 statement into the binlog.

If you use the dual-M structure shown in Figure 9, then:

Figure 9: Dual-M structure

When the standby restarts, the delete statement in the standby's binlog is propagated to the primary, which then deletes the contents of the memory table on the primary. So while using the system you will find that the memory table on the primary has suddenly been emptied.

3, I suggest replacing ordinary memory tables with InnoDB tables

Based on the analysis above, you can see that memory tables are not suitable for use as ordinary data tables in a production environment.

Some of you will say: but memory tables are fast. You can analyze that question this way:

1. If your table is updated heavily, concurrency is a very important metric; InnoDB supports row locks, so its concurrency is better than that of memory tables;

2. The amount of data that can be put in a memory table is not large. If what you care about is read performance, then for a table with high read QPS and a small amount of data, even with InnoDB the data will all be cached in the InnoDB Buffer Pool.

So the read performance of an InnoDB table is not bad either.
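If you want to see how well the Buffer Pool is serving reads on your own instance, the following status counters give a rough picture. This is a sketch: the counters are cumulative since server start and cover the whole instance, not a single table.

```sql
-- Logical read requests made to the buffer pool.
show global status like 'Innodb_buffer_pool_read_requests';

-- Reads that could not be served from the buffer pool and went to disk.
show global status like 'Innodb_buffer_pool_reads';

-- Configured size of the buffer pool, in bytes.
show variables like 'innodb_buffer_pool_size';
```

When the second counter is tiny compared with the first, almost all reads are already being answered from memory.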

So I suggest replacing all ordinary memory tables with InnoDB tables. However, there is one scenario that is an exception.

4, the scenario where memory tables fit

This scenario is the user temporary tables we discussed in articles 35 and 36. When the amount of data is under control and will not consume too much memory, you can consider using a memory table.

An in-memory temporary table happens to be unaffected by the two shortcomings of memory tables, mainly for the following three reasons:

1. A temporary table cannot be accessed by other threads, so there are no concurrency issues;
2. A temporary table has to be dropped after a restart anyway, so the problem of data being emptied does not exist;
3. A temporary table on the standby does not affect user threads on the primary.
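To keep the amount of data "under control" as described above, it helps to know the size limit that applies to memory tables you create yourself. This is a sketch; the 64MB value is an arbitrary figure chosen for illustration, not a recommendation.

```sql
-- Upper bound, in bytes, on the size of a user-created MEMORY table.
show variables like 'max_heap_table_size';

-- Assumed value for illustration: cap memory tables created by this
-- session at 64MB. Other sessions are not affected.
set session max_heap_table_size = 64 * 1024 * 1024;
```

The limit is applied per table, so it is worth setting before the temporary table is created.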

Now let's go back to the join optimization example from article 35. What I suggested then was to create an InnoDB temporary table, using this sequence of statements:

create temporary table temp_t(id int primary key, a int, b int, index(b))engine=innodb;
insert into temp_t select * from t2 where b>=1 and b<=2000;
select * from t1 join temp_t on (t1.b=temp_t.b);

5, the effect of using an in-memory temporary table

Now that you understand the characteristics of memory tables, you can see that an in-memory temporary table actually works better here, for three reasons:

1. Compared with an InnoDB table, a memory table does not need to write to disk, so writing data into table temp_t is faster;
2. Index b uses a hash index, whose lookups are faster than those of a B-Tree index;
3. The temporary table holds only 2000 rows, so it uses a limited amount of memory.

So you can rewrite the statement sequence from article 35, changing the temporary table temp_t into an in-memory temporary table with a hash index on field b.

create temporary table temp_t(id int primary key, a int, b int, index (b))engine=memory;
insert into temp_t select * from t2 where b>=1 and b<=2000;
select * from t1 join temp_t on (t1.b=temp_t.b);

Figure 10: Effect of using an in-memory temporary table

As you can see, both for importing the data and for executing the join, using the in-memory temporary table is somewhat faster than using the InnoDB temporary table.

Sixth, summary

In today's article, I started from the question "should you use memory tables?" and introduced several characteristics of the Memory engine.

As you have seen, because data is lost on restart, restarting a standby causes the primary-standby replication thread to stop; and if the primary and this standby form a dual-M architecture, it can also cause the memory table data on the primary to be deleted.

Therefore, I do not recommend using ordinary memory tables in production.

If you are a DBA, you can add a rule of this kind to the audit system for table creation, requiring the business to use InnoDB tables instead. As we analyzed in this article, InnoDB tables actually perform quite well, and data safety is also guaranteed. Memory tables, on the other hand, do not support row locks, so update statements block queries, and their performance is not necessarily as good as you might imagine.
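As a starting point for such an audit rule, existing memory tables can be listed from information_schema. This is a sketch: the excluded schema names are the usual system schemas and may need adjusting for your instance.

```sql
-- List user tables that currently use the Memory engine.
select table_schema, table_name
from information_schema.tables
where engine = 'MEMORY'
  and table_schema not in ('mysql', 'information_schema', 'performance_schema', 'sys');
```

A user temporary table created with create temporary table is private to its session, so the exception discussed earlier should not show up in this list.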

Based on the characteristics of memory tables, we also analyzed one scenario where they fit, namely in-memory temporary tables. Memory tables support hash indexes, and when this feature is put to use, it speeds up complex queries quite well.

Finally, I will leave you with a question.

Suppose you have just taken over a database and really do find a memory table on it. Restarting the standby will certainly empty the memory table on the standby, which in turn causes primary-standby replication to stop. In this situation, the best approach is to change the table to an InnoDB table.

But suppose the business scenario does not allow you to change the engine for the time being. What automation logic could you add to keep primary-standby replication from stopping?

You can write your thoughts and analysis in the comments section, and I will discuss this question with you at the end of the next article. Thank you for listening, and you are welcome to share this article with more friends.

Seventh, question time for the previous issue

The main text of today's article has already answered the question from the previous issue, so I will not repeat it here.

Kudos board from the comments section

@ Yang comrades, @ poppy and @ Long Jie gave the correct answer and kept up with the course during the Spring Festival. A thumbs up to all three of you.


Origin www.cnblogs.com/luoahong/p/11753406.html