count(*) in MySQL

  • Using different storage engines

In day-to-day MySQL development, the two most commonly used storage engines are MyISAM and InnoDB. One difference between them is how the count(*) function computes the number of rows in a table.

MyISAM stores the exact row count of each table, so when a count(*) query runs against a MyISAM table, the engine simply reads the stored count. If the table does not need features such as transactions, this is the best optimization available. InnoDB, by contrast, does not store an exact row count, so when the same query runs against an InnoDB table, InnoDB has to scan the whole table to count the rows.
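To see which case applies to a given table, the storage engine can be looked up in the table metadata. A minimal sketch, assuming it is run with the table's database selected; the table name qstardbcontent is borrowed from the example later in this article:

```sql
-- List the storage engine of one table in the current database.
-- DATABASE() returns the currently selected schema.
SELECT TABLE_NAME, ENGINE
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME = 'qstardbcontent';

-- The same information, together with other table metadata.
SHOW TABLE STATUS LIKE 'qstardbcontent';
```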

  • How count(*) is implemented

count(*) is executed differently by different storage engines. In MyISAM, count(*) reads the row count stored for the table and returns it directly. In InnoDB, count(*) first reads the table's data into the in-memory buffer, then scans the whole table to obtain the number of rows.

When count is combined with a WHERE condition, both storage engines behave the same way: they have to examine the rows and count those that match the condition (a full table scan, unless a suitable index exists). In other words, when counting the rows of a result set with a WHERE clause, or counting the values of a column, MyISAM's COUNT() is no different from that of any other storage engine, and its remarkable speed disappears.
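The difference between the two cases can be observed with EXPLAIN. A small sketch, assuming a throwaway MyISAM table; the table name count_demo_myisam and its columns are hypothetical and exist only for this illustration:

```sql
-- Hypothetical MyISAM table, used only for illustration.
CREATE TABLE count_demo_myisam (
  id INT NOT NULL PRIMARY KEY,
  status TINYINT NOT NULL
) ENGINE=MyISAM;

INSERT INTO count_demo_myisam (id, status) VALUES (1, 1), (2, 0), (3, 1);

-- No WHERE clause: MyISAM can answer from its stored row count.
-- The Extra column typically reports "Select tables optimized away".
EXPLAIN SELECT COUNT(*) FROM count_demo_myisam;

-- With a WHERE clause the stored count is of no use,
-- so the plan has to examine rows like any other engine.
EXPLAIN SELECT COUNT(*) FROM count_demo_myisam WHERE status = 1;
```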

  • Use approximations

Sometimes a completely accurate COUNT is not needed, and an approximation can be used instead. The number of rows the optimizer estimates in the EXPLAIN output is a good approximation. EXPLAIN is fast because it does not actually execute the query; the query optimizer only estimates the number of rows, so the cost is very low.
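As a sketch of the approximation approach, using the qstardbcontent table from the example later in this article. The second query is an additional option that the article does not discuss; it assumes an estimate taken from table statistics is acceptable:

```sql
-- The `rows` column of the output holds the optimizer's estimate;
-- the SELECT itself is not executed.
EXPLAIN SELECT * FROM qstardbcontent;

-- Assumption: a statistics-based estimate is good enough.
-- For InnoDB tables, TABLE_ROWS is an approximation, not an exact count.
SELECT TABLE_ROWS
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME = 'qstardbcontent';
```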

  • Distinguishing between the different forms of count

    count(primary key id)

      InnoDB traverses the entire table, takes out the id value of each row, and returns it to the server layer. The server layer receives the id, determines that it cannot be NULL, and adds the row to the count.

    count(1)

      InnoDB traverses the entire table but does not take out any values. For each row returned, the server layer puts in the number "1", determines that it cannot be NULL, and adds the row to the count.

    count(1) runs faster than count(primary key id), because returning the id involves parsing the data row and copying the field value out of the engine.

    count(field)

      If the field is defined as NOT NULL, it is read out of each record row by row; since it cannot be NULL, every row is added to the count.

      If the field is defined to allow NULL, then at execution time the value might be NULL, so it also has to be taken out and checked, and only rows whose value is not NULL are added to the count.

    But count(*) is an exception

      It does not take out all the fields. It is specifically optimized so that no values are fetched at all. count(*) can never be NULL, so the rows are simply accumulated.

    So the conclusion is: ordered by efficiency, count(field) < count(primary key id) < count(1) ≈ count(*). We recommend using count(*) or count(1) wherever possible.
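The forms differ not only in cost but also in what they count when NULL values are present. A minimal sketch, assuming a throwaway table; the name count_demo and its columns are hypothetical:

```sql
-- Hypothetical table with one nullable column.
CREATE TABLE count_demo (
  id INT NOT NULL PRIMARY KEY,
  note VARCHAR(20) NULL
) ENGINE=InnoDB;

INSERT INTO count_demo (id, note) VALUES (1, 'a'), (2, NULL), (3, 'c');

-- COUNT(*), COUNT(1) and COUNT(id) count every row;
-- COUNT(note) skips the row where note is NULL.
SELECT COUNT(*)    AS all_rows,
       COUNT(1)    AS ones,
       COUNT(id)   AS ids,
       COUNT(note) AS notes
FROM count_demo;
-- all_rows = 3, ones = 3, ids = 3, notes = 2
```

So count(field) on a nullable column is not a drop-in replacement for count(*): it answers a different question.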

 

The following example shows how count can be optimized:

1. First, create an InnoDB table that contains a large field (or a large number of fields):

 

CREATE TABLE `qstardbcontent` (
  `id` BIGINT(20) NOT NULL DEFAULT '0',
  `content` MEDIUMTEXT,
  `length` INT(11)  NOT NULL DEFAULT '0',
  PRIMARY KEY (`id`)
) ENGINE=INNODB DEFAULT CHARSET=utf8;

 

2. Insert 500,000 rows, each about 5 KB in size

 

3. Run select count(*) from qstardbcontent;

 

With nearly 500,000 rows of content, running count(*) took 13 minutes 28 seconds.

Now let's optimize it by adding an index on the length field. Run this SQL: ALTER TABLE qstardbcontent ADD KEY (LENGTH);

 

After the index has been built, run select count(*) from qstardbcontent; again.

 

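The whole counting query is now very fast and completes in just 354 milliseconds. The reason is that InnoDB can count rows by traversing the smallest available secondary index, and the index on length is far smaller than the clustered index, which carries the large content column.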
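Whether the count really runs on the new index can be checked with EXPLAIN. A short sketch; the exact output depends on the MySQL version and on table statistics, so no output is shown here:

```sql
-- List the indexes on the table, including the one just added.
SHOW INDEX FROM qstardbcontent;

-- If the secondary index is used, the `key` column names it
-- and the Extra column typically shows "Using index".
EXPLAIN SELECT COUNT(*) FROM qstardbcontent;
```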

 

Origin blog.csdn.net/hfaflanf/article/details/103702675