REVIEW
- In development we often need to count the rows in a table. In a trading system, for example, the boss may ask for a daily report, and those statistics depend on SQL's `count` function.
- But as the number of records grows, the query gets slower and slower. Why is that? How does MySQL actually handle it internally?
- Today's article looks at how MySQL handles the `count` function internally.
- This article first appeared on the WeChat public account [of code] ape Technology Column under the title "MySQL Performance Optimization: Why Is count(*) So Slow?". Thank you for your support.
How count is implemented
- Different storage engines in MySQL implement the `count` function differently.
- The `MyISAM` engine stores the total number of rows of a table on disk, so `count(*)` returns that number directly and is very fast (as long as the query has no `where` condition).
- The `InnoDB` engine does not store the total on disk. When it executes `count(*)` it has to read the data row by row and accumulate the total.
Why doesn't InnoDB store the total row count?
- Mention InnoDB and readers will think of its transaction support. Transactions are isolated from one another, so if the total were stored, how could it stay consistent across concurrent transactions?
- In the figure from the original post, transaction A and transaction B get different results from `count(*)`. The number of rows InnoDB should return is therefore not fixed across transactions, and the only way to determine the total is to read the rows one by one.
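The effect can be reproduced in a small sketch. It uses SQLite in WAL mode from Python's standard library as a stand-in, which is an assumption made for runnability: SQLite is not InnoDB, but it also gives an open read transaction a stable snapshot. The `user` table and the names `trx_a` and `trx_b` are illustrative.

```python
import os
import sqlite3
import tempfile


def snapshot_counts():
    """Return the counts two connections see around a concurrent insert."""
    path = os.path.join(tempfile.mkdtemp(), "demo.db")
    # isolation_level=None: BEGIN/COMMIT are issued explicitly below.
    trx_a = sqlite3.connect(path, isolation_level=None)
    trx_b = sqlite3.connect(path, isolation_level=None)
    trx_a.execute("PRAGMA journal_mode=WAL")
    trx_a.execute("CREATE TABLE user (id INTEGER PRIMARY KEY, name TEXT)")
    trx_a.execute("INSERT INTO user (name) VALUES ('x'), ('y')")

    count = "SELECT count(*) FROM user"
    trx_a.execute("BEGIN")
    a_before = trx_a.execute(count).fetchone()[0]  # A's snapshot starts here
    trx_b.execute("INSERT INTO user (name) VALUES ('z')")  # B commits at once
    a_during = trx_a.execute(count).fetchone()[0]  # A still reads its snapshot
    b_during = trx_b.execute(count).fetchone()[0]  # B sees its own insert
    trx_a.execute("COMMIT")
    a_after = trx_a.execute(count).fetchone()[0]   # fresh snapshot after commit
    trx_a.close()
    trx_b.close()
    return a_before, a_during, b_during, a_after
```

While transaction A is open, A and B disagree about the row count at the same moment, so no single stored total could be correct for both.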
How to improve the efficiency of count
- How can `count(*)` be made faster in `InnoDB`? Many solutions circulate online. Here we introduce three of them and analyze whether each is feasible.
show table status
- The `show table status` command quickly reports the number of rows in every table in the database, but can it really replace `count(*)`?
- The answer is no. The reason is simple: the value this command reports is an "estimate" and is therefore not accurate. The official documentation says the error can be around 40%-50%.
- So this method is ruled out: a count that inaccurate is of little use.
Keep the total in a cache system
- This is the approach that comes to mind most readily: add 1 when a row is inserted and subtract 1 when a row is deleted. Reading from a cache is fast, and the scheme is simple and convenient, so why not use it?
- The cache system and MySQL are two separate systems; `redis` and `MySQL` are a typical pairing. The hardest part of running two systems is that data consistency cannot be guaranteed under high concurrency.
- The two figures in the original post illustrate this: whether the `redis` count `+1` or the `insert into user` runs first, the data ends up logically inconsistent. In the first figure the `redis` count comes up short; in the second the count is correct, but the query does not return the newly inserted row.
- In a concurrent system we cannot precisely control when different threads execute. Because operation sequences like those in the figures can occur, the count is logically imprecise even when Redis is working normally.
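The two problematic orderings can be replayed deterministically. This is a minimal sketch with plain Python objects standing in for MySQL and Redis; the function `observe`, the step names, and the key `user_count` are hypothetical and only serve the illustration.

```python
def observe(steps):
    """Replay steps in order; return what the reader saw as (cached_count, visible_rows)."""
    rows = []                    # stands in for the MySQL `user` table
    cache = {"user_count": 0}    # stands in for the Redis counter
    seen = None
    for step in steps:
        if step == "insert":     # writer: insert into user
            rows.append("new user")
        elif step == "incr":     # writer: Redis count +1
            cache["user_count"] += 1
        elif step == "read":     # reader: fetch the count and the rows
            seen = (cache["user_count"], len(rows))
    return seen


# Writer inserts first; the reader runs before the counter is bumped.
insert_first = observe(["insert", "read", "incr"])
# Writer bumps the counter first; the reader runs before the row exists.
incr_first = observe(["incr", "read", "insert"])
```

In both orderings the reader lands between the writer's two steps and sees a count that does not match the rows, which is the inconsistency described above.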
Save the count in the database
- The analysis above showed that a cache system cannot keep the data logically consistent, so the next idea is to store the count directly in the database, where "transaction" support ensures consistency.
- How? Very simply: store the count in a table with the columns `(table_name,total)`.
- The logic is the same as with the cache system; the only change is that the `redis` count `+1` becomes `total` field `+1`. The original post illustrates this with a figure.
- Because both operations happen in the same transaction, the data stays logically consistent.
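A minimal sketch of this scheme, assuming SQLite from Python's standard library as a runnable stand-in for MySQL. The table name `row_count` and the helpers `setup`, `add_user`, and `totals` are hypothetical; only the `(table_name,total)` layout comes from the text.

```python
import sqlite3


def setup():
    # isolation_level=None: transactions are controlled explicitly below.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE user (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
    conn.execute(
        "CREATE TABLE row_count (table_name TEXT PRIMARY KEY, total INTEGER NOT NULL)"
    )
    conn.execute("INSERT INTO row_count VALUES ('user', 0)")
    return conn


def add_user(conn, name):
    """Insert a row and bump the stored total in one transaction."""
    conn.execute("BEGIN")
    try:
        conn.execute("INSERT INTO user (name) VALUES (?)", (name,))
        conn.execute(
            "UPDATE row_count SET total = total + 1 WHERE table_name = 'user'"
        )
        conn.execute("COMMIT")
    except sqlite3.Error:
        conn.execute("ROLLBACK")  # neither change survives a failure
        raise


def totals(conn):
    """Return (stored total, actual row count) for comparison."""
    stored = conn.execute(
        "SELECT total FROM row_count WHERE table_name = 'user'"
    ).fetchone()[0]
    actual = conn.execute("SELECT count(*) FROM user").fetchone()[0]
    return stored, actual
```

Because the insert and the update commit or roll back together, the stored total and the real row count cannot drift apart the way the cached count could.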
Different ways to use count
- `count()` is an aggregate function. It examines the returned result set row by row: if the argument of the count function is not NULL, the running total is increased by one; otherwise it is left unchanged. At the end the running total is returned.
- `count` can be used in several ways: `count(*)`, `count(field)`, `count(1)`, and `count(primary key id)`. What is the difference between them? The discussion below assumes "there is no `where` condition".
- `count(id)`: the InnoDB engine traverses the entire table, takes out the id value of each row, and returns it to the server layer. The server layer receives the id, determines that it cannot be NULL, and counts the row.
- `count(1)`: the InnoDB engine traverses the entire table but does not take out any values. For each row returned, the server layer puts in the number `1`, determines that it cannot be NULL, and counts the row.
- `count(field)`:
  - If the "field" is defined as `not null`, the field is read from each record row by row, determined to be non-NULL, and the row is counted.
  - If the field is defined to allow `null`, the server sees at execution time that the value might be NULL, so it has to take the value out and check it, and counts the row only when the value is not NULL.
- `count(*)`: does not take out all the fields. It is specifically optimized and reads no values. `count(*)` can never be NULL, so rows are simply counted.
- So the conclusion is simple: "Ranked by efficiency, `count(field)` < `count(primary key id)` < `count(1)` ≈ `count(*)`, so readers are advised to use `count(*)` whenever possible."
- "Note": someone will surely ask whether `count(id)` uses the index, and if so, why its efficiency is about the same as the others. The explanation is that although it uses the index, the rows still have to be scanned one by one to work out the total.
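The NULL rule behind these variants can be checked with a small sketch. It uses SQLite from Python's standard library as a stand-in for MySQL, and the `user` table with a nullable `nickname` column is hypothetical. It shows only what each form returns, not the efficiency ranking above, which is specific to InnoDB.

```python
import sqlite3


def count_variants():
    """Return (count(*), count(1), count(id), count(nickname)) for a tiny table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE user (id INTEGER PRIMARY KEY, nickname TEXT)")
    # Three rows; the second one has a NULL nickname.
    conn.executemany(
        "INSERT INTO user (nickname) VALUES (?)",
        [("ann",), (None,), ("bob",)],
    )
    row = conn.execute(
        "SELECT count(*), count(1), count(id), count(nickname) FROM user"
    ).fetchone()
    conn.close()
    return row
```

Only `count(nickname)` skips the row with the NULL value; the other three forms count every row.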
Summary
- `count(*)` on a `MyISAM` table is fast, but `MyISAM` does not support transactions.
- The `show table status` command returns quickly, but its result is not accurate.
- A direct `count(*)` on `InnoDB` traverses the whole table (when there is no `where` condition). It is accurate, but it can cause performance problems.
- Counting in a cache system is simple and efficient, but it cannot guarantee data consistency.
- Storing the count in the database is very simple and also guarantees data consistency, so it is the recommended approach.
- "A question for readers to discuss in the comments": in a highly concurrent system that stores the count in the database, should you update the count `+1` first, or insert the data first? That is, run `update total+=1` first, or `insert into` first?