Background
While reading through some code today, I came across a fairly complicated SQL statement. I knew what result it was supposed to produce, so I tried writing my own SQL for the same task to see whether mine would be faster. It turned out the original author's SQL was more efficient, which opened up a whole new world for me.
Requirement
From the chat message record table, for each user who has sent messages to the current user, query the following: the id of the latest unread message, the content of the latest unread message, the receive time of the latest unread message, the number of unread messages, the sender's id, and the sender's name. The table is named msg_record
and the relevant fields are as follows:
Column | Description |
---|---|
id | message record id |
toId | message receiver id |
fromId | message sender id |
fromUsername | message sender name |
date | message sent date |
hasRead | whether the message has been read |
msgText | message content |
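For reference, here is one possible definition of msg_record consistent with the columns above. The column types and index names are assumptions on my part; the original article does not show the DDL:

```sql
-- Hypothetical DDL for msg_record; types and index names are guesses.
CREATE TABLE msg_record (
    id           VARCHAR(32) PRIMARY KEY,  -- message record id
    toId         VARCHAR(32) NOT NULL,     -- message receiver id
    fromId       VARCHAR(32) NOT NULL,     -- message sender id
    fromUsername VARCHAR(64),              -- message sender name
    date         DATETIME NOT NULL,        -- message sent date
    hasRead      TINYINT(1) DEFAULT 0,     -- whether the message has been read
    msgText      TEXT,                     -- message content
    KEY idx_toId (toId),
    KEY idx_fromId (fromId)
);
```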
The inefficient SQL
Requirement analysis:
The result must include "the id of the latest unread message, the content of the latest unread message, and the receive time of the latest unread message". The receive time can be obtained with the aggregate function max(), but the other two are text columns and can only be obtained through sorting. When MySQL selects a non-grouped column (with ONLY_FULL_GROUP_BY disabled), it returns a value from some row in the group, historically the first row encountered. Relying on this behavior, the idea is to sort first, then group, and let MySQL pick up the first row's values for us. Be aware, though, that this is not guaranteed: the value of a non-grouped column within a group is formally indeterminate, and newer MySQL versions may ignore the ORDER BY inside a derived table when materializing it.
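The sort-then-group idea can be sketched in plain Python. This is a hypothetical illustration with made-up rows, not the author's code:

```python
from operator import itemgetter

# Hypothetical unread messages: (id, fromId, date, msgText)
rows = [
    (1, "alice", "2023-01-01 09:00", "hi"),
    (2, "bob",   "2023-01-01 10:00", "hello"),
    (3, "alice", "2023-01-02 08:00", "are you there?"),
]

# Step 1: ORDER BY date DESC (ISO-style date strings sort correctly as text)
rows_sorted = sorted(rows, key=itemgetter(2), reverse=True)

# Step 2: GROUP BY fromId, keeping the first row seen per sender
# (i.e. the newest one) and counting unread messages per group.
latest, unread = {}, {}
for row in rows_sorted:
    sender = row[1]
    latest.setdefault(sender, row)        # first row per sender = latest
    unread[sender] = unread.get(sender, 0) + 1

print(latest["alice"])  # (3, 'alice', '2023-01-02 08:00', 'are you there?')
print(unread["alice"])  # 2
```

The setdefault call models "MySQL returns the first row of the group", but unlike this loop, MySQL does not actually guarantee that pick.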
SQL statement:
SELECT
a.id,
a.fromId,
a.fromUsername,
a.date,
a.msgText lastMsg,
count( a.id ) unReadCount
FROM
(
-- subquery
SELECT
id,
fromId,
fromUsername,
date,
msgText
FROM
msg_record
WHERE
fromId != "4ebd6f3485f140888ecc25c12e5105b1"
AND toId = "4ebd6f3485f140888ecc25c12e5105b1"
AND hasRead = FALSE
ORDER BY
date DESC
) a
GROUP BY
a.fromId
Execution analysis:
Just add the EXPLAIN keyword in front of the SQL above and look at the analysis results:
id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
---|---|---|---|---|---|---|---|---|---|
1 | PRIMARY | &lt;derived2&gt; | ALL | null | null | null | null | 7 | Using temporary; Using filesort |
2 | DERIVED | imq_server_messagerecord | ALL | IDX_IMQ_SERVER_MESSAGERECORD_TOID,IDX_IMQ_SERVER_MESSAGERECORD_FROMID | IDX_IMQ_SERVER_MESSAGERECORD_TOID | 403 | null | 117 | Using where; Using filesort |
The second row above shows that the subquery does a full scan, although it at least touches an index; the first row is a full scan of the derived table with no index at all. Both are full scans, so the efficiency is clearly not great.
Testing shows the execution time is about 0.024 seconds.
The efficient SQL
Requirement analysis:
Since the sort-first approach does not work well, we instead group first and then sort within each group to get the final result. This requires the SUBSTRING_INDEX, GROUP_CONCAT, and CONCAT functions. Here is what each does:
- SUBSTRING_INDEX: returns the substring before (or, with a negative count, after) the n-th occurrence of a delimiter
- GROUP_CONCAT: concatenates the values of a group into one string, optionally with an ORDER BY
- CONCAT: concatenates its arguments into one string
For a detailed analysis, see: "mysql fetches the latest record in each group (avoiding the pitfalls)"
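As a rough sketch of the semantics, here are Python stand-ins for the three functions, written for illustration only (not MySQL's actual implementations):

```python
def concat(*args):
    """CONCAT: joins all arguments into a single string."""
    return "".join(str(a) for a in args)

def group_concat(values, sep=","):
    """GROUP_CONCAT: joins a group's values with a separator (default ',')."""
    return sep.join(str(v) for v in values)

def substring_index(s, delim, count):
    """SUBSTRING_INDEX: everything before the count-th occurrence of delim
    (or after it, counted from the right, when count is negative)."""
    parts = s.split(delim)
    return delim.join(parts[:count]) if count > 0 else delim.join(parts[count:])

print(concat("42", "*splits*,"))          # 42*splits*,
print(group_concat(["a", "b", "c"]))      # a,b,c
print(substring_index("a,b,c", ",", 2))   # a,b
```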
SQL statement:
SELECT
SUBSTRING_INDEX( GROUP_CONCAT( CONCAT( id, '*splits*,' ) ORDER BY date DESC ), '*splits*,', 1 ) id,
fromId,
fromUsername,
max( date ) date,
SUBSTRING_INDEX( GROUP_CONCAT( CONCAT( msgText, '*splits*,' ) ORDER BY date DESC ), '*splits*,', 1 ) lastMsg,
count( id ) unReadCount
FROM
msg_record
WHERE
toId = "4ebd6f3485f140888ecc25c12e5105b1"
AND fromId != "4ebd6f3485f140888ecc25c12e5105b1"
AND hasRead = FALSE
GROUP BY
fromId
Execution analysis:
Just add the EXPLAIN keyword in front of the SQL above and look at the analysis results:
id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
---|---|---|---|---|---|---|---|---|---|
1 | SIMPLE | IMQ_SERVER_MESSAGERECORD | ref | IDX_IMQ_SERVER_MESSAGERECORD_TOID,IDX_IMQ_SERVER_MESSAGERECORD_FROMID | IDX_IMQ_SERVER_MESSAGERECORD_TOID | 403 | const | 51 | Using where; Using filesort |
As you can see, this query uses a ref access on an index with a constant reference, which is definitely faster.
Testing shows the execution time is about 0.015 seconds.
To explain how the functions work together: CONCAT() first appends a delimiter to each row's id (or msgText); GROUP_CONCAT() then concatenates these per group, ordered by date descending, into one big string; finally SUBSTRING_INDEX() cuts that string at the first delimiter, leaving exactly the latest value.
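The whole chain for one sender's group can be retraced in Python with made-up data, for illustration only:

```python
DELIM = "*splits*,"

# One sender's unread messages: (date, msgText), in arbitrary order
group = [
    ("2023-01-01 09:00", "hi"),
    ("2023-01-02 08:00", "are you there?"),
]

# CONCAT(msgText, '*splits*,') per row, then GROUP_CONCAT(... ORDER BY
# date DESC) joins them with its default ',' separator into one big string.
ordered = sorted(group, key=lambda r: r[0], reverse=True)
big_string = ",".join(msg + DELIM for _, msg in ordered)

# SUBSTRING_INDEX(big_string, '*splits*,', 1): keep everything before the
# first delimiter, which is exactly the latest message.
last_msg = big_string.split(DELIM)[0]

print(last_msg)  # are you there?
```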
However, this approach has a drawback: with large data volumes, GROUP_CONCAT silently truncates its result once it exceeds the group_concat_max_len limit (1024 bytes by default), which can corrupt the extracted values. In most cases, though, it works fine.
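One concrete mitigation worth noting (a standard MySQL server setting, not mentioned in the original article): the GROUP_CONCAT result limit can be raised per session if groups grow large.

```sql
-- Raise the GROUP_CONCAT result limit for the current session
-- (default is 1024 bytes; a truncated result would break SUBSTRING_INDEX).
SET SESSION group_concat_max_len = 1024 * 1024;
```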