MySQL query performance optimization (1)

1. The life cycle of a query

    The life cycle of a query can be roughly viewed as a sequence: the query travels from the client to the server, where it is parsed, an execution plan is generated, the plan is executed, and the result is returned to the client. Of these, "execution" is the most important stage of the whole life cycle. It includes a large number of calls to the storage engine to retrieve data, plus processing of the data after those calls, such as sorting and grouping.

    While completing these tasks, a query spends time in different places: network, CPU computation, generating statistics and execution plans, lock waits (mutex waits), and especially calls to the underlying storage engine to retrieve data. Those calls in turn spend time on memory operations, CPU operations, and, when memory is insufficient, I/O operations.

2. Optimize data access    

    1. Avoid requesting unnecessary data; query only the required fields

    Requesting redundant data places additional load on the MySQL server, increases network overhead, and consumes application server CPU and memory resources.

    For example, using "SELECT *" to fetch all columns prevents the optimizer from using optimizations such as covering index scans, and also adds I/O, memory, and CPU overhead on the server.
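    The covering-index effect can be sketched with a small experiment. This is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in for MySQL, and the `users` table, its columns, and the `idx_users_email` index are hypothetical. In MySQL, the comparable signal is `Using index` in the Extra column of EXPLAIN.

```python
import sqlite3

def query_plan(conn, sql, params=()):
    """Return the plan text SQLite reports for a query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each plan row is the human-readable detail.
    return " | ".join(row[-1] for row in rows)

# Hypothetical schema: the index contains email and name, but not bio.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT, bio TEXT)"
)
conn.execute("CREATE INDEX idx_users_email ON users (email, name)")

# Only indexed columns are requested, so the index alone can answer the query.
narrow = query_plan(
    conn, "SELECT email, name FROM users WHERE email = ?", ("a@example.com",)
)
# SELECT * also needs bio, so each match requires a lookup in the table itself.
wide = query_plan(conn, "SELECT * FROM users WHERE email = ?", ("a@example.com",))

print(narrow)
print(wide)
```

    The first plan should mention a covering index, while the second still uses the index but has to go back to the table for the remaining columns.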

    2. Is MySQL scanning additional records?

    MySQL measures query overhead by three metrics: response time, number of rows scanned, and number of rows returned.

    With a suitable index, looking up a single record may require scanning only about 10 rows; without an index, the same query may result in a full table scan. In addition, a WHERE condition can be used to filter out non-matching records, and when the condition is applied through an index, this filtering is done at the storage engine layer.
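    The difference between a full table scan and an index lookup can be checked from the query plan. The sketch below again assumes Python's sqlite3 module as a stand-in for MySQL; the `messages` table and the `idx_messages_user` index are hypothetical. In MySQL, the comparable check is EXPLAIN, where `type: ALL` indicates a full table scan and the `rows` column estimates how many rows will be examined.

```python
import sqlite3

def plan_detail(conn, sql, params=()):
    """Return the plan text SQLite reports for a query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE messages (id INTEGER PRIMARY KEY, user_id INTEGER, body TEXT)"
)
query = "SELECT body FROM messages WHERE user_id = ?"

# No index on user_id yet: every row has to be read and checked.
before = plan_detail(conn, query, (42,))

# With an index, only the matching entries are visited.
conn.execute("CREATE INDEX idx_messages_user ON messages (user_id)")
after = plan_detail(conn, query, (42,))

print(before)
print(after)
```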

3. Ways to restructure queries

    1. One complex query or multiple simple queries? By design, MySQL makes connecting and disconnecting lightweight and returns small query results efficiently. Some versions of MySQL can run more than 100,000 queries per second even on commodity server hardware, and more than 2,000 queries per second can easily be served even over a gigabit network. So running multiple small queries is no longer a big problem.

    2. Split the query:

    Sometimes a large query needs to be handled by "divide and conquer": split it into smaller queries that are functionally identical, each doing only a small part of the work and returning only a small part of the result. Deleting a large amount of data is a typical example:

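rows_affected = 0
do {
    // LIMIT caps each pass; without it the first pass would delete everything at once.
    // The batch size of 10000 is an illustrative choice.
    rows_affected = do_query(
        "DELETE FROM messages WHERE created < DATE_SUB(NOW(), INTERVAL 3 MONTH) LIMIT 10000"
    )
} while rows_affected > 0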

This approach minimizes the impact on MySQL: instead of putting one-time pressure on the server, the work is spread over a longer period.
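The batched delete above could look roughly like this in application code. This is a minimal sketch, assuming Python's sqlite3 module as a stand-in for MySQL; the function name `delete_in_batches`, its parameters, and the optional pause between batches are assumptions made for illustration. Standard SQLite builds do not accept `DELETE ... LIMIT`, so the sketch limits a subquery instead; against MySQL the `DELETE ... LIMIT` form can be used directly.

```python
import sqlite3
import time

def delete_in_batches(conn, cutoff, batch_size=10000, pause_seconds=0.0):
    """Delete messages created before `cutoff` in small batches.

    Returns the total number of rows deleted.
    """
    total = 0
    while True:
        # Each pass removes at most batch_size rows.
        cur = conn.execute(
            "DELETE FROM messages WHERE id IN ("
            "  SELECT id FROM messages WHERE created < ? LIMIT ?)",
            (cutoff, batch_size),
        )
        conn.commit()  # keep each transaction short
        if cur.rowcount == 0:
            break
        total += cur.rowcount
        if pause_seconds:
            time.sleep(pause_seconds)  # spread the load over time
    return total

# Small demo with a hypothetical messages table (dates stored as ISO text).
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE messages (id INTEGER PRIMARY KEY, created TEXT)")
demo.executemany(
    "INSERT INTO messages (created) VALUES (?)",
    [("2020-01-01",)] * 25 + [("2030-01-01",)] * 5,
)
print(delete_in_batches(demo, "2025-01-01", batch_size=10))
```

Committing after each batch keeps transactions short, which is the point of splitting the work in the first place.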

3. Decompose join queries

   Many high-performance applications decompose join queries. Put simply, a single-table query is run against each table, and the results are joined in the application.

    For example:

SELECT * FROM tag JOIN tag_post ON tag_post.tag_id=tag.id JOIN post ON tag_post.post_id=post.id WHERE tag.tag = 'mysql';

    The above statement can be broken down into the following queries instead:

    

SELECT * FROM tag WHERE tag = 'mysql';
SELECT * FROM tag_post WHERE tag_id = 1234;
SELECT * FROM post WHERE post.id in (123,456,567,9098,8904);

The benefits of this decomposition are: 

1. Caching becomes more efficient. For example, if the posts with IDs 123, 567, and 9098 are already cached, the application can leave those IDs out of the IN() list in the third query.

2. Once the query is decomposed, each individual query can reduce lock contention.

3. Using IN() instead of a join lets MySQL look up rows in ID order, which may be more efficient than the random lookups of a join.

4. It can reduce access to redundant records. When a join is done in the database, some of the data may be accessed repeatedly, whereas joining in the application means each record only needs to be fetched once. From this point of view, such restructuring can also reduce network and memory consumption.
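
An application-side join with a cache might be sketched as follows. This is only an illustration under stated assumptions: it uses Python's sqlite3 module as a stand-in for MySQL, a plain dict as the cache, and a hypothetical `post.title` column; the function name `posts_for_tag` is also made up for the example.

```python
import sqlite3

def posts_for_tag(conn, tag_name, cache):
    """Return (post_id, title) pairs for a tag using three single-table queries.

    `cache` maps post id to title; cached posts are not requested again.
    """
    tag_row = conn.execute(
        "SELECT id FROM tag WHERE tag = ?", (tag_name,)
    ).fetchone()
    if tag_row is None:
        return []

    post_ids = [
        row[0]
        for row in conn.execute(
            "SELECT post_id FROM tag_post WHERE tag_id = ? ORDER BY post_id",
            (tag_row[0],),
        )
    ]

    # Only IDs missing from the cache go into the IN() list.
    missing = [pid for pid in post_ids if pid not in cache]
    if missing:
        placeholders = ",".join("?" * len(missing))
        rows = conn.execute(
            f"SELECT id, title FROM post WHERE id IN ({placeholders})", missing
        )
        for pid, title in rows:
            cache[pid] = title

    # Join in the application: attach titles to the IDs found for the tag.
    return [(pid, cache[pid]) for pid in post_ids if pid in cache]

# Small demo with the hypothetical schema.
demo = sqlite3.connect(":memory:")
demo.executescript("""
CREATE TABLE tag (id INTEGER PRIMARY KEY, tag TEXT);
CREATE TABLE post (id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE tag_post (tag_id INTEGER, post_id INTEGER);
INSERT INTO tag VALUES (1234, 'mysql');
INSERT INTO post VALUES (123, 'Indexes'), (456, 'Joins');
INSERT INTO tag_post VALUES (1234, 123), (1234, 456);
""")
print(posts_for_tag(demo, "mysql", cache={}))
```

A real application would more likely use a shared cache such as memcached or Redis instead of a dict, but the flow stays the same: look up the IDs, skip what is cached, and fetch the rest with a single IN() query.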
