SQL Performance Optimization Overview

This article briefly reviews MySQL's basic concepts, then expands on optimization in two phases: table creation and queries.

1 Introduction to Fundamental Concepts

1.1 Logical Architecture

 

  • First layer: the client connects through the connection/service layer and sends the SQL statements to be executed.

  • Second layer: the server parses and optimizes the SQL, generates the final execution plan, and executes it.

  • Third layer: the storage engine, responsible for storing and retrieving data.

1.2 Lock

Databases handle concurrency through locking: shared locks (read locks) and exclusive locks (write locks). Read locks do not block each other, so multiple clients can read the same resource at the same time. Write locks are exclusive and block other read and write locks. Briefly, on optimistic and pessimistic locking:

  • Optimistic locking is typically used when contention on the data is low (many reads, few writes); it is implemented with a version number or a timestamp.

  • Pessimistic locking is typically used when contention on the data is intense; every operation locks the data. A sketch of both approaches follows these bullets.
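As an illustration, here is a minimal sketch of both approaches (the table account, its columns, and the literal values are assumptions for this example, not from the article):

  -- Pessimistic locking: lock the row inside a transaction before changing it.
  START TRANSACTION;
  SELECT balance FROM account WHERE id = 1 FOR UPDATE;  -- exclusive row lock
  UPDATE account SET balance = balance - 100 WHERE id = 1;
  COMMIT;

  -- Optimistic locking: no lock is held; a version column detects conflicts.
  -- Read the row first (suppose it returned version = 7), then update conditionally:
  UPDATE account
  SET balance = balance - 100, version = version + 1
  WHERE id = 1 AND version = 7;
  -- If another session changed the row first, 0 rows are affected and the
  -- application retries or reports a conflict.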

Applying locks requires choosing a locking strategy (granularity):

  • Table locks lock the entire table. They have the lowest overhead, but lock contention is the highest.

  • Row locks lock individual rows. They have the highest overhead, but support the greatest degree of concurrency.

However, MySQL storage engines do not actually implement simple row-level locking; they usually implement multi-version concurrency control (MVCC). MVCC can be seen as a variant of row-level locking that avoids locking operations in most cases and therefore has lower overhead. MVCC works by saving a snapshot of the data as of a point in time.

1.3 Transaction

A transaction guarantees that a set of operations is atomic: either all of them succeed, or all of them fail. If any operation fails, all preceding operations are rolled back. MySQL autocommits by default: if a transaction is not started explicitly, each query is executed as its own transaction.

Isolation levels control which modifications are visible within a transaction and between transactions. There are four common isolation levels (a usage sketch follows the list):

  • Read Uncommitted: modifications made by a transaction are visible to other transactions even before they are committed. A transaction may read uncommitted data, producing dirty reads.

  • Read Committed: when a transaction begins, it can only see modifications made by transactions that have already committed; its own uncommitted changes are not visible to other transactions. This level is also described as non-repeatable read: reading the same record several times within one transaction may return different results.

  • Repeatable Read: reading the same record several times within the same transaction returns the same result.

  • Serializable: the highest isolation level; it forces transactions to execute serially.
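As a usage sketch (standard MySQL syntax; the choice of Repeatable Read and the table account are just examples):

  SET SESSION TRANSACTION ISOLATION LEVEL REPEATABLE READ;
  START TRANSACTION;
  SELECT * FROM account WHERE id = 1;  -- repeated reads in this transaction see the same snapshot
  -- ... more statements ...
  COMMIT;                              -- or ROLLBACK to undo everything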

1.4 Storage Engine

The InnoDB engine is the most important and most widely used storage engine. It is designed to handle a large number of short-lived transactions and offers high performance and automatic crash recovery.

The MyISAM engine does not support transactions or row-level locking, and data cannot be safely recovered after a crash.

2 Create-Time Optimization

2.1 Schema and Data Type Optimization

Integer

TinyInt, SmallInt, MediumInt, Int, and BigInt use 8, 16, 24, 32, and 64 bits of storage space respectively. If negative numbers are not needed, Unsigned roughly doubles the positive range.

Real

  • Float and Double support approximate floating-point arithmetic.

  • Decimal stores exact decimal values.

String

  • VarChar stores variable-length strings. It requires 1 or 2 extra bytes to record the string length.

  • Char is fixed-length, used for storing fixed-length strings such as MD5 values.

  • Blob and Text are designed for storing large amounts of data, as binary data and as character data respectively.

Time Type

  • DateTime stores a large range of values and uses 8 bytes.

  • TimeStamp stores the value as a UNIX timestamp, uses 4 bytes, and is recommended.

Optimization tips

  • Use the appropriate data type. For example, do not store times as strings, and store IP addresses as integers (see the sketch after this list).

  • Choose the smallest data type that fits; use TinyInt rather than Int where possible.

  • For identifier columns, integers are recommended; string types take more space and compare more slowly than integers.

  • Letting an ORM generate the schema automatically is not recommended: it generally pays little attention to data types, tends to use large VarChar columns, and often uses indexes irrationally.

  • Real-world schemas mix normalized and denormalized designs. Redundancy (denormalization) gives high query efficiency but low insert/update efficiency; normalization gives high insert/update efficiency but low query efficiency.

  • Create completely independent summary/cache tables, generating their data on a schedule, for operations that would otherwise take the user a long time. For summary operations with high accuracy requirements, combine historical results with the latest live results to get fast queries.

  • For data migrations and table upgrades, a shadow table can be used: build the new table, then swap it in by renaming the original table, preserving the historical data without affecting use of the new table.
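As a minimal sketch of the "store an IP address as an integer" tip (the table access_log is hypothetical), MySQL's built-in INET_ATON and INET_NTOA do the conversion:

  CREATE TABLE access_log (
    id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    ip INT UNSIGNED NOT NULL  -- 4 bytes instead of a VARCHAR(15)
  );

  INSERT INTO access_log (ip) VALUES (INET_ATON('192.168.1.10'));
  SELECT id, INET_NTOA(ip) AS ip FROM access_log;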

2.2 Index

An index contains the values of one or more columns. MySQL can only use the leftmost prefix of an index's columns efficiently. Advantages of indexes:

  • Reduce the amount of data a query has to scan

  • Avoid sorting and temporary tables

  • Turn random I/O into sequential I/O (sequential I/O is more efficient than random I/O)

B-Tree

This is the most commonly used index type. It uses a B-Tree data structure to store data (each leaf node contains a pointer to the next leaf node, which makes traversing the leaf nodes easy). B-Tree indexes work for full-key lookups, key-range lookups, and key-prefix lookups, and they support sorting.

Restrictions of B-Tree indexes (illustrated in the sketch after this list):

  • If the query does not start from the leftmost column of the index, the index cannot be used.

  • Columns in the index cannot be skipped. If a query uses the first and third columns of the index, only the first column of the index can be used.

  • If the query contains a range condition on a column, none of the index columns to its right can be used to optimize the query.
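A minimal sketch of these rules, using a hypothetical table with a composite index on (last_name, first_name, birth_date):

  CREATE TABLE person (
    id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    last_name  VARCHAR(50) NOT NULL,
    first_name VARCHAR(50) NOT NULL,
    birth_date DATE NOT NULL,
    KEY idx_name_dob (last_name, first_name, birth_date)
  );

  -- Can use the index: starts from the leftmost column.
  SELECT * FROM person WHERE last_name = 'Smith' AND first_name = 'Ann';

  -- Cannot use the index: skips the leftmost column.
  SELECT * FROM person WHERE first_name = 'Ann';

  -- Uses only the last_name and first_name parts of the index: the range
  -- condition on first_name stops birth_date from being used.
  SELECT * FROM person
  WHERE last_name = 'Smith' AND first_name LIKE 'A%' AND birth_date = '1990-01-01';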

Hash indexes

A hash index is effective only for queries that exactly match all columns of the index. The storage engine computes a hash code over the index columns, stores the hash codes in the index, and stores a pointer to each data row. (A creation sketch follows the restrictions list below.)

Restrictions of hash indexes:

  • They cannot be used for sorting

  • They do not support partial (prefix) matches

  • They only support equality comparisons such as = and IN(); they do not support range comparisons (<, >)
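Explicit hash indexes are supported by the Memory storage engine; a minimal sketch (the table and columns are hypothetical):

  CREATE TABLE session_cache (
    session_id CHAR(32) NOT NULL,
    user_id    INT UNSIGNED NOT NULL,
    KEY idx_session (session_id) USING HASH
  ) ENGINE = MEMORY;

  -- Effective: exact equality match on the whole indexed column.
  SELECT user_id FROM session_cache WHERE session_id = 'abc123';

  -- Not helped by the hash index: prefix match, range, or ORDER BY.
  SELECT user_id FROM session_cache WHERE session_id LIKE 'abc%';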

Optimization tips

  • Note the applicable scope and limitations of each type of index.

  • If an indexed column is part of an expression or is passed as a function argument, the index cannot be used.

  • For particularly long strings, a prefix index can be used; choose a prefix length whose selectivity is close to that of the full column (see the sketch after this list).

  • When using a multi-column index, conditions can be combined with AND and OR.

  • Duplicate indexes are unnecessary; for example (A, B) and (A) are redundant, because (A) is a prefix of (A, B).

  • Indexes are especially effective for query conditions and GROUP BY clauses.

  • Put range conditions last among the query conditions, so that the range does not stop the index columns to its right from being used.

  • Do not index strings that are too long, and indexed columns should not be NULL.
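A minimal sketch of the prefix-index tip (the table city and the prefix lengths are illustrative): compare the selectivity of a few prefix lengths with that of the full column, then index the shortest prefix that comes close.

  -- Selectivity = distinct values / total rows; closer to the full-column value is better.
  SELECT
    COUNT(DISTINCT LEFT(name, 5)) / COUNT(*) AS sel_5,
    COUNT(DISTINCT LEFT(name, 7)) / COUNT(*) AS sel_7,
    COUNT(DISTINCT name)          / COUNT(*) AS sel_full
  FROM city;

  -- Index only the first 7 characters of the column.
  ALTER TABLE city ADD INDEX idx_name_prefix (name(7));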


3 Query Optimization

3.1 Three Important Indicators of Query Quality

  • Response time (service time + queue time)

  • Rows scanned

  • Rows returned

3.2 Query Optimization Tips

  • Avoid querying columns that are not needed, for example using SELECT * to return all columns.

  • Avoid querying rows that are not needed.

  • Split up queries. Decompose a task that puts heavy pressure on the server over a longer period and execute it several times. For example, deleting ten thousand rows can be split into 10 executions, pausing after each one before continuing, so the server can release resources to other tasks in between (see the sketch after this list).

  • Decompose join queries. Break a multi-table join into several single-table queries. This reduces lock contention, and the single-table queries themselves tend to be efficient. Because connecting to and disconnecting from MySQL are lightweight operations, splitting one query into several does not hurt efficiency.

  • Note that COUNT(column) only counts non-NULL values, so use COUNT(*) to count the total number of rows.

  • GROUP BY on a high-cardinality identifying column groups efficiently, and the grouped result should not include columns other than the grouping columns.

  • Delay the join: first narrow the result set using the search conditions, then join back to fetch the remaining columns.

  • Optimize LIMIT pagination. Scan a covering index for the ids, then join back to the table on the index column to fetch the other columns. For example:

     

  SELECT
      s1.id,
      s1.name,
      s1.age
  FROM
      student s1
  INNER JOIN (
      SELECT
          id
      FROM
          student
      ORDER BY
          age
      LIMIT 50, 5
  ) AS s2 ON s1.id = s2.id;

 

  • UNION queries deduplicate results by default; if the business does not require it, use the more efficient UNION ALL.
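A minimal sketch of the "split up queries" tip above (the table message, the date, and the batch size of 10,000 are illustrative):

  DELETE FROM message
  WHERE created_at < '2019-01-01'
  LIMIT 10000;
  -- Repeat from application code or a scheduled job, sleeping briefly between
  -- executions, until fewer than 10,000 rows are affected.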


Additional notes

 

1. If the type of a field in the table structure and the type used in the query condition are inconsistent, MySQL automatically adds a conversion function, which turns the index column into a function argument and makes the index useless.

2. A LIKE query that omits the leading part of the string (i.e. whose pattern begins with %) cannot hit the index.

3. Two new features added in MySQL 5.7:

Generated columns: a column whose value is computed by the database from other columns.

  CREATE TABLE triangle (sidea DOUBLE, sideb DOUBLE, area DOUBLE AS (sidea * sideb / 2));
  INSERT INTO triangle (sidea, sideb) VALUES (3, 4);
  SELECT * FROM triangle;

 

  +-------+-------+------+
  | sidea | sideb | area |
  +-------+-------+------+
  |     3 |     4 |    6 |
  +-------+-------+------+

JSON data type support, with built-in JSON functions:

  CREATE TABLE json_test (name JSON);
  INSERT INTO json_test VALUES ('{"name1": "value1", "name2": "value2"}');
  -- Check whether the JSON document contains the key name1.
  SELECT * FROM json_test WHERE JSON_CONTAINS_PATH(name, 'one', '$.name1');

 

Pay attention to using EXPLAIN for performance analysis (a comparison sketch follows the column descriptions below).

EXPLAIN SELECT settleId FROM Settle WHERE settleId = "3679" 

 


 

  • select_type can take several values: simple (a simple SELECT with no UNION or subquery), primary (when there are subqueries, the outermost SELECT), union (the second or later SELECT in a UNION whose result does not depend on the outer query), dependent union (the second or later SELECT in a UNION whose result depends on the outer query).

  • type can take several values: system (the table has only one row, a system table; a special case of const), const (constant lookup, at most one matching row), ref (access through a non-unique, ordinary index), eq_ref (access through a unique index, typically in a join), all (full table scan), index (full scan of the index), range (range scan).

  • possible_keys: the indexes that might help this query.

  • key: the index actually chosen.

  • key_len: the length of the index used.

  • rows: the estimated number of rows scanned; larger is worse.

  • Extra can take several values: Using index (the data is retrieved from the index alone, which is faster than scanning the table), Using where (a WHERE clause is used to filter rows), Using filesort (an extra sort is performed, in memory or on disk), Using temporary (a temporary table is used for the query results).
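As a sketch tied to the points above (the table orders and its index on created_at are hypothetical), compare a query that can use the index with one that wraps the indexed column in a function:

  -- Expected to show type = range and the index name under key.
  EXPLAIN SELECT id FROM orders WHERE created_at >= '2019-10-01';

  -- The function on the indexed column defeats the index:
  -- expected to show type = ALL and key = NULL.
  EXPLAIN SELECT id FROM orders WHERE DATE(created_at) = '2019-10-01';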
