MySQL tuning, part one

This article covers the groundwork to do before tuning, in three parts: SQL performance monitoring, data type optimization, and using execution plans.



1. Performance monitoring

1. show profile

Profiling is disabled by default and can be switched on per session. It gives a rough but convenient view of a statement's execution time and resource consumption. Note that SHOW PROFILE is deprecated in recent MySQL versions in favor of the performance schema.
Set and use in the command line:

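SET profiling = 1;  -- enable profiling; from now on the server measures the elapsed time and state changes of every statement run in this session
SELECT * FROM some_table .....;  -- run the SQL statement to be profiled
SHOW PROFILES;  -- list the profiled statements with their execution times
SHOW PROFILE FOR QUERY 1;  -- the trailing 1 is the Query_ID reported by SHOW PROFILES; shows the time spent in each step
SHOW PROFILE ALL FOR QUERY 1;  -- ALL selects which information to display; ALL shows every resource category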

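Set and use in the query window:

SHOW VARIABLES LIKE 'profiling';  -- check whether profiling is switched on
SET profiling = ON;  -- switch profiling on
SELECT * FROM some_table .....;  -- run the SQL statement to be profiled
SHOW PROFILES;  -- list the profiled statements with their execution times
SHOW PROFILE ALL FOR QUERY 1;  -- the trailing 1 is the Query_ID reported by SHOW PROFILES; ALL selects which information to display and shows every resource category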

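The information types that can be given in place of ALL:

block io: number of block input and output operations
context switches: number of voluntary and involuntary context switches
cpu: user and system CPU time
ipc: number of messages sent and received
page faults: number of page faults
source: names of the functions in the source code, with their location
swaps: number of swaps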
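As a minimal sketch of narrowing the output, ALL can be replaced by one or more of the types listed above. The query id 1 is an assumption; use whichever Query_ID SHOW PROFILES reports in your session.

```sql
-- Show only CPU and block I/O figures for one profiled statement.
-- Query id 1 is assumed; take the real id from SHOW PROFILES.
SHOW PROFILE CPU, BLOCK IO FOR QUERY 1;

-- The same data is exposed as a table, so the steps of a statement
-- can be sorted by how long they took.
SELECT STATE, DURATION
FROM information_schema.PROFILING
WHERE QUERY_ID = 1
ORDER BY DURATION DESC;
```

Sorting by DURATION puts the most expensive step first, which is usually the first thing to look at.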

2. performance schema

The performance schema is an enhanced facility for monitoring MySQL's resource consumption and wait events while the server is running. It is more detailed than show profile, and the data collected in the performance_schema database can serve as the basis for a database monitoring product. In MySQL 5.7 it is enabled by default.
For details on features, configuration and usage, see: Detailed explanation of performance schema usage
Some practical queries:

-- The queries below assume performance_schema is the current database (USE performance_schema;)
-- Which kinds of SQL are executed most often
SELECT DIGEST_TEXT,COUNT_STAR,FIRST_SEEN,LAST_SEEN FROM events_statements_summary_by_digest ORDER BY COUNT_STAR DESC;
-- Which kinds of SQL have the highest average response time
SELECT DIGEST_TEXT,AVG_TIMER_WAIT FROM events_statements_summary_by_digest ORDER BY AVG_TIMER_WAIT DESC;
-- Which kinds of SQL sort the most rows
SELECT DIGEST_TEXT,SUM_SORT_ROWS FROM events_statements_summary_by_digest ORDER BY SUM_SORT_ROWS DESC;
-- Which kinds of SQL examine the most rows
SELECT DIGEST_TEXT,SUM_ROWS_EXAMINED FROM events_statements_summary_by_digest ORDER BY SUM_ROWS_EXAMINED DESC;
-- Which kinds of SQL create the most temporary tables
SELECT DIGEST_TEXT,SUM_CREATED_TMP_TABLES,SUM_CREATED_TMP_DISK_TABLES FROM events_statements_summary_by_digest ORDER BY SUM_CREATED_TMP_TABLES DESC;
-- Which kinds of SQL return the most rows
SELECT DIGEST_TEXT,SUM_ROWS_SENT FROM events_statements_summary_by_digest ORDER BY SUM_ROWS_SENT DESC;
-- Which table (data file) has the most physical IO
SELECT file_name,event_name,SUM_NUMBER_OF_BYTES_READ,SUM_NUMBER_OF_BYTES_WRITE FROM file_summary_by_instance ORDER BY SUM_NUMBER_OF_BYTES_READ + SUM_NUMBER_OF_BYTES_WRITE DESC;
-- Which table has the most logical IO
SELECT object_name,COUNT_READ,COUNT_WRITE,COUNT_FETCH,SUM_TIMER_WAIT FROM table_io_waits_summary_by_table ORDER BY sum_timer_wait DESC;
-- Which index is accessed the most
SELECT OBJECT_NAME,INDEX_NAME,COUNT_FETCH,COUNT_INSERT,COUNT_UPDATE,COUNT_DELETE FROM table_io_waits_summary_by_index_usage ORDER BY SUM_TIMER_WAIT DESC;
-- Which indexes have never been used
SELECT OBJECT_SCHEMA,OBJECT_NAME,INDEX_NAME FROM table_io_waits_summary_by_index_usage WHERE INDEX_NAME IS NOT NULL AND COUNT_STAR = 0 AND OBJECT_SCHEMA <> 'mysql' ORDER BY OBJECT_SCHEMA,OBJECT_NAME;
-- Which wait event consumes the most time
SELECT EVENT_NAME,COUNT_STAR,SUM_TIMER_WAIT,AVG_TIMER_WAIT FROM events_waits_summary_global_by_event_name WHERE event_name != 'idle' ORDER BY SUM_TIMER_WAIT DESC;
-- Analyze the execution of a single statement, including statement, stage and wait information
SELECT EVENT_ID,sql_text FROM events_statements_history WHERE sql_text LIKE '%count(*)%';
-- Time spent in each stage (set @nesting_event_id to the EVENT_ID found above)
SELECT event_id,EVENT_NAME,SOURCE,TIMER_END - TIMER_START FROM events_stages_history_long WHERE NESTING_EVENT_ID = @nesting_event_id;
-- Lock waits in each stage (set @nesting_event_id to the id of the enclosing event)
SELECT event_id,event_name,source,timer_wait,object_name,index_name,operation,nesting_event_id FROM events_waits_history_long WHERE nesting_event_id = @nesting_event_id;
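Two practical points are worth a sketch. The stage and wait history tables only fill up when the matching instruments and consumers are enabled, and depending on the version some of them may be off by default. The timer columns are also reported in picoseconds. The example below is a minimal illustration under those assumptions, not a recommended production configuration.

```sql
-- Enable stage instruments and the stage/wait history consumers.
-- The change applies to statements executed afterwards.
UPDATE performance_schema.setup_instruments
SET ENABLED = 'YES', TIMED = 'YES'
WHERE NAME LIKE 'stage/%';

UPDATE performance_schema.setup_consumers
SET ENABLED = 'YES'
WHERE NAME LIKE 'events_stages%' OR NAME LIKE 'events_waits%';

-- Timer columns are in picoseconds; dividing by 1e9 gives milliseconds.
SELECT DIGEST_TEXT,
       COUNT_STAR,
       ROUND(AVG_TIMER_WAIT / 1e9, 3) AS avg_ms
FROM performance_schema.events_statements_summary_by_digest
ORDER BY AVG_TIMER_WAIT DESC
LIMIT 10;
```

Enabling more instruments adds some overhead, so it is reasonable to switch them back off once the analysis is done.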

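3. show processlist

Lists the threads currently connected to the server. Since connections are now usually managed through pools, this command is less commonly used.

id: the session id
user: the user performing the operation
host: the host the operation comes from
db: the database being operated on
command: the thread's current status. Common values seen in the Command and State columns include:
  sleep: the thread is waiting for the client to send a new request
  query: the thread is executing a query or sending the result to the client
  locked: the thread is waiting for a table lock at the MySQL server layer
  analyzing and statistics: the thread is collecting storage engine statistics and generating the query execution plan
  copying to tmp table: the thread is executing a query and copying its result set into a temporary table
  sorting result: the thread is sorting the result set
  sending data: the thread may be passing data between several states, generating the result set, or returning data to the client
info: the full SQL statement
time: how long the current command has been running
state: the execution state of the command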
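The same columns can also be queried as a table, which makes filtering easier than reading the raw list. The sketch below assumes the information_schema.PROCESSLIST table is available (it is in MySQL 5.7) and uses an arbitrary 10-second threshold.

```sql
-- Full statement text instead of the truncated Info column.
SHOW FULL PROCESSLIST;

-- Non-idle connections that have been in their current state for
-- more than 10 seconds (the threshold is an arbitrary choice).
SELECT ID, USER, HOST, DB, COMMAND, TIME, STATE, INFO
FROM information_schema.PROCESSLIST
WHERE COMMAND <> 'Sleep'
  AND TIME > 10
ORDER BY TIME DESC;
```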

2. Basic optimization

1. Data type optimization:

 1> Use the smallest data type that can store the data correctly (smaller types take less disk, memory and CPU cache, and need fewer CPU cycles to process).
 2> Prefer simple data types: a. integer operations are cheaper than character operations, because character sets and collations make string comparison more complex than integer comparison; b. use MySQL's built-in date and time types instead of strings to store dates and times; c. store IP addresses as integers (detailed below).
 3> Avoid NULL columns where possible (columns that can be NULL make indexes, index statistics and value comparisons more complicated).
 4> Details of each data type:
 Integer types:
Use the smallest type that meets the requirement. TINYINT, SMALLINT, MEDIUMINT, INT and BIGINT use 8, 16, 24, 32 and 64 bits of storage respectively.

 Character and string types:
VARCHAR stores data according to the actual length of the content; declare the smallest length that meets the requirement.
Application scenarios: a. data whose length varies widely; b. strings that are rarely updated, because every update recalculates the length and may need additional storage space; c. multibyte characters.
VARCHAR(10) and VARCHAR(255) holding the same content take the same space on disk, but memory usage differs, because memory is allocated by the declared size. If values need at most 255 bytes, one extra byte stores the length; if they may need more than 255 bytes, two extra bytes are used. Before MySQL 5.6, changing the length locks the table.
CHAR is a fixed-length string with a maximum length of 255. Trailing spaces are removed automatically. Its retrieval and write efficiency is higher than VARCHAR's, trading space for time.
Application scenarios: a. data whose length varies little; b. short strings and strings that are updated frequently.
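The trailing-space behaviour is easy to see side by side. This is a small sketch with a hypothetical table str_demo, assuming the default SQL mode (PAD_CHAR_TO_FULL_LENGTH not enabled).

```sql
-- Hypothetical demo table: the same value stored as CHAR and VARCHAR.
CREATE TABLE str_demo (
    c CHAR(4),
    v VARCHAR(4)
);

INSERT INTO str_demo VALUES ('ab  ', 'ab  ');

-- Parentheses make the trailing spaces visible.
-- CHAR strips them on retrieval, VARCHAR keeps them:
-- char_value = '(ab)', varchar_value = '(ab  )'
SELECT CONCAT('(', c, ')') AS char_value,
       CONCAT('(', v, ')') AS varchar_value
FROM str_demo;
```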

 BLOB and TEXT types:
Both are string types designed to store large amounts of data, in binary and character form respectively. MySQL treats each value as an independent object.

 Date and time type:
Note: Do not store dates as strings. They take more space and lose the convenience of the date functions.
datetime: occupies 8 bytes (5 bytes plus fractional-seconds storage since MySQL 5.6.4); independent of the time zone, so the database's time zone setting has no effect on it; can store fractional seconds; covers a wide time range.
timestamp: occupies 4 bytes; the time range is 1970-01-01 to 2038-01-19; accurate to the second; stored as an integer; depends on the time zone set in the database; the column value can be updated automatically.
date: takes only 3 bytes, less than storing the date as a string, datetime or int; date and time functions can be used for calculations between dates; stores dates between 1000-01-01 and 9999-12-31.
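The time zone difference between datetime and timestamp can be checked with a short experiment. The table ts_demo is hypothetical and the two offsets are arbitrary; this is a sketch rather than a full test of every time zone setting.

```sql
-- Hypothetical demo table holding the same instant in both types.
CREATE TABLE ts_demo (
    dt DATETIME,
    ts TIMESTAMP
);

-- Insert while the session time zone is UTC.
SET time_zone = '+00:00';
INSERT INTO ts_demo VALUES ('2021-01-05 10:00:00', '2021-01-05 10:00:00');

-- Read back under a different session time zone.
SET time_zone = '+08:00';

-- dt is returned as stored:            2021-01-05 10:00:00
-- ts is converted to the session zone: 2021-01-05 18:00:00
SELECT dt, ts FROM ts_demo;
```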

 Enumeration type:
Application scenario: configuration or dictionary-style fields can use ENUM instead of the usual string type. MySQL internally stores the position of each value in the list as an integer, and keeps the "number-string" lookup table in the table's .frm file (before MySQL 8.0).
create table enum_type(type enum('invalid','in force','pre-effective') not null);
insert into enum_type(type) values('invalid'),('in force'),('pre-effective');
select type + 0 from enum_type;  -- returns the internal integer positions 1, 2, 3

 Special types:
As mentioned above, store an IPv4 address in an unsigned INT (an IPv4 address is essentially a 32-bit unsigned integer), and use the INET_ATON() and INET_NTOA() functions to convert between the two representations.
Example: INET_ATON('192.168.22.2') gives 3232241154, and INET_NTOA(3232241154) gives '192.168.22.2'.
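In a table this might look as follows. The table access_log and its columns are hypothetical. The column is declared UNSIGNED because addresses from 128.0.0.0 upward do not fit in a signed INT.

```sql
-- Hypothetical table storing client addresses as integers.
CREATE TABLE access_log (
    id BIGINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    ip INT UNSIGNED NOT NULL,
    KEY idx_ip (ip)
);

-- Convert the dotted form on the way in.
INSERT INTO access_log (ip) VALUES (INET_ATON('192.168.22.2'));

-- Convert back on the way out; a subnet becomes a simple integer range
-- that can use the index on ip.
SELECT id, INET_NTOA(ip) AS ip_text
FROM access_log
WHERE ip BETWEEN INET_ATON('192.168.22.0') AND INET_ATON('192.168.22.255');
```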

2. Normalization and denormalization

 Normalization and denormalization should be used in combination.
 Normalization:
Advantages: a. updates on normalized tables are usually faster than on denormalized ones; b. well-normalized data has little or no duplication; c. normalized data is relatively small, fits in memory more easily, and is faster to operate on.
Disadvantages: queries generally need many joins.
 Denormalization:
Advantages: a. all the data is in one table, which avoids joins; b. more effective indexes can be designed.
Disadvantages: the table contains more redundancy, and deleting data may lose some useful information in the table.

3. Primary key

 Primary keys fall into two types: surrogate primary keys (meaningless numeric sequences unrelated to the business) and natural primary keys (identifiers that are naturally unique among the attributes of the thing being modelled).
Surrogate primary keys are recommended: they are not coupled to the business, so they are easier to maintain; and a common key strategy for most tables, preferably all tables, reduces the amount of source code to write and lowers the system's total cost of ownership.
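A minimal sketch of the recommendation, using a hypothetical customer table: the surrogate key is the primary key, while the natural identifier stays unique through a separate index.

```sql
-- Hypothetical table: all names are illustrative only.
CREATE TABLE customer (
    id BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,  -- surrogate key, no business meaning
    id_card_no CHAR(18) NOT NULL,                -- natural identifier
    name VARCHAR(50) NOT NULL,
    PRIMARY KEY (id),
    UNIQUE KEY uk_id_card_no (id_card_no)        -- uniqueness is still enforced
);
```

If the format of the natural identifier changes later, only one column and one index are affected, and foreign keys that reference id stay untouched.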

4. Character set

  The same content can take very different amounts of space under different character sets; choosing a suitable character set keeps the data volume as small as possible and so reduces the number of IO operations.
a. If pure Latin characters can express all the content, choose latin1, which saves a lot of storage space.
b. If multiple languages do not need to be stored, avoid UTF8 and other Unicode character sets where possible, to save storage space.
c. MySQL lets you choose the data type, including its character set, per column. Using different settings for different tables and columns can greatly reduce the amount of data stored, which reduces the number of IO operations and raises the cache hit rate.
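As an illustration of per-column settings, the hypothetical table below keeps an ASCII-only code in latin1 and a free-text title in utf8mb4. Whether this split is worthwhile depends on the data; the sketch only shows the syntax.

```sql
-- Hypothetical table mixing character sets per column.
CREATE TABLE product (
    id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    sku VARCHAR(32) CHARACTER SET latin1 NOT NULL,     -- ASCII-only codes
    title VARCHAR(200) CHARACTER SET utf8mb4 NOT NULL  -- free text, any language
) DEFAULT CHARSET = utf8mb4;

-- Check which character set each column ended up with.
SELECT COLUMN_NAME, CHARACTER_SET_NAME, COLLATION_NAME
FROM information_schema.COLUMNS
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME = 'product';
```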

5. Storage engine

   Three engines are commonly used: InnoDB, MyISAM and Memory (memory-based).
InnoDB is generally used; the specific differences are covered later.
![Storage Engine Comparison](https://img-blog.csdnimg.cn/20210105105939241.png)

6. Data redundancy/split

a. For small standalone fields that are referenced frequently but can only be obtained by joining several large tables, some redundancy is appropriate: the same field is kept in multiple tables, trading space for time.
b. For large TEXT or very large VARCHAR fields that are rarely used in queries, split the field out into a separate table to reduce the space taken by the frequently used data. More rows then fit in each data block, which reduces the number of physical IOs and raises the cache hit rate in memory.
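Both ideas can be sketched with a hypothetical article schema: the author name is copied into the main table to avoid a join, and the large body is moved to its own table.

```sql
-- Hypothetical schema: all table and column names are illustrative.
CREATE TABLE article (
    id BIGINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    title VARCHAR(200) NOT NULL,
    author_id BIGINT UNSIGNED NOT NULL,
    author_name VARCHAR(50) NOT NULL,   -- redundant copy, avoids joining the author table
    created_at DATETIME NOT NULL,
    KEY idx_created_at (created_at)
);

-- The rarely read large field lives in a separate table.
CREATE TABLE article_body (
    article_id BIGINT UNSIGNED NOT NULL PRIMARY KEY,
    body MEDIUMTEXT NOT NULL
);

-- List pages touch only the compact table.
SELECT id, title, author_name
FROM article
ORDER BY created_at DESC
LIMIT 20;

-- The body is fetched only when a single article is opened.
SELECT body FROM article_body WHERE article_id = 1;
```

The redundant author_name has to be kept in sync when an author is renamed, which is the cost paid for the faster reads.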

3. Execution plan

Put EXPLAIN in front of a SQL statement to view its execution plan and see how MySQL will process the statement.
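A minimal sketch of the usage, with a hypothetical orders table. The actual plan depends on the MySQL version and on the data in the table.

```sql
-- Hypothetical table used only to have something to explain.
CREATE TABLE orders (
    id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    customer_id INT UNSIGNED NOT NULL,
    status VARCHAR(20) NOT NULL,
    created_at DATETIME NOT NULL,
    KEY idx_customer_status (customer_id, status)
);

-- Prefix the statement with EXPLAIN; the statement itself is not executed.
EXPLAIN
SELECT id, status
FROM orders
WHERE customer_id = 42;
```

Each row of the output describes one table access, and the columns of that output are explained below.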

1. Field details

ID:
The order in which the select clauses or table operations in the statement are executed.
a. If the ids are the same, execution goes from top to bottom; b. if the ids differ, the larger the id, the higher the priority and the earlier it is executed.

SELECT_TYPE:
Distinguishes the type of query. The meaning of each value is as follows:
a. simple: a simple query with no subquery or union;
b. primary: if the query contains any complex subquery, the outermost query is marked primary;
c. union: a second or later select that appears after union is marked union;
d. dependent union: like union, but the result of the union or union all depends on the outer query;
e. union result: the select that reads the result from the union's temporary table;
f. subquery: a subquery in the select list or the where clause;
g. dependent subquery: a subquery whose result depends on the outer query;
h. derived: a subquery that appears in the from clause.
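A UNION is the simplest way to see several of these values at once. The two tables are hypothetical, and the comments describe the output one would normally expect rather than a guaranteed result.

```sql
-- Hypothetical tables with the same shape.
CREATE TABLE staff_2020 (id INT NOT NULL PRIMARY KEY, name VARCHAR(50) NOT NULL);
CREATE TABLE staff_2021 (id INT NOT NULL PRIMARY KEY, name VARCHAR(50) NOT NULL);

-- Expected rows, in order:
--   id 1,    select_type PRIMARY,      table staff_2020
--   id 2,    select_type UNION,        table staff_2021
--   id NULL, select_type UNION RESULT, table <union1,2>
EXPLAIN
SELECT id, name FROM staff_2020
UNION
SELECT id, name FROM staff_2021;
```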

TABLE:
The table that the row accesses: a table name or alias, possibly a temporary table or a union result set.
a. A concrete table name means the data is read from an actual physical table; b. a name of the form <derivedN> means the derived table produced by the query with id N is used; c. a name of the form <unionN1,N2,...> means a union result, where N1, N2, ... are the ids of the queries taking part in the union.

TYPE:
The access type. From best to worst: system > const > eq_ref > ref > fulltext > ref_or_null > index_merge > unique_subquery > index_subquery > range > index > ALL.
Generally speaking, a query should reach at least the range level, preferably ref.
ALL: full table scan. If such a statement appears and the table is relatively large, it normally needs to be optimized.
index: full index scan. It occurs when a covering index provides all the data the query needs, or when the index is used for sorting.
range: the index is scanned only within a limited range, which avoids a full index scan. It applies to conditions using =, <>, >, >=, <, <=, IS NULL, BETWEEN, LIKE or IN().
index_subquery: an index is used to resolve the subquery, so the whole table is no longer scanned.
unique_subquery: similar to index_subquery, but a unique index is used.
index_merge: several indexes are combined while processing the query.
ref: a non-unique index is used to look up data.
eq_ref: a unique index is used to look up data.
const: the table has at most one matching row.
system: the table has only one row.
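The statements below sketch how a few of these access types typically arise on a hypothetical users table. The optimizer decides based on the actual data and statistics, so on an empty or very small table the reported type may differ from the comment.

```sql
-- Hypothetical table with a primary key, a unique index and a plain index.
CREATE TABLE users (
    id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    email VARCHAR(100) NOT NULL,
    age TINYINT UNSIGNED NOT NULL,
    nickname VARCHAR(50) NOT NULL,
    UNIQUE KEY uk_email (email),
    KEY idx_age (age)
);

-- Primary key compared with a constant: typically const.
EXPLAIN SELECT * FROM users WHERE id = 1;

-- Equality on a non-unique index: typically ref.
EXPLAIN SELECT * FROM users WHERE age = 30;

-- Bounded condition on an indexed column: typically range.
EXPLAIN SELECT * FROM users WHERE age BETWEEN 20 AND 30;

-- Only indexed columns are read, no filter: typically index.
EXPLAIN SELECT id FROM users;

-- Filter on a column without an index: ALL.
EXPLAIN SELECT * FROM users WHERE nickname = 'tom';
```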


POSSIBLE_KEYS:
The indexes that could be used for this table; they are not necessarily the ones actually used.

KEY:
The index actually used. NULL means no index is used.

KEY_LEN:
The number of bytes of the index that are used. From this value you can work out how many columns of a multi-column index the query actually uses.
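The usual way to read key_len is to add up the bytes of each index column that is used: the column's storage size, plus 2 bytes for the length prefix of a VARCHAR, plus 1 byte if the column can be NULL. The sketch below applies that arithmetic to a hypothetical table with utf8mb4 (up to 4 bytes per character). The numbers in the comments are what this calculation predicts, not captured output.

```sql
-- Hypothetical table with a three-column index.
CREATE TABLE key_len_demo (
    a INT NOT NULL,            -- 4 bytes
    b VARCHAR(20) NOT NULL,    -- 20 * 4 + 2 = 82 bytes
    c INT NULL,                -- 4 + 1 = 5 bytes
    KEY idx_abc (a, b, c)
) ENGINE = InnoDB DEFAULT CHARSET = utf8mb4;

-- Only a is used: predicted key_len 4.
EXPLAIN SELECT * FROM key_len_demo WHERE a = 1;

-- a and b are used: predicted key_len 4 + 82 = 86.
EXPLAIN SELECT * FROM key_len_demo WHERE a = 1 AND b = 'x';

-- All three columns are used: predicted key_len 86 + 5 = 91.
EXPLAIN SELECT * FROM key_len_demo WHERE a = 1 AND b = 'x' AND c = 2;
```

A key_len smaller than expected is a hint that the query uses only a prefix of the index.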

REF:
Shows which columns or constants are compared against the index named in KEY.

ROWS:
A rough estimate of the number of rows that must be read to find the required records. Provided the query still returns what is needed, the smaller this value the better.

Origin blog.csdn.net/weixin_49442658/article/details/112180149