MySQL gap locks and slow queries


Previous article: MySQL locks and transactions

Gap lock and slow query

Gap locks

In MySQL, the REPEATABLE READ isolation level already solves the phantom read problem, with the help of gap locks.

Experiment 1:

select @@tx_isolation;
create table t_lock_1 (a int primary key);
insert into t_lock_1 values(10),(11),(13),(20),(40);
begin;
select * from t_lock_1 where a <= 13 for update;

In another session, insert into t_lock_1 values(21) succeeds, while insert into t_lock_1 values(19) blocks.

Under the RR isolation level, InnoDB scans past the last matching value (13) to the next value (20) and places next-key locks over the whole range, so any insert below 20 is blocked.

Experiment 2:

create table t_lock_2 (a int primary key,b int, key (b)); 
insert into t_lock_2 values(1,1),(3,1),(5,3),(8,6),(10,8);

Session 1

begin;
select * from t_lock_2 where b=3 for update;

a: 1 3 5 8 10
b: 1 1 3 6 8
Session 2

select * from t_lock_2 where a = 5 lock in share mode; -- blocked: there is a record lock on a=5
insert into t_lock_2 values(4, 2); -- blocked: b=2 falls in (1, 3]
insert into t_lock_2 values(6, 5); -- blocked: b=5 falls in (3, 6)
insert into t_lock_2 values(2, 0); -- succeeds: (2, 0) is outside every locked range
insert into t_lock_2 values(6, 7); -- succeeds: (6, 7) is outside every locked range
insert into t_lock_2 values(9, 6); -- succeeds: the index entry (b=6, a=9) sorts after (6, 8), outside the gap
insert into t_lock_2 values(7, 6); -- blocked: the index entry (b=6, a=7) sorts before (6, 8), inside the gap

On the secondary index the locked ranges are (1, 3] and (3, 6); on the primary key index, only the record a=5 is locked.
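These claimed ranges can be checked directly. In MySQL 8.0 (a version assumption; this table does not exist in 5.7 and earlier), a third session can inspect the locks session 1 still holds via performance_schema:

```sql
-- Run while session 1's transaction is still open.
-- LOCK_MODE distinguishes next-key locks (X), pure gap locks (X,GAP)
-- and record-only locks (X,REC_NOT_GAP); LOCK_DATA is the locked index value.
SELECT index_name, lock_type, lock_mode, lock_data
FROM performance_schema.data_locks
WHERE object_name = 't_lock_2';
```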

Transaction syntax

Starting a transaction

1、begin
2、START TRANSACTION (recommended)
3、begin work

Transaction rollback

rollback

Transaction commit

commit
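Put together, a minimal transfer-style transaction looks like this (the account table and its rows are made up for illustration):

```sql
START TRANSACTION;
UPDATE account SET balance = balance - 100 WHERE id = 1;
UPDATE account SET balance = balance + 100 WHERE id = 2;
-- on an error we would issue ROLLBACK instead
COMMIT;
```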

Savepoints (demo)

show variables like '%autocommit%'; -- autocommit is on by default
set autocommit=0;
insert into testdemo values(5,5,5);
savepoint s1;
insert into testdemo values(6,6,6);
savepoint s2;
insert into testdemo values(7,7,7);
savepoint s3;
select * from testdemo;
rollback to savepoint s2; -- undoes the insert made after s2
rollback; -- undoes the whole transaction

Database design

Logical design

Normalized design

First normal form:
every field in a table holds a single, atomic attribute; each column uses a basic data type, and the table is a simple two-dimensional table. For example, a combined name-age column carries two attributes, a name and an age, which violates the first normal form; split it into two separate columns.
Second normal form:
on top of the first normal form, a table must have only one business primary key, and no non-key column may depend on only part of a composite primary key. Consider an order table and a product table: one order contains multiple products, so the natural key of an order line is the composite key (order ID, product ID). Columns that depend on only one component of that key violate the second normal form, so the design is split into an order table, a product table, and an order-product intermediate table.
Third normal form:
every non-key attribute must depend on the business primary key neither partially nor transitively; that is, on top of the second normal form, transitive dependencies of non-key columns on the primary key are removed. For example, if an order table stores both the customer number and the customer name, then order number → customer number → customer name is a transitive dependency: when a customer's number changes, the stored customer name must change with it. This violates the third normal form, so the customer name column should be removed from the order table.

Query test

Exercise 1: write SQL that returns each user's order total (user name, total order amount).
Exercise 2: write SQL that returns user and order details (order number, user name, mobile phone number, product name, product quantity, product price).
Problem: a large number of table joins significantly hurts query performance; a fully normalized design does not always yield good SQL query performance.
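As an illustration (all table and column names here are guesses, since the original schema figures are missing), the two exercises might be answered like this; note how many joins the second query needs, which is exactly the performance concern raised above:

```sql
-- Exercise 1: total order amount per user (hypothetical schema)
SELECT u.user_name, SUM(o.order_money) AS total_money
FROM t_user u
JOIN t_order o ON o.user_id = u.user_id
GROUP BY u.user_id, u.user_name;

-- Exercise 2: user and order details (hypothetical schema)
SELECT o.order_id, u.user_name, u.phone,
       p.product_name, d.product_cnt, d.product_price
FROM t_user u
JOIN t_order o        ON o.user_id = u.user_id
JOIN t_order_detail d ON d.order_id = o.order_id
JOIN t_product p      ON p.product_id = d.product_id;
```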

Denormalized design

What is denormalization

  • Denormalization is the opposite of normalization, whose rules were introduced in the previous section.
  • Denormalization deliberately violates the normal forms for the sake of performance and read efficiency.
  • A small amount of redundancy is allowed; in other words, denormalization trades space for time.

Summary

We should not design strictly according to the normal forms; we must also consider how the tables will actually be used.

Advantages and disadvantages of normalized design

Advantages:
data redundancy is minimized;
updates are faster than on denormalized tables;
normalized tables are usually smaller than denormalized ones.
Disadvantages:
queries need to join multiple tables;
index optimization is harder.

Advantages and disadvantages of denormalized design

Advantages:
fewer table joins are needed;
indexes can be optimized more effectively.
Disadvantages:
data becomes redundant and maintenance anomalies appear;
modifications cost more.

Physical design

Naming conventions

Database, table, and field names must be readable.
Use case to format object names for good readability, e.g. custAddress rather than custaddress.
Names must be meaningful: an object's name should describe what it represents. For a table, the name should reflect the data the table stores; for a stored procedure, the name should reflect its function.
Names should be written out in full: use as few abbreviations as possible, or none at all.

Storage engine selection
Data type selection

When a column can use several data types, prefer numeric types first, then date and time types, and finally character types. Among data types of the same family, prefer the one that occupies less space.

Floating-point types: float and double store approximate values, while decimal stores exact values.

Date types

Interviews often ask about the difference between timestamp and datetime:
in MySQL 5.5 a datetime field occupies 8 bytes, while from 5.6 on it occupies 5 bytes (plus fractional-second storage);
timestamp is time-zone dependent, while datetime is not.
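The time-zone point is easy to demonstrate: store the same wall-clock value in both column types, then switch the session time zone (a sketch; the t_time table is made up for this demo):

```sql
CREATE TABLE t_time (ts TIMESTAMP, dt DATETIME);
SET time_zone = '+00:00';
INSERT INTO t_time VALUES ('2021-01-01 12:00:00', '2021-01-01 12:00:00');
SET time_zone = '+08:00';
-- ts is converted into the new session time zone (now displays 20:00:00),
-- while dt comes back exactly as stored (still 12:00:00)
SELECT ts, dt FROM t_time;
```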

Slow query

What is slow query

The slow query log, as the name implies, is where MySQL records every SQL statement whose execution time exceeds the threshold set by the long_query_time parameter. This log is a great help when optimizing SQL statements. It is disabled by default; to use it, you must first enable it.

Slow query configuration

Slow query basic configuration

  • slow_query_log turns the slow query log on or off
  • slow_query_log_file specifies the path and file name of the slow query log (by default it sits next to the data files)
  • long_query_time specifies the execution-time threshold above which SQL is logged (unit: seconds, default 10)
  • log_queries_not_using_indexes controls whether SQL that uses no index is also logged
  • log_output specifies where the log is written: [TABLE], [FILE], or [FILE,TABLE]
    Once configured, every qualifying SQL statement is recorded, including:
  • query statements
  • data modification statements
  • SQL that has been rolled back
    The settings above can be checked and changed with the following commands:
 show VARIABLES like '%slow_query_log%';
 show VARIABLES like '%slow_query_log_file%';
 show VARIABLES like '%long_query_time%';
 show VARIABLES like '%log_queries_not_using_indexes%';
 show VARIABLES like 'log_output';
 set global long_query_time=0; -- default is 10 seconds; set to 0 here for demo purposes
 set GLOBAL slow_query_log = 1; -- enable the slow query log
 set global log_output='FILE,TABLE'; -- in real projects, log only to file, not to a table

After the settings are in place, run a few queries and you will find records in the slow query log file:
cat /usr/local/mysql/data/mysql-slow.log
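Because log_output above includes TABLE, the same records can also be read with plain SQL from the mysql.slow_log table:

```sql
SELECT start_time, query_time, lock_time,
       rows_sent, rows_examined, sql_text
FROM mysql.slow_log
ORDER BY query_time DESC
LIMIT 10;
```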

Reading the slow query log

An excerpt from the slow query log; a record consists of the following parts:
line 1: the user name, the user's IP, and the thread ID;
line 2: the execution time (unit: seconds);
line 3: the time spent waiting for locks;
line 4: the number of result rows returned;
line 5: the number of rows examined;
line 6: the timestamp at which the SQL ran;
line 7: the SQL statement itself.

Slow query analysis

The slow query log accumulates many records, and finding a particular one in it is not easy. Generally some tooling is needed to quickly locate the SQL statements worth optimizing. Two such helpers are introduced below.

mysqldumpslow

The most commonly used slow-log analysis tool. It groups SQL statements that are identical except for their literal values, and outputs the analysis results in the order given by its parameters.
Syntax:
mysqldumpslow -s r -t 10 slow-mysql.log
-s order (c,t,l,r,at,al,ar)
    c: query count
    t: total time
    l: lock time
    r: total rows returned
    at, al, ar: averages of t, l, r (e.g. at = total time / count)
-t N: output only the top N entries

mysqldumpslow -s t -t 10 /usr/local/mysql/data/mysql-slow.log


pt-query-digest

A tool for analyzing MySQL slow queries. Compared with mysqldumpslow, pt-query-digest produces a more specific and complete analysis. Sometimes, for reasons such as insufficient permissions, slow queries cannot be recorded on the server, a restriction we run into often. First install the tool's dependencies and run it:

yum -y install 'perl(Data::Dumper)'; 
yum -y install perl-Digest-MD5 
yum -y install perl-DBI
yum -y install perl-DBD-MySQL
perl ./pt-query-digest --explain h=127.0.0.1,u=root,p=root1234% /usr/local/mysql/data/mysql-slow.log

The summary shows [total query time], [total lock time], [total rows sent], [total rows examined], [query size].
Response: total response time.
time: this query's share of the total time in the analysis.
calls: the number of executions, i.e. how many queries of this type appear in the analysis.
R/Call: average response time per execution.
Item: the query object.

Extended reading:

Syntax and important options
pt-query-digest [OPTIONS] [FILES] [DSN]

  • --create-review-table: when --review writes the analysis results to a table, create the table automatically if it does not exist.
  • --create-history-table: when --history writes the analysis results to a table, create the table automatically if it does not exist.
  • --filter: match and filter the input slow queries against the given string before analysis.
  • --limit: limit the output to a percentage or a count. The default is 20, i.e. the 20 slowest statements are output. With 50%, statements are sorted by total response time in descending order and output stops where their running sum reaches 50%.
  • --host: MySQL server address  --user: MySQL user name
  • --password: MySQL user password  --history: save the analysis results to a table; the results are quite detailed. The next time --history is used, the same statement is recorded again if its time interval differs from the history table, so the historical evolution of a query type can be compared by querying the same CHECKSUM.
  • --review: save the analysis results to a table. This analysis only parameterizes the query conditions and keeps one record per query type, which is relatively simple. The next time --review is used, a statement that was already analyzed is not recorded again.
  • --output: the output format of the analysis; one of report (standard analysis report), slowlog (MySQL slow log), json, json-anon. report is generally used for readability.
  • --since: the point from which to start the analysis. The value is a string: either a time point in the format "yyyy-mm-dd [hh:mm:ss]", or a plain duration: s (seconds), h (hours), m (minutes), d (days). For example, 12h means statistics start from 12 hours ago.
  • --until: the end point; together with --since it restricts the analysis to a time window.

Analyze the output of pt-query-digest

Part 1: Overall Statistics

  • Overall: How many queries are there in total
  • Time range: query execution time range
  • unique: the number of unique queries, that is, how many different queries are there after parameterizing the query conditions
  • total: total min: minimum max: maximum avg: average
  • 95%: with all values sorted ascending, the value at the 95% position; this number is generally the most meaningful reference
  • median: the median; with all values sorted ascending, the value in the middle
# user time, system time, resident memory, and virtual memory consumed by the tool's own analysis run
# 340ms user time, 140ms system time, 23.99M rss, 203.11M vsz
# time at which the tool ran
# Current date: Fri Nov 25 02:37:18 2016
# host that ran the analysis
# Hostname: localhost.localdomain
# file that was analyzed
# Files: slow.log
# total statements, unique statements, QPS, concurrency
# Overall: 2 total, 2 unique, 0.01 QPS, 0.01x concurrency
# time range covered by the log
# Time range: 2016-11-22 06:06:18 to 06:11:40
# Attribute total min max avg 95% stddev median
# ============ ======= ======= ======= ======= ======= ======= =======
# statement execution time
# Exec time 3s 640ms 2s 1s 2s 999ms 1s
# lock hold time
# Lock time 1ms 0 1ms 723us 1ms 1ms 723us
# rows sent to the client
# Rows sent 5 1 4 2.50 4 2.12 2.50
# rows examined by the select statements
# Rows examine 186.17k 0 186.17k 93.09k 186.17k 131.64k 93.09k
# query length in characters
# Query size 455 15 440 227.50 440 300.52 227.50
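As a rough illustration of how the 95% figure is obtained, on MySQL 8.0 (window functions are assumed; older versions lack them) a 95th-percentile query time could be computed from the slow_log table like this:

```sql
SELECT query_time
FROM (
  SELECT query_time,
         PERCENT_RANK() OVER (ORDER BY query_time) AS pr
  FROM mysql.slow_log
) t
WHERE pr >= 0.95
ORDER BY query_time
LIMIT 1;
```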

Part 2: Query group statistics results

  • Rank: the rank of the statement among all statements, by default in descending order of query time; set with --order-by
  • Query ID: the ID of the statement (a hash computed after stripping extra whitespace and literal text)
  • Response: total response time
  • time: this query's share of the total time in the analysis
  • calls: the number of executions, i.e. how many query statements of this type appear in the analysis
  • R/Call: average response time per execution
  • V/M: variance-to-mean ratio of the response time
  • Item: the query object
# Profile # Rank Query ID Response time Calls R/Call V/M Item
# ==== ================== ============= ===== ====== ===== ===============
# 1 0xF9A57DD5A41825CA 2.0529 76.2% 1 2.0529 0.00 SELECT 
# 2 0x4194D8F83F4F9365 0.6401 23.8% 1 0.6401 0.00 SELECT wx_member_base

Part 3: detailed statistics per query. The table at the top of each entry lists the execution count, maximum, minimum, average, 95% and other statistics.

  • ID: the query's ID number, matching the Query ID above
  • Databases: database name  Users: execution count per user (with proportion)
  • Query_time distribution: the distribution of query times; the bar length reflects each interval's share. In this example, queries between 1s and 10s are more than twice as frequent as those over 10s.
  • Tables: the tables involved in the query
  • Explain: the SQL statement
# Query 1: 0 QPS, 0x concurrency, ID 0xF9A57DD5A41825CA at byte 802 
# This item is included in the report because it matches --limit. 
# Scores: V/M = 0.00 
# Time range: all events occurred at 2016-11-22 06:11:40 
# Attribute pct total min max avg 95% stddev median 
# ============ === ======= ======= ======= ======= ======= ======= ======= 
# Count 50 1 
# Exec time 76 2s 2s 2s 2s 2s 0 2s 
# Lock time 0 0 0 0 0 0 0 0 
# Rows sent 20 1 1 1 1 1 0 1 
# Rows examine 0 0 0 0 0 0 0 0 
# Query size 3 15 15 15 15 15 0 15 
# String: 
# Databases test 
# Hosts 192.168.8.1 
# Users mysql 
# Query_time distribution 
# 1us 
# 10us 
# 100us 
# 1ms 
# 10ms 
# 100ms 
# 1s ################################################################ 
# 10s+ 
# EXPLAIN /*!50100 PARTITIONS*/ 
select sleep(2)\G


Origin blog.csdn.net/weixin_42292697/article/details/114258031