SQL Server Execution Plans in Detail

Preface

This article has two purposes:

1. Understand T-SQL execution plans and the basic concepts behind them.

2. Be able to analyze an execution plan and find ideas or solutions for SQL performance optimization.

If you are not yet familiar with SQL query optimization or its basic concepts, the following posts are recommended: "SqlServer performance testing and optimization tools in detail", "Optimized SQL statement analysis", and "T-SQL query statement execution order".

About execution plans

1. What is an execution plan?

When you submit a SQL statement, the database query optimizer analyzes it and generates several candidate ways of executing the query that the database engine can understand. The optimizer then picks, from these candidate plans, the one it estimates will use the fewest resources (not necessarily the fastest one) and shows it to you. The plan can be displayed in XML format, in text format, or graphically.

2. Estimated execution plan vs. actual execution plan

Select a statement and choose one of the execution plan options: the estimated execution plan is displayed immediately, while the actual execution plan appears only after the SQL statement has been executed. The estimated plan is not guaranteed to equal the actual plan, but in most cases the two are the same. Changes in statistics, a recompilation of the plan, and similar events can make them differ.
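The same distinction can be reproduced without the graphical buttons. The following is a minimal sketch, assuming a hypothetical table dbo.t1 with id and age columns; GO is the batch separator used by SSMS and sqlcmd, and each SET SHOWPLAN statement has to sit in its own batch.

```sql
-- Estimated plan: the query is compiled but NOT executed.
-- dbo.t1 (id, age) is a hypothetical table used only for illustration.
SET SHOWPLAN_XML ON;
GO
SELECT id, age FROM dbo.t1 WHERE id = 1;
GO
SET SHOWPLAN_XML OFF;
GO

-- Actual plan: the query IS executed, and the returned plan
-- also carries runtime information such as actual row counts.
SET STATISTICS XML ON;
GO
SELECT id, age FROM dbo.t1 WHERE id = 1;
GO
SET STATISTICS XML OFF;
GO
```

Comparing the two XML documents for the same statement is one way to spot cases where the estimate and the real execution diverge.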

3. Why read execution plans?

An execution plan shows how your complex SQL is really executed: whether it runs the way you intended, whether it runs in the most efficient way, which of the available indexes it uses, how it sorts, how it combines data, whether it wastes resources unnecessarily, and so on. Official figures reportedly show that about 80% of T-SQL execution problems can be answered from the execution plan.

4. Analyzing the graphical execution plan

An execution plan can be displayed as text, XML, or graphics. This article focuses on the graphical plan. Execution plans have 78 operators available; this article analyzes only the common ones, which cover almost everything you meet in daily work. MSDN's illustrated introduction: https://msdn.microsoft.com/zh-cn/library/ms175913(v=sql.90).aspx

5. How to read an execution plan

Read a graphical execution plan from top to bottom and from right to left.

6. Clearing cached execution plans

dbcc freeproccache

dbcc flushprocindb(db_id)
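As a hedged sketch of how the two commands above might be used: 'MyDatabase' is a placeholder name, DBCC FLUSHPROCINDB is an undocumented command, and the last statement assumes SQL Server 2016 or later. Clearing the cache forces every subsequent query to be recompiled, so this is best kept to test systems.

```sql
-- Clear every cached plan on the whole instance.
DBCC FREEPROCCACHE;

-- Clear cached plans for a single database only.
-- 'MyDatabase' is a placeholder; DBCC FLUSHPROCINDB is undocumented.
DECLARE @dbid int = DB_ID(N'MyDatabase');
DBCC FLUSHPROCINDB(@dbid);

-- Documented alternative (assumes SQL Server 2016 or later):
-- clears the plan cache of the database you are currently connected to.
ALTER DATABASE SCOPED CONFIGURATION CLEAR PROCEDURE_CACHE;
```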

Understanding the graphical execution plan

1. Connecting lines (arrows)

1. The thicker the line, the more rows flow through it.

2. Actual Number of Rows: the number of rows actually processed.

3. Estimated Number of Rows: the estimated number of rows processed.

4. Estimated Row Size: the estimated size, in bytes, of a row produced by the operator.

5. Estimated Data Size: the estimated total size of the data passed along the line.

2. Tooltips: information about the current step

Note: the tooltip tells us which object the step operates on, what operation is performed, how the data is looked up, which index is used, whether sorting takes place, the estimated CPU and I/O cost, the estimated and actual number of rows, and other details. For the full list of fields, see MSDN: https://msdn.microsoft.com/zh-cn/library/ms178071(v=sql.90).aspx
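Much of the tooltip information can also be obtained in tabular form. The sketch below assumes a hypothetical table dbo.t1 with id and age columns; SET STATISTICS PROFILE executes the statement and then returns one row per plan operator.

```sql
-- dbo.t1 (id, age) is a hypothetical table used only for illustration.
SET STATISTICS PROFILE ON;
GO
SELECT age, COUNT(*) AS cnt
FROM dbo.t1
GROUP BY age;
GO
SET STATISTICS PROFILE OFF;
GO
-- The extra result set includes, per operator, columns such as
-- Rows (actual rows), EstimateRows, EstimateIO, EstimateCPU,
-- AvgRowSize and TotalSubtreeCost.
```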

3. Table Scan

This operator appears when the table has no clustered index and there is no suitable nonclustered index either. It is expensive: its presence means the engine has to traverse the entire table to find the data you need.
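A minimal sketch of a situation that is expected to produce a Table Scan. The table dbo.HeapDemo and the index name are hypothetical, and the plan the optimizer actually picks depends on data volume and statistics.

```sql
-- Hypothetical heap: no clustered index and no nonclustered index.
CREATE TABLE dbo.HeapDemo (
    id   int         NOT NULL,
    name varchar(50) NOT NULL,
    age  int         NOT NULL
);
GO

-- No index can help with the filter on age,
-- so the plan is expected to show a Table Scan.
SELECT id, name FROM dbo.HeapDemo WHERE age = 30;
GO

-- One possible remedy: index the filtered column so the optimizer
-- has an alternative to reading the whole table.
CREATE NONCLUSTERED INDEX IX_HeapDemo_age ON dbo.HeapDemo (age);
GO
```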

4. Clustered Index Scan, Index Scan (nonclustered index scan)

Two closely related operations are covered here: the clustered index scan and the nonclustered index scan.

Clustered Index Scan: a clustered index holds the actual table data. It has as many rows and as many columns as the table itself. A clustered index scan is therefore about the same as a table scan: it also traverses all the table data to find the rows you want.

Nonclustered Index Scan: the size of a nonclustered index depends on how you created it; it can contain only the columns you want to query. A nonclustered index scan traverses all rows of the columns contained in the nonclustered index to find the data you want.

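5. Key Lookup

First, a word about lookups (seeks). Seeks and scans are in completely different performance classes: a scan has to traverse the whole table, whereas a seek fetches the data directly by key value and returns the result, so it performs much better.

When the columns you query are not all contained in the nonclustered index, a key lookup is used to fetch the missing columns from the clustered index.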
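The following sketch illustrates when a Key Lookup may appear and one common way to remove it. The table dbo.Person and the index names are hypothetical, and on a small table the optimizer may simply scan the clustered index instead.

```sql
-- Hypothetical table with a clustered primary key.
CREATE TABLE dbo.Person (
    id   int         NOT NULL PRIMARY KEY CLUSTERED,
    name varchar(50) NOT NULL,
    age  int         NOT NULL,
    city varchar(50) NOT NULL
);
CREATE NONCLUSTERED INDEX IX_Person_age ON dbo.Person (age);
GO

-- city is not stored in IX_Person_age. If the optimizer uses that index,
-- it needs a Key Lookup into the clustered index to fetch city.
SELECT age, city FROM dbo.Person WHERE age = 30;
GO

-- A covering index stores city at the leaf level,
-- so the same query can be answered from the index alone.
CREATE NONCLUSTERED INDEX IX_Person_age_city
    ON dbo.Person (age) INCLUDE (city);
GO
```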

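6. RID Lookup

Similar to a key lookup. The difference is that the columns to be fetched are not all contained in the nonclustered index and the table holding the remaining columns has no clustered index, so a key lookup is not possible and the data has to be fetched by row identifier (RID).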

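7. Clustered Index Seek, Index Seek (nonclustered index seek)

The clustered index seek and the nonclustered index seek share a similar icon.

Clustered Index Seek: the clustered index contains the data of the whole table, so the data is fetched by key value from the clustered index.

Nonclustered Index Seek: a nonclustered index contains the data of the columns it was created with, and the data is fetched by key value from that index.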

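8. Hash Match

This operator is used in two places: table joins and data aggregation.

Before describing these two operations, a word about hashing (the technique) and the hash table (the data structure).

Hashing: a hash function converts the key values of each row into a compact hash value, which decides where the row is placed in a temporary hash table. Unlike encryption, a hash value cannot be converted back into the original data, and different inputs can produce the same value, so the original row is stored alongside it. The benefit is that matching rows can be found very quickly.

Hash Table: a data structure that stores the hashed data as key/value pairs. SQL Server builds it in memory and spills it to tempdb when the memory granted to the query is not enough.

Next, how Hash Match performs table joins and row aggregation.

Table joins:

When two data sets are joined, Hash Match hashes the smaller set into a hash table, then walks through the larger set row by row and probes the hash table for matching rows.

Aggregation: when a query needs Count/Sum/Avg/Max/Min, the data may first be placed in an in-memory hash table and then aggregated.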

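9. Nested Loops

This operator combines two data sets with different columns into a single result set. The Output List in the tooltip shows two data sets: the lower one (the inner set) is scanned once for every row of the upper one (the outer set), and the operation is complete only when all rows have been processed.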

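10. Merge Join

This join algorithm merges two sets that are already sorted. If the two sets are not sorted, they are sorted first and then merged row by row. Because both sets are sorted, merging the left and right sets from top to bottom is very fast.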

11. Sort

Sorts a data set. Note that some data sets already come out sorted after an index scan.

12. Filter

Filters rows according to the condition that follows HAVING.

13. Compute Scalar

This operator appears when the select list contains computed columns, for example count(*) as cnt or select name+''+age.
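A small sketch of a query whose plan is expected to contain a Compute Scalar. The table dbo.PersonDemo is hypothetical. Note that if age is an integer column, name + '' + age would try to convert name to int because of data type precedence, so the sketch casts age explicitly.

```sql
-- Hypothetical table used only for illustration.
CREATE TABLE dbo.PersonDemo (
    id   int         NOT NULL PRIMARY KEY,
    name varchar(50) NOT NULL,
    age  int         NOT NULL
);
GO

-- The concatenated column is computed per row,
-- which typically shows up as a Compute Scalar operator.
SELECT name + ' ' + CAST(age AS varchar(10)) AS name_age
FROM dbo.PersonDemo;
GO
```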

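Optimizations suggested by execution plan details

Many suggestions could be given here. I will not go through them one by one; the following are a few examples. To become an optimization expert, you will need to work out the rest through your own understanding.

1. For select * queries, a clustered index is usually a better choice than a nonclustered index.

2. If Nested Loops appears, check whether a clustered index is needed and whether the nonclustered index can include all the required columns.

3. A Hash Match join works best when the set that has to be hashed is small.

4. For a Merge Join, check whether the input sets are already sorted. If they are not, see whether an index can solve this.

5. When a table scan, clustered index scan, or nonclustered index scan appears, consider whether a where restriction can be added and whether select * can be replaced with only the columns you need.

6. When a RID Lookup appears, consider whether adding an index can remove it.

7. When the plan uses an index other than the one you want, see whether forcing the index in the statement solves the problem. Syntax for forcing an index: Select CluName1,CluName2 from TableName with(index=IndexName).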

8. When the join algorithm is not the one you want, try forcing the algorithm you want. Statement for forcing a join algorithm: select * from t1 left join t2 on t1.id = t2.id option (hash join), or option (loop join), or option (merge join).

9. When the aggregation algorithm is not the one you want, try forcing the one you want. Statement for forcing an aggregation algorithm: select age, count(age) as cnt from t1 group by age option (order group), or option (hash group).

10. When the join order chosen by the optimizer is not the one you want, or working out that order costs too much, try forcing the order given in the statement: option (force order)

11. When parallel execution across multiple threads hurts the performance of your SQL statement, try disabling parallelism: option (maxdop 1)

12. In stored procedures, different parameter values can lead to different execution plans and affect performance. Try optimizing for a specified parameter value: option (optimize for (@name = 'zlh'))

13. Do not touch unnecessary columns or unnecessary rows, and avoid unnecessary aggregation and sorting.
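The hints from items 7 to 12 can be sketched together as follows. The tables dbo.t1 and dbo.t2, their columns, the index IX_t1_age, and the procedure dbo.GetByName are assumptions made for illustration. Hints override the optimizer, so compare the plan with and without each hint before keeping it.

```sql
-- Hypothetical schema used only for illustration.
CREATE TABLE dbo.t1 (id int NOT NULL PRIMARY KEY, name varchar(50) NOT NULL, age int NOT NULL);
CREATE TABLE dbo.t2 (id int NOT NULL PRIMARY KEY, note varchar(100) NULL);
CREATE NONCLUSTERED INDEX IX_t1_age ON dbo.t1 (age);
GO

-- Item 7: force a specific index with a table hint.
SELECT id, age FROM dbo.t1 WITH (INDEX(IX_t1_age)) WHERE age > 18;

-- Item 8: force the join algorithm (HASH JOIN, LOOP JOIN or MERGE JOIN).
SELECT a.id, b.note
FROM dbo.t1 AS a
LEFT JOIN dbo.t2 AS b ON a.id = b.id
OPTION (HASH JOIN);

-- Item 9: force the aggregation algorithm (ORDER GROUP or HASH GROUP).
SELECT age, COUNT(age) AS cnt
FROM dbo.t1
GROUP BY age
OPTION (HASH GROUP);

-- Item 10: keep the join order exactly as written in the query.
SELECT a.id, b.note
FROM dbo.t1 AS a
INNER JOIN dbo.t2 AS b ON a.id = b.id
OPTION (FORCE ORDER);

-- Item 11: run the statement on a single thread.
SELECT age, COUNT(*) AS cnt
FROM dbo.t1
GROUP BY age
OPTION (MAXDOP 1);
GO

-- Item 12: compile the plan for a representative parameter value.
CREATE PROCEDURE dbo.GetByName
    @name varchar(50)
AS
BEGIN
    SELECT id, age
    FROM dbo.t1
    WHERE name = @name
    OPTION (OPTIMIZE FOR (@name = 'zlh'));
END;
GO
```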

 

Source: https://www.cnblogs.com/knowledgesea/p/5005163.html


Origin www.cnblogs.com/Tony100/p/10967357.html