Optimization and analysis of SQL statements

Straight to the point: where do the problems lie?

Original address: http://www.cnblogs.com/knowledgesea/p/3686105.html

When the performance of a SQL statement cannot meet your requirements and its execution efficiency is unbearable, the cause is usually one of the following.

  • The network is slow or unstable.
  • The server does not have enough memory, or not enough memory is allocated to SQL Server.
  • The SQL statement is poorly designed.
  • There is no suitable index, or the existing indexes are unreasonable.
  • There is no valid indexed view.
  • The table holds too much data and has no effective partitioning design.
  • The database design is poor, with a lot of data redundancy.
  • Statistics on the indexed columns are missing or out of date.
  • ....

So how do we find out what is causing the slow performance?

  • First, confirm that the problem really is related to the SQL statement. Rule out the obvious causes: the machine being down, server hardware that is simply too weak, or the network being unavailable.
  • Then use the SQL performance detection tool mentioned in my last article, SQL Server Profiler, to find the problem statements: those whose execution time is too long, that occupy too many system resources, or that consume too much CPU.
  • This article then covers SQL optimization methods and techniques: avoiding unreasonable SQL statements and choosing the currently optimal form of a statement.
  • Next, check whether reasonable statistics are in use. SQL Server can automatically collect information about the data distribution in a table, but statistics should also be updated regularly as the data changes.
  • Confirm that the table uses reasonable indexes. Indexes were mentioned in my earlier blog, and I will follow up with a dedicated article on them.
  • If a table holds too much data, partition it to narrow the search range.
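On the statistics point above, statistics can also be refreshed manually; a minimal sketch (the table name dbo.Product comes from the examples later in this article):

```sql
-- Refresh statistics for one table, scanning all of its rows.
UPDATE STATISTICS dbo.Product WITH FULLSCAN;

-- Or refresh out-of-date statistics for every table in the current database.
EXEC sp_updatestats;
```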

Analyzing and comparing execution time, execution plans, and reads

select * from dbo.Product

Executing the statement above normally only gives you the result set and the number of rows returned. So how do you analyze it, and how do you tell the difference between the optimized and the unoptimized version?

Here are a few methods for you.

1. View execution time and CPU usage time

set statistics time on
select * from dbo.Product
set statistics time off

After you run the query, you can see the timing information on the Messages tab.

2. Check the query's I/O activity

set statistics io on
select * from dbo.Product
set statistics io off

After execution, the Messages tab reports the following counters.

 

Scan count: the number of index or table scans performed.

Logical reads: the number of pages read from the data cache.

Physical reads: the number of pages read from disk.

Read-ahead reads: the number of pages placed into the cache from disk during the query.

Lob logical reads: the number of image, text, ntext, or other large-object pages read from the data cache.

Lob physical reads: the number of image, text, ntext, or other large-object pages read from disk.

Lob read-ahead reads: the number of image, text, ntext, or other large-object pages placed into the cache from disk during the query.

 

If the physical reads and read-ahead reads are relatively high, an index can usually be used to optimize the query.
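For example, if the reads come from a predicate on a particular column, the usual fix is a nonclustered index on that column; a sketch (the index name and the productnum column are assumptions based on the examples later in this article):

```sql
-- Create a nonclustered index so lookups on productnum
-- can seek instead of scanning the whole table from disk.
CREATE NONCLUSTERED INDEX IX_Product_ProductNum
    ON dbo.Product (productnum);
```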

If you would rather not use SQL commands to view this information, there is also an easier way; let me show you.

Query--->>Query Options--->>Advanced

Check the two options highlighted in red, remove the set statistics io/time on/off statements from your SQL, and try the effect. There, you did it.

3. View the execution plan, explained in detail

Select the query statement, click Display Estimated Execution Plan, and a diagram like the one below appears.

Admittedly, my example statement is very simple and yours will be more complex; bear with me.

Analysis: hover the mouse over an icon to display the details of that execution step, and below each operator a cost percentage is shown. Focus your analysis on the steps with the larger percentages: you can redesign the data structure or rewrite the SQL statement to bring them down, and that is the optimization. If a table scan or a clustered index scan appears, it means your index is not appropriate for the current query and is having no effect, so you need to modify and optimize the index. To do that, you can use the SQL optimization tool from my previous article, the Database Engine Tuning Advisor, to analyze and optimize your indexes.

The art of the select query

1. Ensure that redundant columns and rows are not queried.

  • Try to avoid select *; name the specific columns instead of * and avoid redundant columns.
  • Use where to limit the query to exactly the rows you need and avoid redundant rows.
  • Use the top and distinct keywords to reduce redundant duplicate rows.

2. Use the distinct keyword with caution

When distinct is applied to one field or only a few fields, it removes duplicate data and benefits the query.

However, when many fields are queried, it greatly reduces query efficiency.

Analyzing the timing comparison:

Clearly, the CPU time and elapsed time of the statement with distinct are higher than those of the statement without it. The reason is that when many fields are queried with distinct, the database engine must compare the rows and filter out the duplicates, and that comparison and filtering unceremoniously eats system resources and CPU time.
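A sketch of the comparison being described, using the timing switch from earlier (dbo.Product is the sample table used throughout this article):

```sql
set statistics time on

-- Many-column distinct: the engine must compare and de-duplicate whole rows.
select distinct * from dbo.Product

-- The same query without distinct skips the comparison and filtering step.
select * from dbo.Product

set statistics time off
```

Compare the CPU time and elapsed time of the two statements on the Messages tab.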

3. Use the union keyword with caution

The main function of this keyword is to combine the result sets of several query statements into one result set and return it. Usage:

<select statement 1>
union
<select statement 2>
union
<select statement 3>
...

Statements combined with union must satisfy two conditions: 1. each select returns the same number of columns; 2. the data types of the corresponding columns must be compatible.

Implementation process:

Execute each select statement in turn -->> merge the result sets -->> sort the merged result set and filter out duplicate records.

select * from
(orde o left join orderproduct op on o.orderNum=op.orderNum
inner join product p on op.proNum=p.productnum)  where p.id<10000
union
select * from
(orde o left join orderproduct op on o.orderNum=op.orderNum
inner join product p on op.proNum=p.productnum)  where p.id<20000 and p.id>=10000
union
select * from
(orde o left join orderproduct op on o.orderNum=op.orderNum
inner join product p on op.proNum=p.productnum) where p.id>20000 --- you could write p.id>100 here and the result would be the same, because union filters out the duplicates

---------------------------------- Compare the statement above with the one below ----------------------------------
select * from
(orde o left join orderproduct op on o.orderNum=op.orderNum
inner join product p on op.proNum=p.productnum)

 

You can see that the efficiency is indeed lower, so avoid union where it is not necessary. The culprit is the third step it performs: sorting the result set and filtering duplicate records. If the result set did not need to be sorted and filtered, the efficiency would obviously be higher than with union. Is there a keyword that skips the sorting and filtering? Yes: union all, and union all can often be used to optimize a union.
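A sketch of the rewrite, assuming the id ranges in the branches do not overlap (table and column names follow the examples above):

```sql
-- The id ranges are disjoint, so no duplicates can occur across branches;
-- union all merges the result sets without the sort-and-de-duplicate step.
select * from product p where p.id < 10000
union all
select * from product p where p.id >= 10000 and p.id < 20000
union all
select * from product p where p.id >= 20000
```

If the branches could overlap, union all would return the duplicates, so it is only a valid substitute when you know the inputs are disjoint or duplicates are acceptable.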

4. Determine whether there is data in the table

select count(*) from product
select top(1) id from product

Clearly the second statement wins: count(*) has to touch every row, while top(1) can return as soon as a single row is found.
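An alternative sketch for the same check, a common T-SQL idiom (not from the original article):

```sql
-- exists stops as soon as one row is found, like the top(1) form above.
if exists (select 1 from product)
    print 'product has data'
else
    print 'product is empty'
```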

5. Optimization of join queries

Before deciding which join to use, first be clear about what the data you want looks like.

The result-set sizes of the various joins are:

  • The size of an inner join result set depends on how many rows in the left and right tables satisfy the join condition.
  • A left join's size depends on the size of the left table; a right join is the opposite.
  • Full join and cross join depend on the total amount of data in both tables.
select * from
( (select * from orde where OrderId>10000) o  left join orderproduct op on o.orderNum=op.orderNum )

select * from
(orde o left join orderproduct op on o.orderNum=op.orderNum)
 where o.OrderId>10000

 

It can be seen that reducing the amount of data in the join table can improve efficiency.

Optimizing insert statements

--create temporary table
create table #tb1
(
 id int,
 name nvarchar(30),
 createTime datetime
)
declare @i int
declare @sql varchar(1000)
set @i=0
while (@i<100000) --loop to insert 100,000 rows, one insert at a time
begin
  set @i=@i+1
  set @sql=' insert into #tb1 values('+convert(varchar(10),@i)+',''erzi'+convert(nvarchar(30),@i)+''','''+convert(nvarchar(30),getdate())+''')'
  exec(@sql)
end

On my machine this ran in 51 seconds.

--create temporary table
create table #tb2
(
 id int,
 name nvarchar(30),
 createTime datetime
)

declare @i int
declare @sql varchar(8000)
declare @j int
set @i=0
while (@i<10000) --loop 10,000 times, inserting 10 rows per batch with union all (100,000 rows total)
begin
 set @j=0
 set @sql=' insert into #tb2 select '+convert(varchar(10),@i*10+@j)+',''erzi'+convert(nvarchar(30),@i*10+@j)+''','''+convert(varchar(50),getdate())+''''
 set @j=1
 while(@j<10)
 begin
   set @sql=@sql+' union all select '+convert(varchar(10),@i*10+@j)+',''erzi'+convert(nvarchar(30),@i*10+@j)+''','''+convert(varchar(50),getdate())+''''
   set @j=@j+1
 end
 exec(@sql)
 set @i=@i+1
end

select count(1) from #tb2
drop table #tb2

On my machine this ran in about 20 seconds.

Analysis: batching the rows into insert into ... select significantly improves efficiency, so in future try to avoid inserting rows one by one in a loop.

Optimizing update and delete statements

If you modify or delete too much data in a single statement, CPU utilization spikes and other users' access to the database suffers.

If instead you delete or modify the rows one at a time in a loop, the operation is inefficient and takes a very long time.

So how do you do it?

A compromise is to operate on the data in batches:

delete product where id<1000
delete product where id>=1000 and id<2000
delete product where id>=2000 and id<3000
.....
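The manual batching above can also be written as a loop, a common T-SQL pattern (a sketch, not from the original article; the table and range are the ones used above):

```sql
-- Delete in batches of 1000 rows until none remain, keeping each
-- transaction short so other sessions are not blocked for long.
while (1 = 1)
begin
    delete top (1000) from product where id < 30000
    if @@rowcount = 0 break
end
```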

Of course, batching is not always the best choice; in fact, any of the three approaches can work, depending on how heavily your system is accessed. The key is to understand what effect each kind of statement produces.

 

 

Summary: the most important part of optimization is your everyday habits and methods in designing statements and databases. If you neglect those, you will have to analyze problems patiently after the fact, and the quality of that analysis depends on your insight, your requirements, and your knowledge.
