Query Optimization and Paging Algorithms for Massive Databases (Part 2)

  (1) With only a clustered index on the primary key and no date restriction:

select gid,fariqi,neibuyonghu,title from Tgongwen

  Time: 128,470 milliseconds (about 128 seconds)

  (2) With a clustered index on the primary key and a non-clustered index on fariqi:

select gid,fariqi,neibuyonghu,title from Tgongwen

where fariqi> dateadd(day,-90,getdate())

  Time: 53763 milliseconds (54 seconds)

  (3) With the clustered index built on the date column (fariqi):

select gid,fariqi,neibuyonghu,title from Tgongwen

where fariqi> dateadd(day,-90,getdate())

  Time: 2423 milliseconds (about 2 seconds)

  Although each statement extracts the same 250,000 rows, the differences between these situations are huge, especially once the clustered index is built on the date column. In fact, if your database really holds ten million rows and the clustered index sits on the primary key ID column, as in the first and second cases, the page will simply time out and never display. This is the single most important reason I abandoned the ID column as the clustered index.

  The timings above were obtained as follows. Before each select statement, add:

declare @d datetime

set @d=getdate()

and after the select statement, add:

select [statement execution time (ms)] = datediff(ms, @d, getdate())
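  Putting these pieces together, a timed run of the third test statement looks like this (a sketch, using the same table and columns as above):

declare @d datetime
set @d=getdate()
select gid,fariqi,neibuyonghu,title from Tgongwen where fariqi>dateadd(day,-90,getdate())
select [statement execution time (ms)] = datediff(ms, @d, getdate())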

  2. As long as an index is built, query speed will improve significantly

  In fact, we can see from the example above that statements (2) and (3) are identical and index the same field; the only difference is that the former builds a non-clustered index on the fariqi field while the latter builds the clustered index on it, yet the query speeds differ enormously. So it is not true that building an index on just any field will speed up queries.

  From the table-creation statement we can see that, in this table of 10 million rows, the fariqi field has 5,003 distinct values. Building the clustered index on this field is therefore very appropriate. In reality, we send out several documents every day, and those documents share the same posting date, which fits the rule for a clustered index perfectly: the values should neither be mostly identical nor almost entirely unique. From this we can see that building the "appropriate" clustered index is very important for improving query speed.
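  As a sketch of what that change looks like (the index name is illustrative, and it assumes the primary key on gid has first been dropped or recreated as a non-clustered constraint, since a table can have only one clustered index):

create clustered index IX_Tgongwen_fariqi on Tgongwen(fariqi)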

  3. Adding all the fields that need faster lookups into the clustered index will improve query speed

  As already mentioned, the fields involved in every data query are the "date" and the user's own "user name". Since both fields are so important, we can combine them and create a composite index (compound index).
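  A sketch of such a composite clustered index, with an illustrative index name and fariqi as the leading column (the tests below show why the column order matters):

create clustered index IX_Tgongwen_fariqi_neibuyonghu on Tgongwen(fariqi, neibuyonghu)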

  Many people think that adding any field to the clustered index will speed up queries, while others are unsure: if the columns of the composite clustered index are queried separately, will queries slow down? With this question in mind, let us look at the following query speeds (the result set is 250,000 rows in each case). (The date column fariqi is the leading column of the composite clustered index, and the user-name column neibuyonghu comes after it.)

  (1) select gid,fariqi,neibuyonghu,title from Tgongwen where fariqi>'2004-5-5'

  Query Speed: 2513 ms

  (2) select gid,fariqi,neibuyonghu,title from Tgongwen where fariqi>'2004-5-5' and neibuyonghu='办公室'

  Query Speed: 2516 ms

  (3) select gid,fariqi,neibuyonghu,title from Tgongwen where neibuyonghu='办公室'

  Query speed: 60280 ms

  From the tests above we can see that when only the leading column of the clustered index is used as the query condition, the query is almost as fast as when every column of the composite clustered index is used together, and even slightly faster than using the full composite index (with the same number of rows in the result set); but when only the non-leading column of the composite clustered index is used as the query condition, the index is of no use at all. Of course, statements (1) and (2) are equally fast because they return the same number of rows. If all the columns of the composite index are used and the result set is small, a "covering index" is formed and optimal performance is reached. Also remember: whether or not you use its other columns frequently, the leading column of the clustered index must always be the most frequently used column.
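  For reference, on SQL Server 2005 and later a covering index can also be built explicitly as a non-clustered index that carries the extra selected columns; a sketch with an illustrative name, covering the columns selected by statement (2):

create nonclustered index IX_Tgongwen_cover on Tgongwen(fariqi, neibuyonghu) include (gid, title)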


(4) Index usage experience you will not find in other books

  1. The clustered index is faster than a primary key that is not the clustered index

  Here are the example statements (each extracts 250,000 rows):

select gid,fariqi,neibuyonghu,reader,title from Tgongwen where fariqi='2004-9-16'

  Time: 3326 ms

select gid,fariqi,neibuyonghu,reader,title from Tgongwen where gid<=250000

  Time: 4470 ms

  Here, the clustered index is nearly 1/4 faster than the primary key, which is not the clustered index.

  2. Using the clustered index for ORDER BY is faster than using an ordinary primary key, especially with small amounts of data

select gid,fariqi,neibuyonghu,reader,title from Tgongwen order by fariqi

  Time: 12936 ms

select gid,fariqi,neibuyonghu,reader,title from Tgongwen order by gid

  Time: 18843 ms

  Here, using the clustered index for ORDER BY is about 3/10 faster than using the ordinary primary key. In fact, with small amounts of data, sorting on the clustered index column is significantly faster than sorting on a non-clustered index; with large amounts of data, say more than 100,000 rows, there is no longer a significant difference between the two.

  3. Searching a time range within the clustered index cuts search time in proportion to the fraction of the whole table that the matching data occupies, regardless of how much of the clustered index is used

select gid,fariqi,neibuyonghu,reader,title from Tgongwen where fariqi>'2004-1-1'

  Time: 6343 ms (1,000,000 rows extracted)

select gid,fariqi,neibuyonghu,reader,title from Tgongwen where fariqi>'2004-6-6'

  Time: 3170 milliseconds (500,000 rows extracted)

select gid,fariqi,neibuyonghu,reader,title from Tgongwen where fariqi='2004-9-16'

  Time: 3326 ms (virtually the same as the previous result; when the number of rows retrieved is the same, the greater-than and equals operators perform the same)

select gid,fariqi,neibuyonghu,reader,title from Tgongwen where fariqi>'2004-1-1' and fariqi<'2004-6-6'

  Time: 3280 ms

  4. The date column will not slow queries down just because minutes and seconds have been entered

  In the following example there are 1,000,000 rows in total. The 500,000 rows dated after January 1, 2004 contain only two distinct date values, accurate to the day; the 500,000 rows before that date contain 5,000 distinct values, accurate to the second.

select gid,fariqi,neibuyonghu,reader,title from Tgongwen where fariqi>'2004-1-1' order by fariqi

  Time: 6390 ms

select gid,fariqi,neibuyonghu,reader,title from Tgongwen where fariqi<'2004-1-1' order by fariqi

  Time: 6453 ms

  (5) Other Considerations

  "Water can carry a boat, can also capsize", the same index. An index helps to improve the retrieval performance, but excessive or inappropriate indexing system can lead to inefficiencies. Because each user added to the index in the table, the database will have to do more work. Too many indexes even lead index fragmentation.

  So we should build an "appropriate" index system, especially with regard to the creation of the clustered index, so that the database can deliver its full performance.

  Of course, in practice, as a conscientious database administrator you still need to test several schemes to find out which one is the most efficient and effective.

  2. Improving SQL statements

  Many people do not know how SQL statements are executed inside SQL Server, and they worry that the SQL they write will be misunderstood by SQL Server. For example:

select * from table1 where name='zhangsan' and tID > 10000

  and:

select * from table1 where tID > 10000 and name='zhangsan'

  Some people do not know whether the two statements above execute with the same efficiency, because taken literally the two statements really are different: if tID is the clustered index column, the latter only needs to search among the records after the first 10,000 in the table, while the former must first scan the whole table to find the rows where name='zhangsan' and then filter the results by the condition tID>10000.

  In fact, such worries are unnecessary. SQL Server has a "query optimizer" that analyzes the search conditions in the WHERE clause and decides which indexes can reduce the search space of a table scan; in other words, it optimizes automatically.

  Although the query optimizer can optimize queries automatically based on the WHERE clause, we still need to understand how the "query optimizer" works; otherwise it will sometimes not run the fast query you intended.

  During the query-analysis phase, the query optimizer examines each phase of the query and decides whether it is useful for limiting the amount of data that must be scanned. If a phase can be used as a search argument (SARG), it is said to be optimizable, and the desired data can then be retrieved quickly using an index.

  Definition of a SARG: an operation used to limit a search, usually referring to a specific match, a match within a range of values, or two or more conditions joined by AND. Its forms are as follows:

Column name operator <constant or variable>

or

<Constant or variable> operator column name

  The column name can appear on one side of the operator, with the constant or variable on the other side. For example:

name = 'zhangsan'

price > 5000

5000 < price

name = 'zhangsan' and price > 5000

  If an expression does not satisfy the SARG form, it cannot limit the scope of the search; that is, SQL Server must evaluate every condition in the WHERE clause for every row. So an index is useless for expressions that do not satisfy the SARG form.

  Having introduced SARGs, let us summarize our experience using them, including some conclusions from practice that differ from what certain references claim:

  1. Whether a LIKE statement belongs to the SARG class depends on the type of wildcard used

  For example: name like 'zhang%' belongs to the SARG class,

  while: name like '%zhang' does not.

  The reason is that a wildcard % at the beginning of the string makes the index unusable.

  2. OR causes a full table scan

  name='zhangsan' and price>5000 conforms to the SARG form, whereas name='zhangsan' or price>5000 does not. Using OR causes a full table scan.

  3. Statements that fail the SARG form because of negation operators or functions

  The most typical statements that do not satisfy the SARG form are those containing negation operators, such as NOT, !=, <>, !<, !>, NOT EXISTS, NOT IN, NOT LIKE and so on, along with those that apply functions to columns. Here are a few examples that do not satisfy the SARG form:

ABS(price) < 5000

name like '%三'

  Some expressions, such as:

WHERE price * 2 > 5000

will still be treated as SARGs by SQL Server, which converts the expression into:

WHERE price > 5000/2

  However, we do not recommend writing it this way, because SQL Server cannot always guarantee that such a conversion is completely equivalent to the original expression.
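  A sketch of how to keep such predicates in SARG form yourself, so that the column stands alone on one side of the operator (the table and column names are illustrative, following the examples above):

-- not a SARG: the function hides the column from the index
select * from table1 where ABS(price) < 5000
-- equivalent SARG form for a numeric price column
select * from table1 where price > -5000 and price < 5000

-- not a SARG: arithmetic applied to the column
select * from table1 where price * 2 > 5000
-- equivalent SARG form
select * from table1 where price > 5000 / 2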

  4. IN is equivalent in effect to OR

  Statement:

Select * from table1 where tid in (2,3)

  and

Select * from table1 where tid=2 or tid=3

  are the same: both will cause a full table scan, and if there is an index on tid, the index will not be used.

  5. Use NOT as little as possible

  6. EXISTS and IN execute with the same efficiency
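  For example, the two queries below return the same rows; table2 and its tid column are hypothetical, used only to illustrate the two forms:

select * from table1 where tid in (select tid from table2)

select * from table1 where exists (select 1 from table2 where table2.tid = table1.tid)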


Article Source: http://www.diybl.com/course/7_databases/sql/msshl/2007614/52157_2.html

Reproduced from: https://www.cnblogs.com/200831856/articles/1381730.html
