Using parallel queries to speed up SQL Server

The advantage of a parallel query is that several threads can process the query's operations at the same time, which makes the query more efficient. SQL Server provides parallel query functionality on database servers with multiple CPUs in order to optimize query performance. In other words, as long as the database server has more than one CPU, the database system can use several operating system threads to execute a query's operations in parallel and so finish the query sooner.

First, the three steps of a parallel query.

A parallel query in the database goes through three main steps.

Step one: the database decides whether a parallel query is needed. The database has a query optimizer, which optimizes each SQL statement before the database executes it. It is during this optimization that the optimizer decides whether the statement should be run in parallel. In other words, not every SQL query benefits from parallel execution. If the query optimizer concludes that a query would benefit, it inserts exchange operators into the query execution plan to prepare the query for parallel execution. Database administrators therefore do not have to decide which statements should run in parallel and which should not; the query optimizer makes that decision for them. What administrators do need to understand is when the optimizer considers a parallel query appropriate. In general, the optimizer will not use a parallel plan if any one of the following conditions holds. First, the optimizer considers a serial execution plan for the query to be faster than any possible parallel execution plan. Second, the cost of executing the query serially is not high enough to justify a parallel plan. Third, the query contains scalar or relational operators that cannot run in parallel. From the database administrator's point of view, the third condition matters most. If you expect to rely on parallel queries to improve performance in the future, avoid operators that cannot run in parallel when you design the database. Some operators force the whole query plan, or part of it, to run in serial mode, and in that case the query optimizer will not use parallel query functionality to improve the performance of the query. This is a detail that database administrators must take into account during database design.
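
The three conditions above can be sketched as a small decision function. This is only a toy model in Python, not SQL Server's actual optimizer logic: the `PlanEstimate` fields, the function name, and the numeric costs are all hypothetical, and the sketch treats a serial-only operator as blocking the whole plan, although the text notes that sometimes only part of a plan is forced into serial mode.

```python
from dataclasses import dataclass


@dataclass
class PlanEstimate:
    """Hypothetical summary of what the optimizer knows about one query."""
    serial_cost: float              # estimated cost of the best serial plan
    parallel_cost: float            # estimated cost of the best parallel plan
    has_serial_only_operator: bool  # query uses an operator that cannot run in parallel


def chooses_parallel_plan(est: PlanEstimate, cost_threshold: float) -> bool:
    """Return True only if none of the three blocking conditions holds."""
    if est.serial_cost <= est.parallel_cost:
        return False  # condition 1: the serial plan is estimated to be at least as fast
    if est.serial_cost <= cost_threshold:
        return False  # condition 2: the serial plan is already cheap
    if est.has_serial_only_operator:
        return False  # condition 3: an operator forces serial execution
    return True


# Made-up numbers: an expensive query whose parallel plan is cheaper.
print(chooses_parallel_plan(PlanEstimate(50.0, 20.0, False), cost_threshold=5.0))
```

Changing any one of the three inputs (a low serial cost, a parallel plan that is not cheaper, or a serial-only operator) is enough to make the function return False, which mirrors the "any one of the following conditions" wording.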

Step two: determine the number of parallel threads. Once the query optimizer has inserted exchange operators into the query plan, the database can execute the query in parallel, and a parallel query can use several threads to carry out the plan. This raises a question: into how many parts will the database split the query? To answer it, the database administrator needs to understand the degree of parallelism. When handling a parallel query, the database needs to know both the maximum number of threads it may use and the number it will actually use. The maximum is called the max degree of parallelism. It is set at the server level and can be changed with a system stored procedure (sp_configure). However, the number of threads actually used is not necessarily equal to this maximum. The actual number is determined when the query plan is initialized for execution, so the administrator does not have to configure it separately; the database system chooses a reasonable number automatically, based on the complexity of the plan. The number actually used can never exceed the max degree of parallelism.
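
As a rough illustration of the server-level setting, the following Python helper assembles the T-SQL that changes it. The helper name `max_dop_script` is hypothetical; the statements it emits use the `sp_configure` stored procedure with the 'max degree of parallelism' option, which is an advanced option and so has to be made visible first. Treat it as a sketch to adapt, not as a script to run unreviewed on a production server.

```python
def max_dop_script(max_dop: int) -> str:
    """Build a T-SQL script that sets the server-wide max degree of parallelism.

    A value of 0 lets SQL Server use all available processors; 1 disables
    parallel plans unless a query overrides the setting.
    """
    if max_dop < 0:
        raise ValueError("max_dop must be 0 or a positive integer")
    return "\n".join([
        # 'max degree of parallelism' is an advanced option, so expose those first.
        "EXEC sp_configure 'show advanced options', 1;",
        "RECONFIGURE;",
        f"EXEC sp_configure 'max degree of parallelism', {max_dop};",
        "RECONFIGURE;",
    ])


# Example: cap every query at four threads per parallel operator.
print(max_dop_script(4))
```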

Step three: execute the query. Once the points above have been settled, the database executes the query. One thing deserves attention at this step: the database administrator can also use the MAXDOP query hint to change the degree of parallelism for an individual query. For example, if the administrator expects a particular query to take a long time, the degree of parallelism for that query can be set to a larger value. A value set with the MAXDOP query hint overrides the default configured in advance, but only for that query, so a different degree of parallelism can be applied to a single query in order to improve the performance of specific operations.
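
To show what the hint looks like, here is a minimal Python sketch that appends `OPTION (MAXDOP n)` to a single statement. The helper `with_maxdop` and the `dbo.Orders` table are hypothetical, and the sketch assumes the statement does not already carry an OPTION clause.

```python
def with_maxdop(query: str, dop: int) -> str:
    """Append an OPTION (MAXDOP n) hint to one T-SQL statement.

    Assumes the statement has no OPTION clause yet.
    """
    if dop < 0:
        raise ValueError("dop must be 0 or a positive integer")
    # Drop a trailing semicolon so the hint stays part of the same statement.
    body = query.strip().rstrip(";").rstrip()
    return f"{body}\nOPTION (MAXDOP {dop});"


# Example: allow a heavy aggregation (hypothetical table) to use up to 8 threads.
print(with_maxdop(
    "SELECT CustomerId, SUM(Amount) FROM dbo.Orders GROUP BY CustomerId;", 8
))
```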

Second, points to note when using parallel queries.

Note point one: pay attention to hardware limitations.

Parallel queries are an effective way to improve database query performance, but they are subject to significant constraints. Besides cost considerations, there are hard limits. Typically, the database considers parallel execution only when the database server has multiple microprocessors (CPUs). In other words, only a computer with more than one CPU can use parallel queries; this is a hard requirement. In addition, when the query plan is executed, the database checks whether enough threads are available. Every query needs a certain number of threads to run. A parallel plan needs more threads than a serial plan, and the number required grows with the degree of parallelism. If the database server does not have enough threads for the parallel plan at execution time, the database engine automatically reduces the degree of parallelism, or even abandons the parallel plan and runs the serial plan instead. So whether a query can actually be executed in parallel is limited by the hardware. If the company really needs parallel queries to improve database performance, the administrator will have to adjust the hardware configuration accordingly.
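
The fallback behaviour described above can be illustrated with a toy Python function. It is a simplified model under stated assumptions, not the engine's real scheduling algorithm: `threads_per_dop` (how many threads each unit of parallelism costs) and `free_threads` are hypothetical inputs, and the sketch assumes a single thread for the serial plan is always available.

```python
def runtime_dop(planned_dop: int, threads_per_dop: int, free_threads: int) -> int:
    """Pick the highest degree of parallelism the free threads can support.

    Returns 1 when even a degree of 2 cannot be satisfied, which stands
    for falling back to the serial plan.
    """
    dop = planned_dop
    # Step the degree of parallelism down until the thread demand fits.
    while dop > 1 and dop * threads_per_dop > free_threads:
        dop -= 1
    return max(dop, 1)


# Made-up figures: plenty of threads, a shortage, and a severe shortage.
print(runtime_dop(8, 2, 64), runtime_dop(8, 2, 10), runtime_dop(8, 2, 3))
```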

 

Note point two: parallel queries are not recommended for every query.

In general, I believe parallel queries are best reserved for operations such as joins on large tables, aggregations over large amounts of data, and sorts of large result sets. Running these operations in parallel improves database performance noticeably. Conversely, if a simple query is executed in parallel, the extra coordination that parallel execution requires may outweigh the potential performance gain. Database administrators should therefore be careful when deciding whether to enable parallel queries. My recommendation is not to enable parallel queries broadly at the database server level, that is, to set the max degree of parallelism to a relatively small value, or to 1. Then, for specific queries, use the MAXDOP query hint to set the maximum number of threads that may be used. This is usually the more reasonable approach. If the administrator is unsure when a parallel query is worthwhile, the statistics that the database itself collects can help. To tell whether a query would really benefit from a parallel plan, the database engine compares the estimated cost of executing the query with the cost threshold for parallelism. A parallel plan generally pays off only for queries that take longer to run, because only then does the performance gain offset the extra time needed to initialize, synchronize, and terminate the parallel plan.
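
A back-of-the-envelope model makes the trade-off concrete. The Python sketch below is an assumption for illustration only: it supposes the work divides evenly across threads and that parallel execution adds a fixed startup cost plus a per-thread coordination cost. The function name, the parameters, and every number are made up, and the time units are arbitrary.

```python
def parallel_time(work: float, dop: int, startup: float, per_thread: float) -> float:
    """Elapsed time if `work` is split evenly across `dop` threads.

    A serial run (dop <= 1) pays no coordination overhead in this model.
    """
    if dop <= 1:
        return work
    return work / dop + startup + per_thread * dop


# Large query: 1000 units of work, 4 threads. Parallel wins easily.
print(parallel_time(1000.0, 1, 10.0, 2.0), parallel_time(1000.0, 4, 10.0, 2.0))

# Small query: 10 units of work. The overhead makes the parallel run slower.
print(parallel_time(10.0, 1, 10.0, 2.0), parallel_time(10.0, 4, 10.0, 2.0))
```

With these made-up figures the large query drops from 1000 to 268 units, while the small query rises from 10 to 20.5, which is the situation the cost threshold is meant to prevent.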

Note point three: the database decides whether to use a parallel query based on the number of rows involved.

As mentioned above, parallel queries are best reserved for joins on large tables, aggregations over large amounts of data, and sorts of large result sets, because only then do the benefits of parallel execution exceed its cost. That does not mean, however, that every join, aggregation, or sort is suitable for parallel execution. When the query optimizer considers a parallel plan, it also looks at the number of rows involved. If only a few rows are involved, it does not consider a parallel execution plan and runs the query serially. This avoids situations where the cost of starting, distributing, and coordinating the work greatly exceeds the benefit of executing it in parallel. This is a sound design, but it also causes the database administrator some inconvenience. For example, an administrator who wants to test how much parallel execution affects a query will find it somewhat troublesome, because the test only works with a sufficient amount of data. To run such a test, the administrator first has to import enough data into the database system. Even so, the mechanism is a good one, because administrators do not have to work out for themselves how large the data must be before a parallel query is worthwhile.

Note point four: the same operation may use a different number of threads at different times.

As mentioned above, the number of threads used by a parallel query depends not only on the complexity of the operation but also on the state of the server at the time, such as whether enough threads are available. Therefore, even for the same data and the same operation, the number of threads used may differ from one run to the next, and so may the time required. Only when the parallel query is actually about to run does the database engine collect information about the current system workload, such as the number of available threads and other configuration information, and then decide on the best number of parallel threads. That number stays in use from the start of the query to its end. The next time the query runs, the database engine collects this information again. If the system workload is lighter by then, the database may use more threads to execute the query, and the query will perform better. Conversely, if the system is more heavily loaded than it was for the previous run, the database may use fewer threads, and the second run will be slower. So if other applications are deployed on the database server, the system resources they consume will also affect parallel execution, in ways that are hard to estimate.

Reproduced at: https://www.cnblogs.com/flysun0311/archive/2012/03/07/2383404.html

Origin blog.csdn.net/weixin_34378767/article/details/93444341