Oracle Tips: Collecting Statistics in Batch Processing

Background Information

Many business systems, especially data warehouse ETL applications and batch jobs, create and use a large number of intermediate or temporary tables. Operations such as TRUNCATE, bulk INSERT, and DELETE are frequently run against these tables, so their row counts swing between two extremes: zero rows and a very large volume.

These tables either have no statistics at all, or their statistics are gathered in a fixed nightly maintenance window and do not reflect the actual state of the data when the batch runs. As a result, the cost-based optimizer (CBO) cannot be relied on to choose good execution plans for SQL statements that access these tables. The following strategies can be used to address this.
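To see which tables are affected, the data dictionary can be queried. The following is a minimal sketch; the schema name ETL_OWNER and the STG_ table prefix are hypothetical and should be replaced with your own.

```sql
-- Hypothetical names: schema ETL_OWNER, staging tables prefixed STG_.
SELECT table_name,
       num_rows,         -- row count recorded when statistics were last gathered
       last_analyzed,    -- NULL means the table has no statistics
       stale_stats,      -- 'YES' when Oracle considers the statistics stale
       stattype_locked   -- non-NULL when the statistics are locked
FROM   all_tab_statistics
WHERE  owner = 'ETL_OWNER'
AND    object_type = 'TABLE'
AND    table_name LIKE 'STG\_%' ESCAPE '\'
ORDER  BY table_name;
```

A NULL LAST_ANALYZED, or a NUM_ROWS value far from the real row count, points to a table that needs one of the strategies below. Note that STALE_STATS may lag behind very recent DML.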

Solutions

(1) Lock statistics

With this approach, statistics are gathered while the tables hold a typical data volume for the batch process and are then locked. From then on, no matter how the table contents change, Oracle always generates execution plans based on the statistics of that typical data state.

The advantage of this strategy is that it consumes few resources and keeps execution plans largely stable. The disadvantage is that the optimizer cannot choose the best plan for the data as it actually is at any given moment.
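A minimal sketch of the gather-then-lock approach, using the DBMS_STATS package. The schema ETL_OWNER and the table STG_ORDERS are hypothetical names, and the block is assumed to run at a point in the batch when the table holds a representative data set.

```sql
-- Hypothetical names: schema ETL_OWNER, staging table STG_ORDERS.
BEGIN
    -- Gather statistics while the table holds a typical batch-sized data set.
    DBMS_STATS.GATHER_TABLE_STATS(
        ownname => 'ETL_OWNER',
        tabname => 'STG_ORDERS',
        cascade => TRUE);

    -- Lock the statistics so later gathering jobs leave them unchanged.
    DBMS_STATS.LOCK_TABLE_STATS(
        ownname => 'ETL_OWNER',
        tabname => 'STG_ORDERS');
END;
/
```

If the typical data state changes later, the statistics can be refreshed either by calling DBMS_STATS.UNLOCK_TABLE_STATS first, or by gathering with force => TRUE.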

(2) Collect statistics in real time

Unlike the previous strategy, here statistics are gathered on the spot during the batch process, right after the table data has changed drastically. For example:

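BEGIN
    -- TRUNCATE is DDL and commits implicitly, so no separate COMMIT is needed.
    EXECUTE IMMEDIATE 'truncate table &TNAME';

    -- DBMS_STATS is called directly in PL/SQL; "exec" is a SQL*Plus command.
    DBMS_STATS.GATHER_TABLE_STATS(
        ownname => '&OWNER', tabname => '&TNAME', estimate_percent => 10,
        degree => 8, cascade => TRUE, granularity => 'ALL');
END;
/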

The trade-offs are the reverse of the previous strategy: the advantage is that the optimizer can choose the best execution plan for the data as it stands after each change, and the disadvantage is that gathering statistics every time consumes more resources.
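In a real batch step, the gathering call would normally come after the reload rather than straight after the truncate. The following sketch shows that ordering under assumed names: ETL_OWNER, STG_ORDERS, and SRC_ORDERS are hypothetical, and the two tables are assumed to have the same column layout.

```sql
-- Hypothetical names: schema ETL_OWNER, tables STG_ORDERS and SRC_ORDERS.
BEGIN
    -- Empty the staging table (DDL, commits implicitly).
    EXECUTE IMMEDIATE 'TRUNCATE TABLE etl_owner.stg_orders';

    -- Reload it with a direct-path insert, then commit the load.
    INSERT /*+ APPEND */ INTO etl_owner.stg_orders
    SELECT * FROM etl_owner.src_orders;
    COMMIT;

    -- Gather statistics now that the table reflects the new data volume.
    DBMS_STATS.GATHER_TABLE_STATS(
        ownname => 'ETL_OWNER',
        tabname => 'STG_ORDERS',
        cascade => TRUE);
END;
/
```

SQL statements that run later in the same batch are then optimized against statistics that match the freshly loaded data.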

(3) Use hints (an old-fashioned method, not recommended!)

Hints are embedded in the SQL statements to keep the execution plan stable and optimized.

The advantage of this strategy is that it makes full use of the developer's own experience and skill instead of relying on Oracle's statistics collection. The disadvantages are that a wrong hint causes trouble, and that the approach is too rigid: it hard-codes the execution plan into the program, while the data keeps changing.
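For illustration, a hinted query might look like the sketch below. The tables STG_ORDERS and CUSTOMERS, their columns, and the bind variable :batch_id are hypothetical; the hints themselves (LEADING, FULL, USE_HASH) are standard Oracle hints.

```sql
-- Hypothetical names: tables STG_ORDERS and CUSTOMERS, bind variable :batch_id.
SELECT /*+ LEADING(s) FULL(s) USE_HASH(c) */
       s.order_id,
       c.customer_name
FROM   stg_orders s
JOIN   customers  c ON c.customer_id = s.customer_id
WHERE  s.batch_id = :batch_id;
```

The hints request a full scan of the staging table as the driving row source and a hash join to CUSTOMERS, regardless of what the statistics say. That is exactly the rigidity described above: if the staging table holds only a few rows on a given run, the plan stays the same.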

Origin blog.csdn.net/weixin_38623994/article/details/109231187