Reading notes on Oracle-based SQL optimization

Common B-tree index access methods in Oracle:

  1. Index unique scan
  2. Index range scan
  3. Index full scan -> single-block reads; results come back in index-key order; needs a NOT NULL constraint on an indexed column to stand in for a full table scan
  4. Index fast full scan -> multi-block reads; the order of the results is not guaranteed
  5. Index skip scan -> applicable when the leading column has few distinct values

 Table join methods:

1. sort-merge join

      • Proceed as follows:  
    1. First, access T1 using the predicate conditions specified in the target SQL (if any), then sort the result on T1's join column. Call the sorted result "result set 1".
    2. Next, access T2 using the predicate conditions specified in the target SQL (if any), then sort the result on T2's join column. Call the sorted result "result set 2".
    3. Finally, merge result set 1 with result set 2 and take the matching records as the final result of the sort-merge join.
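The three steps above can be sketched in a few lines of Python. This is only an illustration for an equality join condition, not Oracle's implementation; the function name `sort_merge_join` and the sample `emp`/`dept` rows are made up for the example.

```python
from operator import itemgetter

def sort_merge_join(t1, t2, key1, key2):
    """Equality sort-merge join of two lists of dicts (illustrative only)."""
    # Steps 1 and 2: sort each (already filtered) input on its join column.
    rs1 = sorted(t1, key=itemgetter(key1))
    rs2 = sorted(t2, key=itemgetter(key2))

    result = []
    i = j = 0
    # Step 3: merge the two sorted result sets.
    while i < len(rs1) and j < len(rs2):
        k1, k2 = rs1[i][key1], rs2[j][key2]
        if k1 < k2:
            i += 1
        elif k1 > k2:
            j += 1
        else:
            # Find the run of rs2 rows with this key, then pair it with
            # every rs1 row that carries the same key.
            j_end = j
            while j_end < len(rs2) and rs2[j_end][key2] == k1:
                j_end += 1
            while i < len(rs1) and rs1[i][key1] == k1:
                for row2 in rs2[j:j_end]:
                    result.append((rs1[i], row2))
                i += 1
            j = j_end
    return result

# Hypothetical sample data.
emp = [{"deptno": 20, "ename": "SMITH"},
       {"deptno": 10, "ename": "KING"},
       {"deptno": 20, "ename": "FORD"}]
dept = [{"deptno": 10, "dname": "ACCOUNTING"},
        {"deptno": 20, "dname": "RESEARCH"},
        {"deptno": 30, "dname": "SALES"}]
pairs = sort_merge_join(emp, dept, "deptno", "deptno")
```

Both inputs must be fully sorted before the merge loop can produce its first pair, which is why this method cannot return rows immediately.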
      • Advantages and disadvantages:

Typically, a sort-merge join is far less efficient than a hash join, but it applies more widely: a hash join can usually be used only for equality join conditions, whereas a sort-merge join can also be used for other join conditions (e.g. <, <=, >, >=).
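To see why sorted inputs help with a condition such as <, here is a minimal Python sketch of a join on `a < b`. It is an illustration only: the function name `sort_merge_less_than` is hypothetical, the inputs are plain numbers rather than table rows, and a binary search stands in for the merge pointer.

```python
from bisect import bisect_right

def sort_merge_less_than(t1, t2):
    """Return all pairs (a, b) with a from t1, b from t2 and a < b."""
    rs1 = sorted(t1)
    rs2 = sorted(t2)
    result = []
    for a in rs1:
        # Because rs2 is sorted, every value from this position onward is > a.
        start = bisect_right(rs2, a)
        for b in rs2[start:]:
            result.append((a, b))
    return result

lt_pairs = sort_merge_less_than([3, 1], [2, 3, 4])
```

A hash table keeps no ordering between keys, so it offers no comparable way to locate "everything greater than a"; the sorted result set does.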

 

2. nested loop join

A nested loop join is a join method in which two tables are joined by means of two nested loops (an outer loop and an inner loop) to produce the result set.

      • Proceed as follows:
    1. First, the optimizer decides, according to certain rules, which of T1 and T2 is the driving table and which is the driven table. The driving table is used for the outer loop and the driven table for the inner loop. Assume here that T1 is the driving table and T2 is the driven table.
    2. Next, access the driving table T1 using the predicate conditions specified in the target SQL (if any). Call the result "driving result set 1".
    3. Then traverse driving result set 1 and, for each of its records, traverse the driven table T2. That is, take the first record of driving result set 1, traverse T2, and check against the join condition whether T2 contains matching records; then take the second record of driving result set 1, traverse T2 again under the same join condition, and check for matches; continue until every record in driving result set 1 has been processed. The outer loop is the loop over driving result set 1, and the inner loop is the loop over the driven table T2. Clearly, the inner loop over T2 runs once for every record in driving result set 1, which is where the name "nested loop" comes from.
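The outer and inner loops described above map directly onto two `for` loops. The following Python sketch is illustrative only; `nested_loop_join`, the `match` callback and the sample rows are assumptions made for the example.

```python
def nested_loop_join(driving_rows, driven_rows, match):
    """Plain nested loop join; `match` expresses the join condition."""
    result = []
    for outer in driving_rows:        # outer loop: driving result set 1
        for inner in driven_rows:     # inner loop: one full pass over T2
            if match(outer, inner):
                result.append((outer, inner))
    return result

# Hypothetical sample data: T1 drives, T2 is driven.
t1 = [{"id": 1}, {"id": 3}]
t2 = [{"t1_id": 1, "v": "a"}, {"t1_id": 2, "v": "b"},
      {"t1_id": 3, "v": "c"}, {"t1_id": 3, "v": "d"}]
nl_pairs = nested_loop_join(t1, t2, lambda o, i: o["id"] == i["t1_id"])
```

With two driving rows, T2 is traversed twice; the cost of the inner loop is multiplied by the size of the driving result set.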
      • Advantages and disadvantages:
    1. If the driving result set contains few records and the driven table has a unique index on its join column (or a non-unique index with good selectivity), a nested loop join is very efficient. If the driving result set contains many records, however, a nested loop join will not be efficient even when the driven table has an index on its join column.
    2. A nested loop join is viable whenever the driving result set contains few records. The driving result set is what remains after the predicate conditions specified in the target SQL (if any) have been applied to the driving table, so even a large table can serve as the driving table; what matters is whether those predicate conditions cut the driving result set down far enough.
    3. A nested loop join has an advantage that the other join methods lack: fast response. It can return rows that have already been joined and satisfy the join condition right away, without waiting for the whole join to complete. Sort-merge joins and hash joins can also return joined rows before the entire join has finished, but not right away: a sort-merge join cannot start returning data until the sorting is done and the merge has begun, and a hash join cannot start returning data until the hash table for the driving result set has been fully built.
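Two of the points above, the index on the driven table's join column and the fast first-row response, can be illustrated with a Python generator. This is a sketch under stated assumptions: a dictionary stands in for the index, and all names (`build_index`, `indexed_nested_loop`, `driving_result_set`, `consumed`) are hypothetical.

```python
from collections import defaultdict

def build_index(rows, column):
    """Stand-in for an index on the driven table's join column."""
    index = defaultdict(list)
    for row in rows:
        index[row[column]].append(row)
    return index

def indexed_nested_loop(driving_rows, index, column):
    """Yield joined rows one at a time instead of building a full result."""
    for outer in driving_rows:
        # The index lookup replaces the full pass over the driven table.
        for inner in index.get(outer[column], []):
            yield outer, inner

consumed = []

def driving_result_set():
    for n in range(1, 1001):
        consumed.append(n)            # record how far the outer loop has gone
        yield {"id": n}

driven = [{"id": n, "payload": n * 10} for n in range(1, 1001)]
idx = build_index(driven, "id")
first = next(indexed_nested_loop(driving_result_set(), idx, "id"))
```

After the first joined row is fetched, `consumed` holds only one driving row: the join has produced output without touching the other 999.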

3. hash join

A hash join is a join method in which two tables are joined mainly by means of hash operations to produce the result set.

      • Proceed as follows:
    1. Oracle first determines the number of hash partitions from the values of a set of parameters (hash buckets make up a hash partition, and hash partitions make up the hash table).
    2. After the predicate conditions specified in the target SQL (if any) are applied to T1 and T2, Oracle picks the smaller of the two result sets as the driving result set of the hash join. Assume here that T1's result set is the smaller one, denoted S, and T2's result set is the larger one, denoted B. S is then the driving result set and B the driven result set.
    3. Oracle then traverses S, reading each record and hashing it on T1's join column. Two built-in hash functions are used, both applied to the join column at the same time. Call them hash_func_1 and hash_func_2, and call the values they produce hash_value_1 and hash_value_2.
    4. Oracle uses hash_value_1 to place each record of S into the appropriate hash bucket of the appropriate hash partition, and stores alongside the record the hash_value_2 computed by hash_func_2. Note that a hash bucket does not hold the complete row from the target table; it only needs the columns of that table referenced by the target SQL, plus the join column. Call the hash partitions built from S the Si.
    5. While building the Si, Oracle also constructs a bitmap that marks whether each hash bucket in the Si contains any records (that is, whether its record count is greater than zero).
    6. If S is large, the PGA work area may fill up while the hash table for S is being built. When that happens, Oracle writes the hash partition in the work area that holds the most records to disk (the TEMP tablespace) and then carries on building the hash table. If the work area fills up again, Oracle repeats the action: it picks the hash partition with the most records and writes it to disk. If a record being processed belongs to a hash partition that has already been written to disk, Oracle updates that partition on disk, adding the record and its hash_value_2 directly to the corresponding hash bucket of the on-disk partition. Note that in the extreme case only part of one hash partition is still in memory, while the rest of that partition and all the other hash partitions have been written to disk.
    7. This construction of the hash table for S continues until every record in S has been traversed.
    8. Oracle then sorts all the Si by the number of records they contain and places the sorted hash partitions into memory (the PGA work area) in that order, as many as will fit. The hash partitions that do not fit stay on disk.
    9. At this point Oracle has finished processing S and can start on B.
    10. Oracle traverses B, reading each record and hashing it on T2's join column. The hashing is exactly the same as in step 3: the same hash_func_1 and hash_func_2 are used, and the same two values, hash_value_1 and hash_value_2, are computed.

     . . . . . The remaining steps are not reproduced here; there is too much text.
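The build phase (steps 3 to 5) can be sketched in Python as follows. This is a simplified illustration, not Oracle's algorithm: the partition and bucket counts, the MD5/SHA-1 stand-ins for hash_func_1 and hash_func_2, and the way hash_value_1 is split into a partition number and a bucket number are all assumptions. Spilling to disk (step 6) is left out.

```python
import hashlib

N_PARTITIONS = 4   # assumed; Oracle derives the real number from parameters
N_BUCKETS = 8      # assumed number of buckets per partition

def hash_func_1(key):
    """Illustrative first hash: decides the partition and the bucket."""
    return int(hashlib.md5(repr(key).encode()).hexdigest(), 16)

def hash_func_2(key):
    """Illustrative second hash: stored next to the record."""
    return int(hashlib.sha1(repr(key).encode()).hexdigest(), 16)

def build_hash_table(s_rows, join_col, needed_cols):
    partitions = [[[] for _ in range(N_BUCKETS)] for _ in range(N_PARTITIONS)]
    bitmap = [[False] * N_BUCKETS for _ in range(N_PARTITIONS)]
    for row in s_rows:
        hv1 = hash_func_1(row[join_col])
        hv2 = hash_func_2(row[join_col])
        p = hv1 % N_PARTITIONS                    # assumed mapping
        b = (hv1 // N_PARTITIONS) % N_BUCKETS     # assumed mapping
        # Keep only the join column and the columns the query needs.
        slim = {c: row[c] for c in needed_cols}
        partitions[p][b].append((hv2, slim))
        bitmap[p][b] = True                       # bucket is now non-empty
    return partitions, bitmap

# Hypothetical sample rows for S.
s = [{"deptno": 10, "dname": "ACCOUNTING", "loc": "NEW YORK"},
     {"deptno": 20, "dname": "RESEARCH", "loc": "DALLAS"}]
partitions, bitmap = build_hash_table(s, "deptno", ["deptno", "dname"])
```

The `loc` column is dropped because the hypothetical query does not reference it, mirroring the note in step 4 that buckets do not hold complete rows.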

      • Advantages and disadvantages:
    1. A hash join does not necessarily require sorting; in most cases no sort is needed.
    2. The join column of the hash join's driving table should be as selective as possible, because its selectivity determines how many records land in each hash bucket, and the number of records per hash bucket directly affects how efficiently matching records can be found in a bucket. If a hash bucket holds too many records, the hash join can slow down severely. The typical symptom is that the hash join runs for a long time without finishing and CPU usage on the database server is high, yet the target SQL consumes very few logical reads: most of the time is being spent traversing the records in the hash buckets, and that traversal takes place in the PGA work area, so it costs no logical reads.
    3. Hash joins are available only under the CBO and can be used only for equality join conditions (even a hash anti-join is actually converted by Oracle into an equivalent equality join).
    4. A hash join is well suited to joining a small table with a large table when the join result set contains many records, especially when the join column of the small table is very selective. In that case the execution time of the hash join is roughly the time it takes to full-scan the large table.
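The effect of join-column selectivity on bucket size (point 2) can be made concrete with a small Python experiment. It is a toy model, not a measurement of Oracle: `probe_comparisons` is a hypothetical helper that counts how many bucket entries are inspected during probing, using Python's built-in `hash` on integers and an assumed 16 buckets.

```python
def probe_comparisons(build_keys, probe_keys, n_buckets=16):
    """Count how many bucket entries are inspected while probing."""
    buckets = [[] for _ in range(n_buckets)]
    for k in build_keys:
        buckets[hash(k) % n_buckets].append(k)
    inspected = 0
    for k in probe_keys:
        # Every entry in the target bucket has to be walked through.
        inspected += len(buckets[hash(k) % n_buckets])
    return inspected

# 1000 distinct join values versus 1000 copies of the same value.
selective = probe_comparisons(range(1000), range(1000))
skewed = probe_comparisons([1] * 1000, [1] * 1000)
```

With a single repeated value every record sits in one bucket, so each of the 1000 probes walks all 1000 entries; with distinct values the work is spread across the buckets. All of this work happens in memory, which matches the symptom of high CPU usage with few logical reads.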

My own understanding: compute the hash values for the driving table (the small table) and store them in memory or on disk; then, while traversing the driven table (the large table), compute a hash value for each row and use it to look for matching records from the small table. Matching records are returned.
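That summary corresponds to the classic build-and-probe pattern, sketched here in Python with a dictionary as the in-memory hash table. It is only an illustration of the idea; `hash_join` and the sample rows are hypothetical, and partitions, the second hash value and disk spills are ignored.

```python
from collections import defaultdict

def hash_join(small_rows, large_rows, small_col, large_col):
    """Build on the small row set, probe with the large one."""
    # Build: hash the driving (small) row set on its join column.
    table = defaultdict(list)
    for row in small_rows:
        table[row[small_col]].append(row)
    # Probe: scan the driven (large) row set once, looking up each row.
    for row in large_rows:
        for match in table.get(row[large_col], []):
            yield match, row

# Hypothetical sample data.
small = [{"deptno": 10, "dname": "ACCOUNTING"},
         {"deptno": 20, "dname": "RESEARCH"}]
large = [{"deptno": 20, "ename": "SMITH"},
         {"deptno": 10, "ename": "KING"},
         {"deptno": 30, "ename": "ALLEN"}]
joined = list(hash_join(small, large, "deptno", "deptno"))
```

The large row set is read exactly once, which is consistent with the earlier remark that the join time approaches that of a full scan of the large table.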

4. Cartesian join (Cartesian product)

 

Origin www.cnblogs.com/studyking/p/11595595.html