Hash Join applies only to equi-joins, and only under the CBO optimizer mode. Compared with a nested loop join, a hash join is better suited to handling large result sets.
In a hash join execution plan, the first row source is the hash table (build table) and the second is the probe table. We generally do not speak of inner and outer tables here; those terms belong to the nested loop join. Still, the hash (build) table is sometimes also called the inner table, and the probe table the outer table.
The two plan shapes look like:

nested loop
  outer table  -- driving table
  inner table

hash join
  build table (inner table)  -- driving table
  probe table (outer table)
The figure at the start of the post gives a general picture of the hash join process. Below we look at Hash Join in more detail.
(i) Hash Join concept
The basic idea of the hash join algorithm is to take the smaller row source (the build input, i.e. the build table mentioned above; write the small table as S and the large table as B) and build a hash table on it in the hash area in memory, then use the larger row source (the probe input, i.e. the probe table) to probe that hash table.
If the hash area memory is not large enough, the hash table cannot be kept entirely in memory. In that case Oracle applies a hash function to the join keys and splits both the build input and the probe input into a number of disjoint partitions, denoted Si and Bi; this stage is called the partitioning phase. Then each corresponding pair of partitions Si and Bi is hash-joined; this stage is called the join phase.
If the hash table being built in memory is too large, it is split into several partitions and written to temporary segments on disk. The extra writes cost I/O and lower efficiency.
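The build/probe idea above can be sketched in Python (an illustrative sketch with made-up row data, not Oracle internals): build a hash table on the small row source S, then stream the large row source B against it.

```python
# Sketch of an in-memory hash join (illustrative; not Oracle's internal code).
# Build a hash table on the small row source, then probe it with the large one.
from collections import defaultdict

def hash_join(build_rows, probe_rows, build_key, probe_key):
    # Build phase: hash table keyed on the join column of the small table.
    hash_table = defaultdict(list)
    for row in build_rows:
        hash_table[row[build_key]].append(row)
    # Probe phase: each large-table row finds its matches in one lookup.
    result = []
    for row in probe_rows:
        for match in hash_table.get(row[probe_key], []):
            result.append({**match, **row})
    return result

S = [{"id": 1, "s": "a"}, {"id": 3, "s": "b"}]                      # small table
B = [{"id": 1, "b": "x"}, {"id": 2, "b": "y"}, {"id": 1, "b": "z"}]  # large table
print(hash_join(S, B, "id", "id"))
```

Note that the small table is scanned once to build the hash table and the large table once to probe it, which is why hash join copes so well with large result sets.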
As for what counts as a small table: for a hash join, any table whose hash table fits in the PGA hash area can be treated as small. For example, with a typical setting of
pga_aggregate_target    big integer    1073741824
the hash area can physically use more than 40 MB, which is usually enough to hold hundreds of thousands of records.
hash_area_size defaults to 2 * sort_area_size; if we change SORT_AREA_SIZE directly, HASH_AREA_SIZE changes with it.
If workarea_size_policy = auto, then we only need to set pga_aggregate_target.
But remember, hash_area_size is a session-level parameter; sometimes we set hash_area_size to about 1.6 times the size of the driving table.
The driving-table concept applies only to nested loop joins and hash joins; a hash join does not need an index on the driving table, whereas a nested loop join badly needs one.
Joining a table of one or two million rows to a table of tens of millions of rows is where a hash join usually performs very well.
However, "more" and "less", "large" and "small", are often hard to quantify; each situation needs specific analysis.
If, after partitioning, the partition chosen to build the hash table is still too large, Oracle uses a nested-loops hash join.
The so-called nested-loops hash join builds a hash table on part of Si, reads all of Bi and joins it against that hash table, then builds a hash table on the remaining part of Si and again joins all of Bi against it, and so on until all of Si has been joined.
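The nested-loops hash join described above can be sketched as follows (again an illustrative Python sketch; the chunk size stands in for the memory limit): Si is processed in memory-sized chunks, and Bi is re-read once per chunk.

```python
# Sketch of a nested-loop hash join on one partition pair (Si, Bi).
# Si is split into chunks that fit in "memory"; Bi is rescanned per chunk.
from collections import defaultdict

def nested_loop_hash_join(Si, Bi, key, chunk_rows):
    result = []
    for start in range(0, len(Si), chunk_rows):
        chunk = Si[start:start + chunk_rows]   # part of Si that fits in memory
        table = defaultdict(list)
        for row in chunk:                      # build a hash table on the chunk
            table[row[key]].append(row)
        for brow in Bi:                        # full pass over Bi for this chunk
            for srow in table.get(brow[key], []):
                result.append((srow, brow))
    return result

Si = [{"k": v} for v in (1, 1, 3, 4)]
Bi = [{"k": v} for v in (1, 3, 9)]
print(nested_loop_hash_join(Si, Bi, "k", chunk_rows=2))
```

The repeated scans of Bi are exactly the extra cost that makes this fallback expensive.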
(ii) Hash Join principle
Consider the following two sets of data:
S = {1,1,1,3,3,4,4,4,4,5,8,8,8,8,10}
B = {0,0,1,1,1,1,2,2,2,2,2,2,3,8,9,9,9,10,10,11}
The first step of a hash join is to determine whether the small table (the build input) can be held entirely in the hash area in memory.
If it can, the hash table is built in memory; this is the simplest form of hash join.
If it cannot, the build input must be partitioned. The number of partitions is called the fan-out.
The fan-out is determined by hash_area_size and the cluster size, where cluster size = DB_BLOCK_SIZE * _hash_multiblock_io_count.
_hash_multiblock_io_count is a hidden parameter; after 9.0.1 it is no longer used.
sys@ORCL> ed
Wrote file afiedt.buf

  1  select a.ksppinm name, b.ksppstvl value, a.ksppdesc description
  2  from x$ksppi a, x$ksppcv b
  3  where a.indx = b.indx
  4* and a.ksppinm like '%hash_multiblock_io_count%'
sys@ORCL> /

NAME                           VALUE DESCRIPTION
------------------------------ ----- ------------------------------------------------------------
_hash_multiblock_io_count      0     number of blocks hash join will read/write at once
Oracle applies an internal hash function to the join keys and splits S and B into multiple partitions.
Here we assume the hash function takes the remainder, i.e. mod(join_column_value, 10), which produces ten partitions.
After such a partitioning, joins only need to be done between the corresponding partitions (so-called partition pairs).
If one partition of a pair is empty, the corresponding partition join can be skipped.
While the S partitions are read into memory, Oracle records the distinct values of the join key, building a so-called bitmap vector.
It takes up about 5% of the hash area memory. Here it is {1,3,4,5,8,10}.
When B is partitioned, the join-key value of each row is compared against the bitmap vector; rows whose values are not in it are discarded.
In our example, the following rows of B are discarded: {0,0,2,2,2,2,2,2,9,9,9,11}.
This process is the bitmap vector filtering.
After S1 and B1 are joined, the remaining pairs Si and Bi are joined in turn.
Here Oracle compares the two partitions of each pair and picks the smaller one as the build input; this is the so-called dynamic role reversal.
Dynamic role reversal happens for all partition pairs except the first.
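Using the example data sets above and the assumed mod(value, 10) hash function, the partitioning and bitmap-vector filtering can be sketched in Python:

```python
# Partition S and B with mod(value, 10), and filter B with a "bitmap vector"
# of the distinct join-key values seen in S (illustrative sketch).
from collections import defaultdict

S = [1,1,1,3,3,4,4,4,4,5,8,8,8,8,10]
B = [0,0,1,1,1,1,2,2,2,2,2,2,3,8,9,9,9,10,10,11]

def partition(rows, fan_out=10):
    """hash_fun_1: mod(join_column_value, 10) picks the partition."""
    parts = defaultdict(list)
    for v in rows:
        parts[v % fan_out].append(v)
    return parts

S_parts, B_parts = partition(S), partition(B)

# Bitmap vector: the distinct join-key values recorded while reading S.
bitmap = set(S)
kept      = [v for v in B if v in bitmap]
discarded = [v for v in B if v not in bitmap]
print(sorted(bitmap))     # [1, 3, 4, 5, 8, 10]
print(discarded)          # B rows that can never match are dropped early
```

Only partition pairs with the same mod-10 value ever need to be joined against each other, which is the point of the partitioning phase.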
(iii) Hash Join algorithm
Step 1: Determine whether the small table fits entirely in the hash area memory. If it does, do an in-memory hash join; if not, go to step 2.
Step 2: Determine the fan-out, so that
(Number of Partitions) * C <= Favm * M
where C is the cluster size, whose value is DB_BLOCK_SIZE * HASH_MULTIBLOCK_IO_COUNT,
Favm is the fraction of the hash area memory that can be used, usually about 0.8,
and M is the hash_area_size.
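Plugging assumed values into the formula above (an 8 KB block size, HASH_MULTIBLOCK_IO_COUNT = 9, a 100 MB hash area; all figures invented for illustration), the upper bound on the fan-out works out as:

```python
# Upper bound on the fan-out: (Number of Partitions) * C <= Favm * M
db_block_size = 8192                     # assumed 8 KB blocks
hash_multiblock_io_count = 9             # assumed parameter value
C = db_block_size * hash_multiblock_io_count   # cluster size
M = 100 * 1024 * 1024                    # hash_area_size: assumed 100 MB
Favm = 0.8                               # usable fraction of the hash area
max_partitions = int(Favm * M / C)
print(C, max_partitions)
```

A smaller cluster size C permits a larger fan-out, which matters for the cost analysis below.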
Step 3: Read part of the small table S, and use an internal hash function (call it hash_fun_1) to map each join key value to a partition; at the same time use a second hash function, hash_fun_2, to produce another hash value for the join key. This second hash value is used for building the hash table, and is stored together with the join key.
Step 4: Build the bitmap vector for the build input.
Step 5: If there is no space left in memory, write the partition to disk.
Step 6: Read the remaining part of the small table S, repeating step 3, until all of S has been read.
Step 7: Sort the partitions by size, and select several of them to build hash tables (the principle being to select as many partitions as possible).
Step 8: Build the hash tables according to the hash values previously computed by hash_fun_2.
Step 9: Read table B, filtering it with the bitmap vector.
Step 10: Map the filtered data to its partition using hash_fun_1, and compute its hash value with hash_fun_2.
Step 11: If the row falls into a partition that is in memory, use the hash value computed by hash_fun_2 to probe the existing in-memory hash table, and write the results out. If the row's partition is not in memory, write the row to its partition on disk, to be joined later with the values of the corresponding S partition.
Step 12: Continue reading table B, repeating step 9, until all of B has been read.
Step 13: Read each remaining pair (Si, Bi) and hash-join them. Dynamic role reversal happens here.
Step 14: If, after partitioning, even the smallest partition is larger than memory, a nested-loop hash join occurs.
(iv) Hash Join cost
(1) In-Memory Hash Join
Cost(HJ) = Read(S) + Build Hash Table in Memory (CPU) + Read(B) + Perform In-Memory Join (CPU)
Ignoring CPU time:
Cost(HJ) = Read(S) + Read(B)
(2) On-Disk Hash Join
From the procedure described above, we can see that
Cost(HJ) = Cost(HJ1) + Cost(HJ2)
where Cost(HJ1) is the cost of scanning the S and B tables and writing the portions that cannot be kept in memory to disk, corresponding to steps 2 through 12 above,
and Cost(HJ2) is the cost of doing the nested-loop hash join, corresponding to steps 13 and 14 above.
Cost(HJ1) is approximately Read(S) + Read(B) + Write((S - M) + (B - B * M / S)).
Because the nested-loop hash join must read the entire remaining probe input once for each chunk of the build input,
Cost(HJ2) is approximately Read((S - M) + n * (B - B * M / S)), where n is the number of loops the nested-loop hash join needs: n = (S / F) / M.
In general, if n is greater than 10, hash join performance degrades badly.
As the formula shows, n is inversely proportional to the fan-out F: increasing the fan-out reduces n.
When hash_area_size is fixed, the cluster size can be reduced to increase the fan-out.
From this we can see that raising the hash_multiblock_io_count parameter does not necessarily improve hash join performance.
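To get a feel for n, here is a quick computation under invented sizes (S = 2 GB, M = 100 MB, fan-out F = 16); it also shows how raising the fan-out shrinks n:

```python
# n = (S / F) / M : loops of the nested-loop hash join per partition,
# rounded up since a fractional chunk still costs a full pass over Bi.
import math

S = 2 * 1024          # build input size, assumed 2 GB (in MB)
M = 100               # hash_area_size, assumed 100 MB
F = 16                # fan-out, assumed
n = math.ceil((S / F) / M)     # chunks needed per partition
print(n)                        # each chunk forces a re-read of the B partition
# Quadrupling the fan-out shrinks S/F and therefore n:
print(math.ceil((S / (4 * F)) / M))
```

With the larger fan-out each partition fits in a single chunk, so the nested-loop fallback is avoided entirely.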
(v) Hash Join process
A complete hash join proceeds as follows:
1. Compute the number of partitions (buckets) for the small table
An important factor in a hash join is the number of partitions (buckets) of the small table.
This number is determined jointly by the hash_area_size, hash_multiblock_io_count and db_block_size parameters.
Oracle reserves 20% of the hash area for storing partition header information, the hash bitmap, and the hash table.
The number is therefore calculated as:
Bucket count = 0.8 * hash_area_size / (hash_multiblock_io_count * db_block_size)
2. Compute the hash values
Read the data of the small table (call it R) and compute hash values for each row.
Oracle uses the two fastest hash algorithms available to compute two hash values for each row (a first hash value and a second hash value).
With respect to the first hash values, these partitions form the hash table.
3. Place the data into the in-memory hash buckets
After hashing, each row is placed into its bucket according to its first hash value.
The second hash value is stored within each record.
4. Create the hash bitmap
At the same time, a hash bitmap is created, mapping over the two hash values.
5. Move partitions that exceed memory to disk
If the hash area fills up, the largest partition is written to disk (temporary tablespace).
Any new row belonging to a partition already on disk causes that disk partition to be updated.
This situation seriously hurts performance, so be sure to avoid it as much as possible.
Steps 2 through 5 repeat until the whole small table has been read.
6. Sort the partitions
To make the fullest use of memory and keep as many partitions in it as possible, Oracle sorts the partitions by size in memory.
7. Read the large table and do hash matching
Oracle then begins reading the large table (call it S).
Each row is read in turn, its hash value is computed, and it is checked against the partitions in memory for a match.
If it matches, the joined data is returned.
If no in-memory partition matches, the S row is written into a new partition, whose hash value is computed with the same algorithm used for R.
This means the number of new S partitions produced equals the number of R partitions on disk. The new partitions are stored on disk (temporary tablespace).
8. Read the large table to the end
Step 7 continues until all of the large table's data has been read.
9. Process the data not yet joined
At this point a large number of R and S partitions, with their hash values computed, are stored on disk, not yet joined.
10. Second hash computation
From the sets of R and S partitions, extract the smallest partition, compute a second hash function over it, and build a hash table in memory.
The second hash function is used so that the data is distributed better.
11. Second hash match
Read the partition data from the other data source (the one, of the two, that the in-memory hash table does not belong to), and match it against the new hash table in memory. Matching rows are returned as join results.
12. Complete the full hash join
Continue processing the remaining partitions following steps 9 through 11, until all have been processed.
(vi) Hash Join modes
In Oracle, Hash Join has three modes: Optimal, One-Pass, Multi-Pass.
(1) Optimal
When the hash table built on the driving result set fits entirely in the hash area of the PGA, the join is called optimal. It proceeds roughly as follows:
① First scan the driving table to produce the driving result set.
② Generate hash buckets in the hash area, grouped into multiple partitions, and also generate a bitmap list with one bit per bucket.
③ Apply the hash function to the join key of the result set, scattering the rows into the buckets of their respective partitions. When this is done, if the key values are highly unique, the data will be spread fairly evenly over the buckets, and some buckets may be empty; the corresponding bitmap flag is 0, while buckets holding data get flag 1.
④ Start scanning the second table, applying the hash function to the join key to determine the partition and bucket to probe. Before probing the bucket, check whether its bitmap flag is 1; if it is 0, the bucket holds no data and the row is discarded immediately.
⑤ If the bitmap flag is 1, do an exact match inside the bucket; once the match is confirmed, return the joined data.
This is the optimal hash join. Its cost is essentially two full table scans, plus a trace of hash computation. The figure at the start of the post illustrates this case.
(2) One-Pass
What if the process's PGA is small, or the driving table's result set is large, exceeding the size of the hash area?
Naturally, the temporary tablespace is used, and Oracle's approach becomes a bit more complex. Note the concept of a partition mentioned above.
It can be understood this way: the data goes through two hash computations, first determining its partition, then determining its bucket.
Assume the hash area is smaller than the whole hash table, but at least larger than one partition; this is the one-pass case.
Once the hash table has been generated, some partitions stay in memory while the other partitions stay in the disk temporary tablespace; of course, a partition may also be half in memory and half on disk. The remaining steps are as follows:
① Scan the second table, applying the hash function to the join key to determine the corresponding partition and bucket.
② Check the bitmap to see whether the bucket holds data; if not, discard the row directly.
③ If it holds data and the partition is in memory, go into the corresponding bucket for an exact match; if the row matches, return the joined data, otherwise discard the row.
④ If the partition is on disk, save the row in a disk staging area, stored in the same partition/bucket form.
⑤ When the scan of the second table completes, many partitions produced from the driving table and the probe table remain on disk.
⑥ Since the data on both sides has been hashed into partitions and buckets by the same algorithm, it only remains to compare the two sides' partitions pairwise. During the comparison Oracle also optimizes: there is no strict driver/driven relationship; it picks the smaller of each partition pair as the build input, until all the disk partitions have been joined.
As you can see, compared with the optimal mode, the extra cost is re-reading once the partitions that did not fit in memory, hence the name one-pass. As long as your memory is guaranteed to hold one partition, Oracle has room to maneuver and does one pass over each disk partition.
(3) Multi-Pass
This is the most complex and worst case of hash join.
Here the hash area is so small that it cannot even hold a single partition. After the driving table is scanned, perhaps only half of one partition remains in the hash area, with the other half of that partition, plus all the remaining partitions, on disk.
The remaining steps are similar to one-pass; the difference lies in how the partitions are processed.
Since only half of a driving-table partition is in memory, when the corresponding probe-table partition's data probes it, a row that finds no match cannot simply be discarded; it must be kept on disk and joined against the remaining half of the driving-table partition.
The example here assumes memory can hold half a partition; if it holds less, a partition must be loaded, and joined, even more times.
When multi-pass occurs, the number of physical reads of a partition increases significantly.
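The three modes can be summarized in a small decision sketch (sizes in MB; the thresholds follow the descriptions above, not any exact Oracle formula):

```python
# Classify the hash join mode from the build-input size, the hash area
# size, and the fan-out (illustrative rule of thumb from the text above).
def hash_join_mode(build_mb, hash_area_mb, fan_out):
    if build_mb <= hash_area_mb:
        return "optimal"        # the whole hash table fits in the hash area
    partition_mb = build_mb / fan_out
    if partition_mb <= hash_area_mb:
        return "one-pass"       # at least one full partition fits
    return "multi-pass"         # not even one partition fits

print(hash_join_mode(50, 100, 8))     # optimal
print(hash_join_mode(400, 100, 8))    # one-pass: partition = 50 MB
print(hash_join_mode(4000, 100, 8))   # multi-pass: partition = 500 MB
```

This is also why tuning pga_aggregate_target (or hash_area_size) matters: it can move a join from multi-pass to one-pass, or from one-pass to optimal.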
(vii) Hash Join bitmap
The bitmap records whether each hash partition has a value, i.e. whether any data hashes to that position of the partition.
Its greatest use is this: if a row of the probe input finds no match in the in-memory hash table, the bitmap is checked first to decide whether the unmatched row needs to be written to disk.
Rows that cannot possibly match (the bitmap shows no data at the corresponding position) are not written to disk.
(viii) Summary
① Confirm that the small table is the driving table.
② Confirm that the tables involved and their join columns have been analyzed.
③ If the data on the join column is skewed, it is recommended to build a histogram on it.
④ If possible, increase the value of hash_area_size or pga_aggregate_target.
⑤ Hash Join is suited to joining a small table to a large table, and to joins that return large result sets.
Reproduced from: http://blog.itpub.net/26515977/viewspace-1207979/