Commonly used partitioning techniques for parallel databases

Round-robin partitioning: Scan the relation sequentially and store the i-th tuple on disk D_{i mod n}, where n is the number of disks. This distributes the tuples evenly across the disks, so the disk sizes differ by at most one tuple.
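A minimal sketch of this placement rule in Python, assuming each "disk" is modeled as an in-memory list and tuples are numbered from 0; the function name `round_robin_partition` is hypothetical and chosen only for illustration:

```python
def round_robin_partition(tuples, n_disks):
    """Place the i-th tuple (0-based) on disk number i mod n_disks."""
    # Each disk is modeled as a plain list (an assumption for this sketch).
    disks = [[] for _ in range(n_disks)]
    for i, row in enumerate(tuples):
        disks[i % n_disks].append(row)
    return disks


if __name__ == "__main__":
    # Seven tuples spread over three disks.
    print(round_robin_partition(list(range(7)), 3))
    # prints [[0, 3, 6], [1, 4], [2, 5]]
```

Because placement depends only on a tuple's position in the scan, no disk ends up with more than one extra tuple, whatever the data looks like.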

 

Hash partitioning: Choose a hash function whose range is {0, 1, …, n-1} and apply it to the partitioning attribute(s) of each tuple in the relation. If the hash function returns i, store the tuple on the i-th disk.
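The same idea can be sketched in Python. This is an illustration under stated assumptions, not a reference implementation: tuples are modeled as dicts, disks as lists, and the hash function is `zlib.crc32` reduced modulo n, which is just one convenient choice. The name `hash_partition` is hypothetical.

```python
import zlib


def hash_partition(tuples, key, n_disks):
    """Store each tuple on disk h(tuple[key]), where h maps into 0..n_disks-1."""
    disks = [[] for _ in range(n_disks)]
    for row in tuples:
        # crc32 is stable across runs; reducing it mod n_disks gives a
        # value in {0, 1, ..., n_disks - 1}.
        h = zlib.crc32(str(row[key]).encode("utf-8")) % n_disks
        disks[h].append(row)
    return disks


if __name__ == "__main__":
    rows = [{"id": k} for k in ["a", "b", "a", "c", "b", "a"]]
    for i, disk in enumerate(hash_partition(rows, "id", 4)):
        print(i, disk)
```

Python's built-in `hash()` is avoided here because string hashing is randomized per process by default, so the same key could land on a different disk on each run. With a stable hash, all tuples that share a partitioning value are always stored together on one disk.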

 

Range partitioning: This strategy divides the relation into several parts according to value ranges of a chosen attribute and stores each part on a separate disk. Of the three methods, round-robin is the best suited to applications that scan the entire relation: the scan can read from all disks in parallel, the load is balanced, and the available parallelism is fully used.
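To make range partitioning concrete, here is a small Python sketch. It assumes the value ranges are given as a sorted list of split points (a hypothetical `boundaries` parameter), that tuples are dicts, and that disks are lists; the function name `range_partition` is likewise only illustrative.

```python
from bisect import bisect_right


def range_partition(tuples, key, boundaries):
    """Split tuples by value range of `key`.

    `boundaries` is a sorted list of split points [b0, b1, ...]:
    disk 0 holds values < b0, disk 1 holds b0 <= value < b1, and the
    last disk holds values >= the final boundary.
    """
    disks = [[] for _ in range(len(boundaries) + 1)]
    for row in tuples:
        # bisect_right counts how many boundaries are <= the value,
        # which is exactly the index of the range the value falls in.
        disks[bisect_right(boundaries, row[key])].append(row)
    return disks


if __name__ == "__main__":
    rows = [{"age": a} for a in (5, 10, 19, 20, 42)]
    print(range_partition(rows, "age", [10, 20]))
    # prints [[{'age': 5}], [{'age': 10}, {'age': 19}], [{'age': 20}, {'age': 42}]]
```

Unlike round-robin, the sizes of the parts here depend on how the attribute values are distributed, so poorly chosen boundaries can leave one disk holding most of the data.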

 


Origin blog.csdn.net/qq_41048982/article/details/100604536