What join types does Hive support?

  Inner join (equi-join), outer join (equi-join), Cartesian product (cross join), and map-side join (which also supports non-equi joins).

  Map-side join: In most cases Hive launches a MapReduce job for each join. The rows of each table being joined are usually spread across different map tasks, so rows that share the same join key may sit in different map tasks and can only be brought together and joined in the reduce phase. If one of the joined tables is small, however, it can be loaded entirely into memory and matched against the large table row by row, which lets the join skip the reduce phase of a conventional join. For a map-side join to work, one condition must hold: only one table may have its data split across map tasks, and every other joined table must be available as a complete copy in each map task.
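
  The mechanism can be illustrated outside Hive with a minimal Python sketch. This is not Hive code: the function name `map_side_join` and the sample tables `orders` and `users` are hypothetical, and the sketch only mimics what a single map task would do, assuming the small table fits in memory.

```python
from collections import defaultdict


def map_side_join(big_rows, small_rows, key):
    """Inner equi-join of big_rows with small_rows on `key`, with no reduce step."""
    # Build phase: load the whole small table into an in-memory hash table.
    # In a real map-side join, every map task holds its own complete copy.
    lookup = defaultdict(list)
    for row in small_rows:
        lookup[row[key]].append(row)

    # Probe phase: stream the big table row by row and match it against the
    # hash table. Nothing is shuffled, so no reduce phase is needed.
    for big in big_rows:
        for small in lookup.get(big[key], []):
            yield {**small, **big}


# Hypothetical sample data: `orders` plays the large table, `users` the small one.
orders = [
    {"user_id": 1, "amount": 30},
    {"user_id": 2, "amount": 15},
    {"user_id": 1, "amount": 7},
    {"user_id": 9, "amount": 1},
]
users = [
    {"user_id": 1, "name": "ann"},
    {"user_id": 2, "name": "bob"},
]

joined = list(map_side_join(orders, users, "user_id"))
```

  Here the order with `user_id` 9 has no match in `users` and is dropped, as in any inner join. Because each map task can finish its part of the join alone, the rows of the large table never need to be regrouped by key.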

  Usage scenarios: 1. One of the joined tables is small. 2. A non-equi join is required.
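
  The second scenario can be sketched the same way. With the small table held in memory, each row of the large table can be tested against it with an arbitrary condition, not just key equality. This is again a Python illustration rather than Hive code, and the names `map_side_theta_join`, `payments`, and `bands` are hypothetical.

```python
def map_side_theta_join(big_rows, small_rows, predicate):
    """Join big_rows with small_rows on an arbitrary condition, map-side only."""
    # The small table is kept entirely in memory.
    small = list(small_rows)

    # Each big row is compared with every small row. A hash lookup is not
    # possible for a non-equality condition, so this is a nested loop.
    for big in big_rows:
        for s in small:
            if predicate(big, s):
                yield {**big, **s}


# Hypothetical sample data: classify each payment into an amount band.
payments = [{"amount": 7}, {"amount": 30}, {"amount": 10}]
bands = [
    {"low": 0, "high": 10, "band": "small"},
    {"low": 10, "high": 100, "band": "large"},
]

# Non-equi join condition: low <= amount < high.
matched = list(
    map_side_theta_join(
        payments, bands, lambda p, b: b["low"] <= p["amount"] < b["high"]
    )
)
```

  The nested loop costs one comparison per pair of rows, which is only acceptable because one side is small.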
