Choosing a Hive execution engine: Tez vs. Spark

Background

The MapReduce (mr) engine is deprecated in Hive 2. The official recommendation is to use an engine such as Tez or Spark instead.

Options

Tez

Executes a job as a directed acyclic graph (DAG) of tasks and supports in-memory computation.

Spark

Serves as both a batch and a stream processing engine, so one engine covers both workloads and the learning cost is lower.
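
For reference, the engine is chosen per session through the `hive.execution.engine` property. The following is a minimal sketch, assuming a Hive version and cluster where the Tez and Spark backends are installed and configured; the table name `demo_table` is hypothetical.

```sql
-- Choose the execution engine for the current Hive session.
-- Accepted values are mr, tez and spark; each one only works if
-- that backend is installed and configured on the cluster.
SET hive.execution.engine=tez;

-- Issuing SET with the property name alone prints the current value.
SET hive.execution.engine;

-- Switch to Spark; later queries in this session are submitted to Spark.
SET hive.execution.engine=spark;
SELECT COUNT(*) FROM demo_table;  -- demo_table is a hypothetical table
```

Because the setting is per session, two users can read and write the same table with different engines.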

Problems and inconveniences

Tez:

When a Hive SQL statement uses union or join operations, Tez splits the job into smaller tasks, and each task creates its own subfolder under the table's directory.

This causes a very serious problem for anyone who uses the table downstream. Reading the table with Tez works fine, but Spark and mr do not traverse the contents of the subfolders, so queries run through those two engines return 0 rows. It is also hard to make everyone else use the same engine.

So we dropped Tez and ultimately chose Spark as our engine.
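
The subfolder problem described above can be sketched as follows. The directory layout in the comments is illustrative only: the warehouse path, the table name `t_union`, and the file names are hypothetical, and the actual subfolder names vary by Hive version. The two properties shown are standard Hive/Hadoop settings for reading input directories recursively in an mr session. Whether they are enough depends on the Hive version and file format, so treat this as an assumption to verify rather than a confirmed fix.

```sql
-- Illustrative layout after a Tez job with a union writes t_union
-- (hypothetical path and file names; subfolder names vary by version):
--   /warehouse/t_union/1/000000_0
--   /warehouse/t_union/2/000000_0
-- An engine that lists only the files directly under
-- /warehouse/t_union/ finds none, so the query returns 0 rows.

-- Possible workaround for a session running on mr (an assumption to
-- check on your own cluster, not part of the original setup):
SET hive.execution.engine=mr;
SET hive.mapred.supports.subdirectories=true;
SET mapreduce.input.fileinputformat.input.dir.recursive=true;

SELECT * FROM t_union LIMIT 10;  -- t_union is a hypothetical table
```

Every consumer of the table would have to apply these settings in their own session, which runs into the same difficulty of constraining how other people use the table.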

 

Origin www.cnblogs.com/drjava/p/10948865.html