Hive-mapjoin

There are two ways to enable map join; knowing one of them is enough.
Method 1:

`set hive.auto.convert.join=true;`

Check whether the setting took effect:

set hive.auto.convert.join;
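
Automatic conversion only happens when one side of the join is small enough to fit in memory. As a rough sketch, the size limit can be inspected and adjusted with the `hive.mapjoin.smalltable.filesize` property; the 50 MB value below is an illustrative assumption, not a recommendation from the original post.

```sql
-- Print the current small-table threshold in bytes (the default is 25000000).
set hive.mapjoin.smalltable.filesize;

-- Hypothetical override: raise the threshold to about 50 MB for this session only.
set hive.mapjoin.smalltable.filesize=50000000;
```

Tables larger than this threshold are joined with an ordinary reduce-side join even when `hive.auto.convert.join` is true.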

Create the table:
create table test1(cookieid string,cookietime string,pv int);
Test data:
insert into test1 values('cookie1','2017-12-10',1);
insert into test1 values('cookie1','2017-12-11',5);
insert into test1 values('cookie1','2017-12-12',7);
insert into test1 values('cookie1','2017-12-13',3);
insert into test1 values('cookie1','2017-12-14',2);
insert into test1 values('cookie1','2017-12-15',4);
insert into test1 values('cookie1','2017-12-16',4);
insert into test1 values('cookie2','2017-12-16',6);
insert into test1 values('cookie2','2017-12-12',7);
insert into test1 values('cookie3','2017-12-22',5);
insert into test1 values('cookie2','2017-12-24',1);
insert into test1 values('a','2017-12-01',3);
insert into test1 values('b','2017-12-00',3);

Running method 1 in the Hive CLI:
hive> set hive.auto.convert.join=true;
hive> set hive.auto.convert.join;
hive.auto.convert.join=true
hive> select  t1.pv,t1.cookieid from test1  t1 join test1  t2 on t1.cookieid=t2.cookieid;
Query ID = hadoop_20190603212611_ebe1b1ab-e7dc-400e-a3bd-e00197cee052
Total jobs = 1
Execution log at: /tmp/hadoop/hadoop_20190603212611_ebe1b1ab-e7dc-400e-a3bd-e00197cee052.log
2019-06-03 21:26:22     Starting to launch local task to process map join;      maximum memory = 518979584
2019-06-03 21:26:26     Dump the side-table for tag: 0 with group count: 5 into file: file:/tmp/hadoop/c8519b0f-a173-4caa-b9b1-9365ec863423/hive_2019-06-03_21-26-11_597_4646555486833218992-1/-local-10003/HashTable-Stage-3/MapJoin-mapfile10--.hashtable
2019-06-03 21:26:26     Uploaded 1 File to: file:/tmp/hadoop/c8519b0f-a173-4caa-b9b1-9365ec863423/hive_2019-06-03_21-26-11_597_4646555486833218992-1/-local-10003/HashTable-Stage-3/MapJoin-mapfile10--.hashtable (431 bytes)
2019-06-03 21:26:26     End of local task; Time Taken: 3.913 sec.
Execution completed successfully
MapredLocal task succeeded
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1559615069424_0004, Tracking URL = http://hadoop01:8088/proxy/application_1559615069424_0004/
Kill Command = /home/hadoop/apps/hadoop-2.6.5/bin/hadoop job  -kill job_1559615069424_0004
Hadoop job information for Stage-3: number of mappers: 2; number of reducers: 0
2019-06-03 21:26:49,625 Stage-3 map = 0%,  reduce = 0%
2019-06-03 21:27:12,749 Stage-3 map = 100%,  reduce = 0%, Cumulative CPU 7.01 sec
MapReduce Total cumulative CPU time: 7 seconds 10 msec
Ended Job = job_1559615069424_0004
MapReduce Jobs Launched: 
Stage-Stage-3: Map: 2   Cumulative CPU: 7.01 sec   HDFS Read: 12050 HDFS Write: 598 SUCCESS
Total MapReduce CPU Time Spent: 7 seconds 10 msec
OK
1       cookie1
5       cookie1
7       cookie1
3       cookie1
2       cookie1
4       cookie1
4       cookie1
1       cookie1
5       cookie1
7       cookie1
3       cookie1
2       cookie1
4       cookie1
4       cookie1
3       a

The log line `Starting to launch local task to process map join` in the output above shows that the query was executed as a map join.
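
The introduction mentions two methods, but only the first is demonstrated. The second is most likely the `MAPJOIN` query hint. The sketch below assumes that this is what was meant, and that the Hive version in use ignores such hints unless `hive.ignore.mapjoin.hint` is set to false. It reuses the `test1` table and the aliases from the query above.

```sql
-- Assumption: hints are ignored by default (hive.ignore.mapjoin.hint=true),
-- so turn that off for this session before relying on the hint.
set hive.ignore.mapjoin.hint=false;

-- Ask Hive to load alias t2 into memory as the hash-table side of the join.
select /*+ MAPJOIN(t2) */ t1.pv, t1.cookieid
from test1 t1
join test1 t2 on t1.cookieid = t2.cookieid;

-- Inspect the plan without running the job. A map join should appear as
-- a "Map Join Operator" rather than a reduce-side "Join Operator".
explain
select /*+ MAPJOIN(t2) */ t1.pv, t1.cookieid
from test1 t1
join test1 t2 on t1.cookieid = t2.cookieid;
```

Checking the `explain` output is a cheaper way to confirm the join strategy than scanning the job log, since it does not launch a MapReduce job.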

Reposted from blog.csdn.net/weixin_42177380/article/details/90759292