Hive dynamic partitioning by day fails with "Java heap space"

I wanted to import a full year (365 days) of data from the newlogs table in the ODS layer into the logs table in the DWD layer, partitioned by day, but the job failed with an error. The details are as follows.

Before executing the SQL, I enabled dynamic partitioning and set the following parameters:

set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=nonstrict;
set hive.exec.max.dynamic.partitions.pernode=3000;
set hive.exec.max.dynamic.partitions=6000;
set mapreduce.map.memory.mb=2048;
set mapreduce.reduce.memory.mb=3072;

The HQL statement is as follows:

insert overwrite table dwd_myshops.dwd_logs partition(date)
select userid,event,time,goodid,title,price,shopid,mark,
from_unixtime(cast(time/1000 as bigint),'yyyyMMdd') date
from ods_myshops.ods_newlogs;

The error output is as follows:

MapReduce Total cumulative CPU time: 17 seconds 220 msec
Ended Job = job_1616718205783_0010 with errors
Error during job, obtaining debugging information...
Examining task ID: task_1616718205783_0010_m_000000 (and more) from job job_1616718205783_0010

Task with the most failures(4):
-----
Task ID:
  task_1616718205783_0010_m_000000

URL:
  http://0.0.0.0:8088/taskdetails.jsp?jobid=job_1616718205783_0010&tipid=task_1616718205783_0010_m_000000
-----
Diagnostic Messages for this Task:
Error: Java heap space

FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
MapReduce Jobs Launched:
Stage-Stage-1: Map: 1   Cumulative CPU: 17.22 sec   HDFS Read: 0 HDFS Write: 0 FAIL
Total MapReduce CPU Time Spent: 17 seconds 220 msec

I then changed the dynamic partition parameters to the following:

set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=nonstrict;
set hive.exec.max.dynamic.partitions.pernode=3000;
set hive.optimize.sort.dynamic.partition=true;

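hive.optimize.sort.dynamic.partition=true
With this parameter enabled, Hive sorts the rows by the dynamic partition column and adds a reduce stage, so each reducer keeps only one record writer open at a time and each partition is usually written as a single file. Without it, a map task holds an open writer for every partition it encounters (up to 365 here), which is the likely cause of the Java heap space error in the map task above.
This solves the OOM problem of dynamic partitioning, but the extra sort and shuffle make the job noticeably slower.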
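
For comparison only, a commonly used alternative with a similar effect is to add DISTRIBUTE BY on the partition expression, which also forces a reduce stage and sends all rows of one day to the same reducer. The sketch below reuses the tables and columns from this post; it was not run as part of the original test and is an assumption, not the fix used here. The run described next used the sort setting above, not this variant.

```sql
set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=nonstrict;
set hive.exec.max.dynamic.partitions.pernode=3000;

-- Assumption: untested variant, not from the original post.
-- DISTRIBUTE BY routes every row of a given day to one reducer,
-- so each task writes to fewer partitions at the same time.
insert overwrite table dwd_myshops.dwd_logs partition(date)
select userid,event,time,goodid,title,price,shopid,mark,
from_unixtime(cast(time/1000 as bigint),'yyyyMMdd') date
from ods_myshops.ods_newlogs
distribute by from_unixtime(cast(time/1000 as bigint),'yyyyMMdd');
```

With hash distribution a reducer can still receive several days, so this reduces the number of open writers per task rather than guaranteeing exactly one.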

With hive.optimize.sort.dynamic.partition=true set, I re-ran the HQL statement and the data was partitioned by day successfully.
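
One way to confirm the result is to list the partitions of the target table and count rows per day. This is a minimal sketch using the table name from the statement above; the backticks around `date` are a precaution because DATE is a reserved keyword in newer Hive versions, and the exact output depends on the data.

```sql
-- List the daily partitions created by the insert (one entry per day, e.g. date=yyyyMMdd)
show partitions dwd_myshops.dwd_logs;

-- Optional check: number of rows loaded into each daily partition
select `date`, count(*) as cnt
from dwd_myshops.dwd_logs
group by `date`;
```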

Origin blog.csdn.net/weixin_48482704/article/details/115295477