I wanted to load a full year (365 days) of data from the newlogs table in the ODS layer into the logs table in the DWD layer, partitioned by day, but the job failed. The details are as follows.
Before running the SQL, I enabled dynamic partitioning and set these parameters:
set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=nonstrict;
set hive.exec.max.dynamic.partitions.pernode=3000;
set hive.exec.max.dynamic.partitions=6000;
set mapreduce.map.memory.mb=2048;
set mapreduce.reduce.memory.mb=3072;
This is the HQL statement:
insert overwrite table dwd_myshops.dwd_logs partition(date)
select userid,event,time,goodid,title,price,shopid,mark,
from_unixtime(cast(time/1000 as bigint),'yyyyMMdd') date
from ods_myshops.ods_newlogs;
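Before running an insert like this, it can help to know how many partitions it will create, so the number can be compared with hive.exec.max.dynamic.partitions and hive.exec.max.dynamic.partitions.pernode. A minimal sketch, reusing the same expression as the insert (the alias day_count is only illustrative):

```sql
-- Count the distinct day values the insert would turn into partitions.
-- day_count is a hypothetical alias, not part of the original job.
select count(distinct from_unixtime(cast(time/1000 as bigint),'yyyyMMdd')) as day_count
from ods_myshops.ods_newlogs;
```

For a full year of logs this should come out at around 365, well under the limits set above, which suggests the failure is about memory rather than the partition-count limits.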
The error output was:
MapReduce Total cumulative CPU time: 17 seconds 220 msec
Ended Job = job_1616718205783_0010 with errors
Error during job, obtaining debugging information...
Examining task ID: task_1616718205783_0010_m_000000 (and more) from job job_1616718205783_0010
Task with the most failures(4):
-----
Task ID:
task_1616718205783_0010_m_000000
URL:
http://0.0.0.0:8088/taskdetails.jsp?jobid=job_1616718205783_0010&tipid=task_1616718205783_0010_m_000000
-----
Diagnostic Messages for this Task:
Error: Java heap space
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
MapReduce Jobs Launched:
Stage-Stage-1: Map: 1 Cumulative CPU: 17.22 sec HDFS Read: 0 HDFS Write: 0 FAIL
Total MapReduce CPU Time Spent: 17 seconds 220 msec
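The log shows a single map task (Map: 1) failing four times with "Java heap space". One thing worth checking, as an assumption about this cluster and not a confirmed cause, is whether the JVM heap actually followed the container size: on some Hadoop versions mapreduce.map.memory.mb only sizes the YARN container, and the heap is set separately through mapreduce.map.java.opts. A sketch with illustrative values (roughly 80% of each container size):

```sql
-- Container sizes, as in the original settings.
set mapreduce.map.memory.mb=2048;
set mapreduce.reduce.memory.mb=3072;
-- JVM heap inside each container; the -Xmx values are assumptions, not tuned figures.
set mapreduce.map.java.opts=-Xmx1638m;
set mapreduce.reduce.java.opts=-Xmx2457m;
```

Raising the heap only postpones the problem if the number of partitions written by one task keeps growing.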
I then changed the dynamic partition parameters to:
set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=nonstrict;
set hive.exec.max.dynamic.partitions.pernode=3000;
set hive.optimize.sort.dynamic.partition=true;
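About hive.optimize.sort.dynamic.partition=true:
With this parameter enabled, rows are sorted by the dynamic partition column before they are written, so each reducer keeps only one record writer open at a time and each partition is written as a single file. Without it, a task keeps a writer open for every partition it encounters (up to 365 here), which is the likely cause of the heap exhaustion. This solves the OOM problem of dynamic partitioning,
but the extra sort and reduce stage can significantly slow down the write to each partition.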
After re-running the HQL statement with these settings, the load partitioned by day succeeded.
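If the sorted write turns out to be too slow, another option is to load the year in smaller batches so that each job writes fewer partitions at once. The following is only a sketch under assumptions: the date range is hypothetical (the actual year of the data is not stated here), and it relies on a dynamic-partition insert overwrite replacing only the partitions that the query produces.

```sql
set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=nonstrict;

-- Load one month at a time; '20200101' and '20200131' are placeholder dates.
insert overwrite table dwd_myshops.dwd_logs partition(date)
select userid,event,time,goodid,title,price,shopid,mark,
from_unixtime(cast(time/1000 as bigint),'yyyyMMdd') date
from ods_myshops.ods_newlogs
where from_unixtime(cast(time/1000 as bigint),'yyyyMMdd') between '20200101' and '20200131';

-- Check which day partitions now exist in the target table.
show partitions dwd_myshops.dwd_logs;
```

Repeating the insert with the next date range fills in the remaining months, and each run keeps at most about 31 partition writers open.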