Spark Learning from 0 to 1 (10) - Spark Tuning (4) - Executor Off-Heap Memory Tuning

1. Adjust the Executor's off-heap memory

Spark's shuffle transfers run on Netty. Netty allocates off-heap memory for its network buffers (this is how it achieves zero copy), so every shuffle consumes off-heap memory. By default, the off-heap memory limit is 10% of each Executor's memory. With genuinely large data sets this default is often too small, and Spark jobs crash repeatedly and fail to complete. When that happens, raise the limit to at least 1G, or even 2G or 4G.
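To make the default concrete, here is a minimal sketch of the arithmetic. It assumes the rule given in Spark's configuration documentation, where the overhead is 10% of Executor memory with a floor of 384 MiB; the floor and the function names are assumptions that do not appear in the text above, so check them against the Spark version in use.

```python
# Assumed defaults, taken from the Spark configuration docs rather than this article.
MIN_OVERHEAD_MIB = 384   # assumed floor for the overhead
OVERHEAD_FACTOR = 0.10   # the 10% mentioned above


def default_overhead_mib(executor_memory_mib):
    """Off-heap overhead Spark would pick when no explicit value is set."""
    return max(MIN_OVERHEAD_MIB, int(executor_memory_mib * OVERHEAD_FACTOR))


def container_request_mib(executor_memory_mib, overhead_mib=None):
    """Total memory requested per Executor: heap plus off-heap overhead."""
    if overhead_mib is None:
        overhead_mib = default_overhead_mib(executor_memory_mib)
    return executor_memory_mib + overhead_mib


if __name__ == "__main__":
    for heap in (2048, 8192):
        print(heap, default_overhead_mib(heap), container_request_mib(heap))
    # An explicit 2G overhead on an 8G Executor:
    print(container_request_mib(8192, overhead_mib=2048))
```

Under these assumptions, an Executor with an 8G heap gets well under 1G of off-heap memory by default, which is why the explicit setting matters for heavy shuffles.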

When an Executor performs a shuffle read, it first tries to obtain the data it needs locally, through its associated MapOutputTrackerWorker. If the local Block Manager does not have the data, the Executor uses TransferService to open a remote network connection to the Block Manager of an Executor on another node and pull the data from there. Meanwhile, frequent object creation on that remote Executor can fill its JVM heap and trigger garbage collection. While the JVM of that Executor is collecting garbage, the Executor stops working and cannot respond. The default timeout of Spark's network connection is 60s. If the connection cannot be established within 60s, the task fails, and a "shuffle file not found" error appears.

1.1 How to adjust the waiting time?

Add the following to the ./spark-submit script that submits the job:

--conf spark.core.connection.ack.wait.timeout=300
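For jobs launched from a wrapper script, the flag can be assembled programmatically. The following is a minimal sketch: the helper name `build_submit_command`, the jar `my-job.jar`, and the class `com.example.MyJob` are hypothetical placeholders, and only the configuration key and value come from the text above.

```python
import shlex


def build_submit_command(app_jar, main_class, conf):
    """Return a spark-submit argv list with one --conf flag per setting."""
    argv = ["spark-submit", "--class", main_class]
    for key, value in conf.items():
        argv += ["--conf", f"{key}={value}"]
    argv.append(app_jar)  # the application jar goes after all options
    return argv


cmd = build_submit_command(
    app_jar="my-job.jar",             # hypothetical jar name
    main_class="com.example.MyJob",   # hypothetical main class
    conf={"spark.core.connection.ack.wait.timeout": 300},
)

if __name__ == "__main__":
    # Print the command for inspection instead of running it.
    print(shlex.join(cmd))
```

Building the command as a list keeps each argument separate, so it can be passed to `subprocess.run` without shell quoting problems.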

1.2 How to adjust the off-heap memory?

When an Executor dies because its heap or off-heap memory is insufficient, the Block Manager on that Executor goes down with it. The corresponding shuffle map output files can then no longer be found, and the reduce side cannot pull its data. Increasing the off-heap memory addresses this.

Add the following to the ./spark-submit script that submits the job (the value is in MB, so 2048 means 2G):

  • On YARN:

    --conf spark.yarn.executor.memoryOverhead=2048

  • In standalone mode:

    --conf spark.executor.memoryOverhead=2048
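The two cases listed above can be captured in a small lookup, sketched below. The mapping simply mirrors this article; `OVERHEAD_KEYS` and `overhead_flags` are hypothetical names introduced for illustration. Newer Spark releases document `spark.executor.memoryOverhead` as the general key, so the configuration page for the version in use is worth checking before relying on this.

```python
# Mapping as given in this article; verify the key against your Spark version.
OVERHEAD_KEYS = {
    "yarn": "spark.yarn.executor.memoryOverhead",
    "standalone": "spark.executor.memoryOverhead",
}


def overhead_flags(cluster_manager, overhead_mib):
    """Return the --conf pair that sets off-heap overhead for a deployment."""
    try:
        key = OVERHEAD_KEYS[cluster_manager]
    except KeyError:
        raise ValueError(f"unknown cluster manager: {cluster_manager!r}")
    return ["--conf", f"{key}={overhead_mib}"]


if __name__ == "__main__":
    for manager in ("yarn", "standalone"):
        print(manager, " ".join(overhead_flags(manager, 2048)))
```

Failing loudly on an unknown deployment name avoids silently submitting a job without the overhead setting.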
    

Origin blog.csdn.net/dwjf321/article/details/109056333