Spark 'PHYSICAL' memory limit. Current usage: 1.6 GB of 1.5 GB physical memory used

A Spark application restarted unexpectedly while it was running. The error log showed the following:

Failing this attempt.Diagnostics: [2023-04-24 10:38:09.784]Container [pid=5935,containerID=container_1682299008692_0019_01_000001] is running 111923200B beyond the 'PHYSICAL' memory limit. Current usage: 1.6 GB of 1.5 GB physical memory used; 3.5 GB of 3.1 GB virtual memory used. Killing container.

Dump of the process-tree for container_1682299008692_0019_01_000001 :

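In plain terms:
The attempt failed. Container container_1682299008692_0019_01_000001 (pid 5935) exceeded its physical memory limit by 111923200 bytes: it was using 1.6 GB against a 1.5 GB physical limit and 3.5 GB against a 3.1 GB virtual limit, so YARN killed the container.

The message is followed by a dump of the process tree for container_1682299008692_0019_01_000001.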
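The figures in the message are consistent with each other, which a little arithmetic can confirm. The sketch below assumes that "GB" in the message means GiB and that the virtual limit is derived from the default yarn.nodemanager.vmem-pmem-ratio of 2.1; neither assumption is stated in the log itself.

```shell
# Check the figures in the diagnostic message with integer arithmetic.
# Assumption: "GB" in the message means GiB (1024^3 bytes).

limit_mb=1536                                   # 1.5 GB physical limit
limit_bytes=$(( limit_mb * 1024 * 1024 ))
over_bytes=111923200                            # overshoot reported in the log
usage_bytes=$(( limit_bytes + over_bytes ))
usage_mb=$(( usage_bytes / 1024 / 1024 ))       # rounded down to whole MB

# Assumption: virtual limit = physical limit * 2.1, the default value of
# yarn.nodemanager.vmem-pmem-ratio. Multiplied by 21/10 to stay in integers.
vmem_limit_mb=$(( limit_mb * 21 / 10 ))

echo "physical usage: ${usage_mb} MB (limit ${limit_mb} MB)"
echo "virtual limit:  ${vmem_limit_mb} MB"
```

Under these assumptions the physical usage comes to 1642 MB (about 1.6 GB) and the virtual limit to 3225 MB (about 3.1 GB), matching the log. The physical limit is the one named in the message, so that is the one to raise.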

Solution:
Set the memory explicitly when submitting the Spark job.
Add the parameters --driver-memory 4G --executor-memory 4G,
as follows:

spark-submit --master yarn --deploy-mode cluster --class com.bigdata.work.DataWorker --jars hdfs:///user/jar/* --driver-memory 4G --driver-cores 8 --num-executors 4 --executor-memory 4G /bigdata/sparkJob/DataWorker.jar
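To see why the container was capped at 1.5 GB and what the 4G setting changes, the container size can be estimated by hand. This is a rough sketch, not YARN's actual allocation code: it assumes the memory overhead is max(384 MB, 10% of the heap), as in recent Spark versions, and that the scheduler rounds requests up to a multiple of 512 MB, which is cluster-specific. The function name container_mb is made up for illustration.

```shell
# container_mb HEAP_MB [INCREMENT_MB]
# Estimate the YARN container size for a given JVM heap.
# Assumptions: overhead = max(384 MB, 10% of heap); requests are rounded up
# to a multiple of INCREMENT_MB (512 here, which depends on the cluster).
container_mb() {
  heap_mb=$1
  increment_mb=${2:-512}
  overhead_mb=$(( heap_mb / 10 ))
  if [ "$overhead_mb" -lt 384 ]; then
    overhead_mb=384
  fi
  request_mb=$(( heap_mb + overhead_mb ))
  # Round up to the next multiple of the allocation increment.
  echo $(( (request_mb + increment_mb - 1) / increment_mb * increment_mb ))
}

echo "default 1G driver: $(container_mb 1024) MB"
echo "4G driver:         $(container_mb 4096) MB"
```

With these assumptions the default 1G driver lands in a 1536 MB (1.5 GB) container, the limit shown in the log, while --driver-memory 4G yields 4608 MB. If the overrun comes from off-heap memory rather than the heap, raising spark.driver.memoryOverhead via --conf may be an alternative worth trying.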

Parameter reference:

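【--master yarn】Run on a Hadoop YARN cluster

--master sets the master URL.
Accepted values include spark://host:port, mesos://host:port, yarn, yarn-cluster, yarn-client, and local. The yarn-cluster and yarn-client forms have been deprecated since Spark 2.0; use --master yarn together with --deploy-mode instead.

- local: the local machine.
- spark://host:port: a remote Spark standalone cluster.
- yarn: a YARN cluster.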

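【--deploy-mode cluster】Launch the Spark driver inside the cluster

Where the driver program runs: client or cluster.
With client, the driver starts locally on the machine that runs spark-submit; with cluster, it starts on one of the worker machines inside the cluster. The default is client.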
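The deploy mode also matters for this error, because the killed container (suffix _000001) is normally the YARN ApplicationMaster. In cluster mode that container hosts the driver and is sized by spark.driver.memory; in client mode it is sized by spark.yarn.am.memory while the driver runs locally. A minimal sketch of that rule, with a hypothetical helper name:

```shell
# am_container_setting DEPLOY_MODE
# Print the Spark property that sizes the ApplicationMaster container on YARN.
# The helper name is hypothetical; the two property names are Spark's own.
am_container_setting() {
  case "$1" in
    cluster) echo "spark.driver.memory" ;;    # driver runs inside the AM
    client)  echo "spark.yarn.am.memory" ;;   # AM is separate from the driver
    *)       echo "unknown deploy mode: $1" >&2; return 1 ;;
  esac
}

am_container_setting cluster
am_container_setting client
```

Since the command in this post uses --deploy-mode cluster, --driver-memory is the flag that enlarges the container that was killed.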

【--class com.bigdata.work.DataWorker】Set the main class

The fully qualified name of the main class, including the package.

【--jars hdfs:///user/jar/*】Third-party dependency JARs

--jars takes a comma-separated list of the third-party JARs the application depends on.
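When the dependencies sit in a local directory rather than on HDFS, the comma-separated list can be built with a small helper instead of being typed by hand. This is only a sketch: join_jars and the JAR_DIR variable are hypothetical names, and it handles local paths only.

```shell
# join_jars DIR
# Print every *.jar file in DIR as a single comma-separated string,
# suitable for passing to --jars. Local directories only.
join_jars() {
  dir=$1
  list=""
  for jar in "$dir"/*.jar; do
    [ -e "$jar" ] || continue          # skip the literal pattern if nothing matches
    list="${list:+$list,}$jar"         # prepend a comma only after the first entry
  done
  printf '%s\n' "$list"
}

# Example: list the JARs in JAR_DIR (defaults to the current directory).
join_jars "${JAR_DIR:-.}"
```

The result could then be passed as --jars "$(join_jars /some/local/lib)" on the spark-submit command line, where the directory is a placeholder.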

【--driver-memory 4G】Give the driver 4 GB of memory

Amount of memory for the driver program. The default is 1G.

【--driver-cores 8】Give the driver 8 cores

Number of cores used by the driver. It applies only in cluster deploy mode; the default is 1.

【--num-executors 4】Number of executors to launch

Total number of executors to launch; the default is 2. Used on YARN.

【--executor-memory 4G】Give each executor 4 GB of memory

Memory per executor; the default is 1G.

--executor-cores: number of cores per executor; used on YARN or in standalone mode.
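Taken together, these flags determine how much the job asks of the cluster. The estimate below is a sketch under stated assumptions: memory overhead is max(384 MB, 10% of the heap), scheduler rounding is ignored, and executor_cores=2 is a made-up value, since the command in this post does not set --executor-cores.

```shell
# Rough footprint of the job described by the spark-submit flags above.
num_executors=4          # --num-executors 4
executor_heap_mb=4096    # --executor-memory 4G
driver_heap_mb=4096      # --driver-memory 4G
executor_cores=2         # hypothetical; not set in the original command

# with_overhead HEAP_MB: heap plus max(384 MB, 10% of heap) overhead (assumed rule).
with_overhead() {
  heap_mb=$1
  overhead_mb=$(( heap_mb / 10 ))
  if [ "$overhead_mb" -lt 384 ]; then
    overhead_mb=384
  fi
  echo $(( heap_mb + overhead_mb ))
}

executor_mb=$(with_overhead "$executor_heap_mb")
driver_mb=$(with_overhead "$driver_heap_mb")
total_mb=$(( num_executors * executor_mb + driver_mb ))
task_slots=$(( num_executors * executor_cores ))

echo "per executor:  ${executor_mb} MB"
echo "total request: ${total_mb} MB"
echo "task slots:    ${task_slots}"
```

Checking the total against the memory the YARN queue can actually grant helps avoid trading a killed container for a job that waits indefinitely for resources.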
