Spark Learning Road (12): SparkCore Resource Tuning

Excerpted from: https://tech.meituan.com/spark-tuning-basic.html

1. Overview

After a Spark job has been developed, the next step is to configure appropriate resources for it. Spark's resource parameters can basically all be set as arguments to the spark-submit command. Many Spark beginners do not know which parameters need to be set or how to set them, so in the end they set them arbitrarily or not at all. Unreasonable resource settings can mean that cluster resources are not fully used and the job runs extremely slowly; or the requested resources may be so large that the queue cannot provide them, leading to all kinds of exceptions. In either case, the Spark job runs inefficiently or does not run at all. We therefore need a clear understanding of how Spark jobs use resources, which resource parameters can be set while a job runs, and how to choose appropriate values for them.

2. Basic operating principles of a Spark job

The detailed principle is illustrated in the figure in the original article. After we submit a Spark job with spark-submit, the job starts a corresponding Driver process. Depending on the deployment mode (deploy-mode) used, the Driver process may be started locally or on a worker node in the cluster. The Driver process itself occupies a certain amount of memory and CPU cores according to the parameters we set. The first thing the Driver process does is request, from the cluster manager (either a Spark Standalone cluster or another resource manager; Meituan-Dianping uses YARN), the resources needed to run the Spark job. The resources here refer to Executor processes. The YARN cluster manager starts a certain number of Executor processes on the worker nodes according to the resource parameters we set for the Spark job, and each Executor process occupies a certain amount of memory and CPU cores.

After requesting the resources needed to execute the job, the Driver process begins scheduling and executing the job code we wrote. The Driver splits the job code into multiple stages, each stage executing a portion of the code, and creates a batch of tasks for each stage; these tasks are then distributed to the Executor processes for execution. A task is the smallest unit of computation: every task in a stage executes exactly the same computation logic (that is, a fragment of our own code), but on a different portion of the data. Once all tasks of a stage have finished, the intermediate results are written to local disk files on each node, and the Driver schedules the next stage. The input of the next stage's tasks is the intermediate output of the previous stage. This cycle repeats until all of our code logic has been executed and all the data has been processed to produce the desired result.

Spark divides stages at shuffle operators. If a shuffle-class operator (such as reduceByKey or join) appears in our code, a stage boundary is drawn at that operator. Roughly speaking, the code before the shuffle operator belongs to one stage, and the code from the shuffle operator onward belongs to the next stage. Therefore, when a stage begins to execute, each of its tasks may need to pull, over the network, all the keys it is responsible for from the nodes where the previous stage's tasks ran, and then apply our own operator function (for example, the function passed to reduceByKey()) to aggregate all identical keys it has pulled. This process is called the shuffle.
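As a minimal illustration (the HDFS paths are hypothetical), the following word-count-style snippet contains one shuffle operator, so Spark splits it into two stages: the map side before reduceByKey and the reduce side after it.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object StageSplitExample {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("StageSplitExample")
    val sc = new SparkContext(conf)

    // Stage 1: read the file and map each word to a (word, 1) pair.
    // These are narrow dependencies, so they stay in one stage.
    val pairs = sc.textFile("hdfs:///tmp/input.txt")   // hypothetical path
      .flatMap(_.split(" "))
      .map(word => (word, 1))

    // reduceByKey is a shuffle operator: the stage boundary is drawn here.
    // Stage 2: each task pulls its keys from stage 1's output and aggregates them.
    val counts = pairs.reduceByKey(_ + _)

    counts.saveAsTextFile("hdfs:///tmp/output")        // hypothetical path
    sc.stop()
  }
}
```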

When we call persistence operations such as cache/persist in the code, the data computed by each task is also saved into the Executor process's memory, or to disk files on its node, depending on the persistence level we choose.
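A minimal sketch of persistence, assuming a spark-shell session where `sc` already exists and the input paths are hypothetical: MEMORY_ONLY keeps partitions only in Executor memory, while MEMORY_AND_DISK spills partitions that do not fit to local disk.

```scala
import org.apache.spark.storage.StorageLevel

// Assume `sc` is an existing SparkContext; the paths are hypothetical.
val logs   = sc.textFile("hdfs:///tmp/access.log")
val events = sc.textFile("hdfs:///tmp/events.log")

// cache() is shorthand for persist(StorageLevel.MEMORY_ONLY):
// partitions are kept in Executor memory and recomputed if evicted.
logs.cache()

// MEMORY_AND_DISK keeps partitions in memory when possible and
// writes the ones that do not fit to the node's local disk.
events.persist(StorageLevel.MEMORY_AND_DISK)

// The first action materializes and stores the data; later actions reuse it.
println(logs.count())
println(events.count())
```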

Therefore, the memory of an Executor is mainly divided into three parts: the first is for tasks to execute the code we wrote, which by default takes 20% of the Executor's total memory; the second is for tasks to hold the output pulled from the previous stage's tasks during shuffle and to perform aggregation and similar operations on it, which also defaults to 20% of the Executor's total memory; the third is for RDD persistence, which by default takes 60% of the Executor's total memory.
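A quick worked example of the default split, assuming an illustrative 8 GB Executor and the default fractions described above:

```scala
// Default Executor memory split (illustrative 8 GB executor).
val executorMemoryGB = 8.0
val taskExecutionGB  = executorMemoryGB * 0.2  // code execution: 1.6 GB
val shuffleAggGB     = executorMemoryGB * 0.2  // shuffle aggregation: 1.6 GB
val rddStorageGB     = executorMemoryGB * 0.6  // RDD persistence: 4.8 GB
println(s"execution=$taskExecutionGB GB, shuffle=$shuffleAggGB GB, storage=$rddStorageGB GB")
```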

The execution speed of tasks is directly related to the number of CPU cores of each Executor process. A CPU core can execute only one thread at a time, and the multiple tasks assigned to an Executor process run concurrently as threads, one thread per task. If the number of CPU cores is sufficient and the number of assigned tasks is reasonable, these task threads can generally be executed quickly and efficiently.

This is the basic picture of how a Spark job runs; it is easier to follow in combination with the figure in the original article. Understanding these basic principles is the prerequisite for resource parameter tuning.

3. Resource parameter tuning

Once the basic principles of Spark job execution are understood, the resource-related parameters are easy to understand. So-called Spark resource parameter tuning mainly means optimizing resource usage efficiency by adjusting parameters at the various points where resources are used during Spark execution, thereby improving job performance. The following are the main resource parameters in Spark. Each corresponds to a part of the execution model described above, and a reference value for tuning is given for each.

3.1 num-executors

  • Parameter description: This parameter sets the total number of Executor processes used to execute the Spark job. When the Driver requests resources from the YARN cluster manager, YARN starts the corresponding number of Executor processes on the cluster's worker nodes according to your setting. This parameter is very important: if it is not set, only a small number of Executor processes are started by default, and the Spark job runs very slowly.
  • Parameter tuning suggestion: Setting roughly 50 to 100 Executor processes per Spark job is generally appropriate; too few or too many is not good. With too few, cluster resources cannot be fully used; with too many, most queues cannot provide sufficient resources.

3.2 executor-memory

  • Parameter description: This parameter sets the memory of each Executor process. The Executor memory size often directly determines the performance of a Spark job, and it is also directly related to the common JVM OOM exceptions.
  • Parameter tuning suggestion: 4G~8G of memory per Executor process is usually appropriate. This is only a reference value; the actual setting depends on your department's resource queue. Check the maximum memory limit of your team's resource queue: num-executors multiplied by executor-memory must not exceed the queue's maximum memory. In addition, if you share the queue with others on the team, the memory requested should preferably not exceed 1/3~1/2 of the queue's total memory, so that your Spark job does not occupy all of the queue's resources and block other people's jobs.

3.3 executor-cores

  • Parameter description: This parameter sets the number of CPU cores for each Executor process. It determines how many task threads each Executor process can execute in parallel. Because each CPU core can execute only one task thread at a time, the more CPU cores an Executor process has, the faster it can finish all the task threads assigned to it.
  • Parameter tuning suggestion: 2~4 CPU cores per Executor is usually appropriate. This also depends on your department's resource queue: check the queue's maximum CPU core limit, and, given the number of Executors you have set, decide how many cores each Executor can be allocated. Likewise, if the queue is shared with others, num-executors * executor-cores should preferably not exceed about 1/3~1/2 of the queue's total cores, so that other people's jobs are not affected. Ideally each CPU core should correspond to two or three tasks; a small budget check is sketched below.
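The queue limits and planned settings below are illustrative assumptions, not fixed rules; the sketch simply checks a planned num-executors / executor-memory / executor-cores combination against the shared-queue guideline from sections 3.1~3.3.

```scala
// Hypothetical queue limits and planned settings (adjust to your own queue).
val queueMaxMemoryGB = 400   // assumed queue memory cap
val queueMaxCores    = 200   // assumed queue core cap

val numExecutors  = 50
val executorMemGB = 4
val executorCores = 2

val requestedMemGB = numExecutors * executorMemGB   // 200 GB
val requestedCores = numExecutors * executorCores   // 100 cores

// Shared-queue guideline from the text: stay within ~1/3 to 1/2 of the queue.
val memOk   = requestedMemGB <= queueMaxMemoryGB / 2
val coresOk = requestedCores <= queueMaxCores / 2
println(s"memory within budget: $memOk, cores within budget: $coresOk")
```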

3.4 driver-memory

  • Parameter description: This parameter is used to set the memory of the Driver process.
  • Parameter tuning suggestion: Driver memory is usually not set, or about 1G is enough. The only thing to note is that if you use the collect operator to pull all of an RDD's data to the Driver for processing, the Driver's memory must be large enough, otherwise an OOM (out of memory) error will occur; a small example follows below.
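A small illustration of why collect matters for driver-memory, assuming a spark-shell session where `sc` exists and the paths are hypothetical: collect materializes the entire RDD inside the Driver JVM, whereas take or writing back to HDFS keeps the data distributed.

```scala
// Assume `sc` is an existing SparkContext; the paths are hypothetical.
val bigRdd = sc.textFile("hdfs:///tmp/huge_dataset")

// collect() pulls every record into the Driver process:
// the Driver's memory must hold the whole dataset, or it will OOM.
// val allRows: Array[String] = bigRdd.collect()

// Safer patterns: inspect only a sample, or keep the result distributed.
val sample = bigRdd.take(10)                 // only 10 records reach the Driver
bigRdd.saveAsTextFile("hdfs:///tmp/result")  // result stays on the cluster
```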

3.5 spark.default.parallelism

  • Parameter description: This parameter sets the default number of tasks per stage. It is extremely important: not setting it may directly affect your Spark job's performance. One partition corresponds to one task, so this parameter effectively sets the number of tasks.
  • Parameter tuning suggestion: A recommended default number of tasks per Spark job is 500~1000. A common mistake is not setting this parameter at all, in which case Spark determines the number of tasks from the number of blocks in the underlying HDFS files, with one HDFS block corresponding to one task by default. Generally this default is far too small (for example, a few dozen tasks). If the number of tasks is too small, the Executor parameters you set earlier are wasted: no matter how many Executor processes, how much memory, and how many cores you have, with only 1 or 10 tasks, 90% of the Executor processes may have no task to execute at all, which wastes resources. The setting principle suggested on the official Spark website is therefore to set this parameter to 2~3 times num-executors * executor-cores. For example, if the total number of Executor CPU cores is 300, setting around 1000 tasks is reasonable, and the Spark cluster's resources can be fully utilized. A worked calculation follows below.
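A worked calculation of the 2~3x rule, using illustrative executor counts; spark.default.parallelism is a real Spark property, but the values here are assumptions and would normally be passed via --conf on spark-submit.

```scala
import org.apache.spark.SparkConf

// Illustrative resource plan.
val numExecutors  = 100
val executorCores = 3
val totalCores    = numExecutors * executorCores   // 300 cores

// Suggestion from the text: 2~3 tasks per core.
val parallelism = totalCores * 3                   // 900 tasks

val conf = new SparkConf()
  .setAppName("ParallelismExample")
  .set("spark.default.parallelism", parallelism.toString)
```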

3.6 spark.storage.memoryFraction

  • Parameter description: This parameter sets the fraction of Executor memory that can be used for RDD persistence; the default is 0.6. In other words, by default 60% of an Executor's memory can be used to store persisted RDD data. Depending on the persistence strategy you choose, if memory is insufficient the data may not be persisted, or may be written to disk.
  • Parameter tuning suggestion: If the Spark job contains many RDD persistence operations, this value can be increased appropriately so that the persisted data fits in memory; otherwise, when memory cannot hold all the data, it can only be written to disk, which reduces performance. Conversely, if the job has many shuffle operations and few persistence operations, it is appropriate to lower this value. In addition, if the job runs slowly because of frequent GC (the job's GC time can be observed in the Spark web UI), which means the memory available for tasks to execute user code is insufficient, it is also recommended to lower this value.

3.7 spark.shuffle.memoryFraction

  • Parameter description: This parameter sets the fraction of Executor memory available for aggregation after a task has pulled the output of the previous stage's tasks during shuffle; the default is 0.2. In other words, by default an Executor can use only 20% of its memory for this operation. If the memory used during shuffle aggregation exceeds this 20% limit, the excess data is spilled to disk files, which greatly reduces performance.
  • Parameter tuning suggestion: If the Spark job has few RDD persistence operations and many shuffle operations, it is recommended to lower the memory fraction for persistence and raise the memory fraction for shuffle, to avoid spilling to disk when the shuffle data does not fit in memory, which reduces performance. In addition, if the job runs slowly because of frequent GC, which means the memory for tasks to execute user code is insufficient, it is also recommended to lower this value. A sketch of rebalancing the two fractions is given below.
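A minimal sketch of shifting memory from storage to shuffle for a shuffle-heavy job; spark.storage.memoryFraction and spark.shuffle.memoryFraction are the legacy (pre-unified-memory-management) properties discussed in this article, and the 0.4/0.4 split shown is only an assumed example, not a recommended constant.

```scala
import org.apache.spark.SparkConf

// Shuffle-heavy job with few persisted RDDs: borrow memory from storage for shuffle.
val conf = new SparkConf()
  .setAppName("ShuffleHeavyJob")
  .set("spark.storage.memoryFraction", "0.4")   // down from the default 0.6
  .set("spark.shuffle.memoryFraction", "0.4")   // up from the default 0.2
```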

There is no fixed value for these resource parameters. You need to set them reasonably based on your actual situation (including the number of shuffle operations in the job, the number of RDD persistence operations, and the job's GC behavior shown in the Spark web UI) and on the principles and tuning suggestions given in this article.
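To pull the reference values together, here is a hedged sketch in SparkConf form; in practice these settings are usually passed as spark-submit flags (--num-executors, --executor-memory, --executor-cores, --driver-memory, --conf ...), and the specific numbers below are only the article's reference ranges, not universal recommendations.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Reference values from this article, expressed as configuration properties.
// Normally they are passed on the spark-submit command line instead; note that
// driver memory must be set before the Driver JVM starts, so it belongs on
// spark-submit (--driver-memory) rather than in SparkConf.
val conf = new SparkConf()
  .setAppName("ResourceTuningExample")
  .set("spark.executor.instances", "50")        // num-executors: 50~100
  .set("spark.executor.memory", "6g")           // executor-memory: 4G~8G
  .set("spark.executor.cores", "3")             // executor-cores: 2~4
  .set("spark.default.parallelism", "450")      // 2~3 x total cores (50 x 3 = 150)
  .set("spark.storage.memoryFraction", "0.5")   // lower if shuffle-heavy
  .set("spark.shuffle.memoryFraction", "0.3")   // raise if shuffle-heavy

val sc = new SparkContext(conf)
```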

 
