Spark performance tuning and fault handling (3) Spark Shuffle tuning

1. Adjust the buffer size on the map side

During the running of a Spark job, if the map side of a shuffle processes a large amount of data while the map-side buffer stays at its fixed size, the buffered map output may be spilled to disk files very frequently, which makes performance very low. By increasing the map-side buffer size, you can avoid this frequent disk I/O and thus improve the overall performance of the Spark job.

The map-side buffer defaults to 32KB. If each task processes 640KB of data, 640 / 32 = 20 spills to disk occur; if each task processes 64000KB of data, 64000 / 32 = 2000 spills occur. This has a serious impact on performance.
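As a quick sanity check of that arithmetic, here is a minimal Scala sketch (a hypothetical helper built on the article's simplified spills ≈ data size / buffer size model, not a Spark API):

// Simplified model from the paragraph above: spills per task,
// rounded up to whole spill passes.
def spillCount(taskDataKB: Long, bufferKB: Long): Long =
  math.ceil(taskDataKB.toDouble / bufferKB).toLong

println(spillCount(640, 32))    // 20 spills with the default 32KB buffer
println(spillCount(64000, 32))  // 2000 spills
println(spillCount(64000, 64))  // 1000 spills after doubling the buffer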

The map-side buffer is configured as follows:

val conf = new SparkConf().set("spark.shuffle.file.buffer", "64k")  // the default is 32k

2. Adjust the size of the data buffer on the reduce side

During the Spark Shuffle process, the buffer size of the shuffle reduce task determines how much data the reduce task can buffer, that is, how much data it can pull at a time. If memory is fairly plentiful, appropriately increasing the size of this pull buffer reduces the number of pulls, and therefore the number of network transfers, which improves performance.

The reduce-side pull buffer size is set with the spark.reducer.maxSizeInFlight parameter; the default is 48MB. It is set as follows:

val conf = new SparkConf().set("spark.reducer.maxSizeInFlight", "96m")  // the default is 48m

3. Adjust the number of retries for pulling data on the reduce side

During the Spark Shuffle process, when a reduce task pulls the data that belongs to it, the pull is retried automatically if it fails due to a network abnormality or similar cause. For jobs that contain particularly time-consuming shuffle operations, it is recommended to increase the maximum number of retries (for example, to 60) to avoid pull failures caused by JVM full GC, network instability, and similar factors. In practice, it has been found that for shuffles over extremely large data volumes (billions to tens of billions of records), adjusting this parameter can greatly improve stability.

The number of retries for pulling data on the reduce side is set with the spark.shuffle.io.maxRetries parameter, which represents the maximum number of retries allowed. If the pull still fails within this number of attempts, the job may fail. The default is 3.

This parameter is set as follows:

val conf = new SparkConf().set("spark.shuffle.io.maxRetries", "6")  //默认为 3

4. Adjust the wait interval between reduce-side data pull retries

During the Spark Shuffle process, when a reduce task pulls the data that belongs to it, the pull is retried automatically if it fails due to a network abnormality or similar cause, and the task waits a fixed interval before each retry. Increasing this interval (for example, to 60s) can improve the stability of shuffle operations.

The reduce-side pull retry interval is set with the spark.shuffle.io.retryWait parameter; the default value is 5s.

This parameter is set as follows:

val conf = new SparkConf().set("spark.shuffle.io.retryWait", "10s")  //默认值为 5s

5. Adjust the SortShuffleManager sort operation threshold

For SortShuffleManager, if the number of shuffle reduce tasks is below a certain threshold, no sorting is performed during shuffle write. Instead, data is written in the style of the unoptimized HashShuffleManager, except that at the end all the temporary disk files produced by each task are merged into a single file, and a separate index file is created.

When you use SortShuffleManager and genuinely do not need a sorting operation, it is recommended to raise this parameter above the number of shuffle read tasks. The map side then skips sorting, saving that overhead. However, a large number of intermediate disk files are still produced during the write, so shuffle write performance remains something to watch.
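As a sketch of when the bypass path applies, the condition can be modeled as a simple predicate (a simplification of SortShuffleManager's behavior, which also requires that the shuffle has no map-side aggregation; this is illustrative, not code you call directly):

// Simplified predicate: the unsorted bypass path is taken only when the
// shuffle has no map-side aggregation and the number of reduce partitions
// is at most spark.shuffle.sort.bypassMergeThreshold.
def usesBypassMergeSort(numReducePartitions: Int,
                        mapSideCombine: Boolean,
                        bypassThreshold: Int = 200): Boolean =
  !mapSideCombine && numReducePartitions <= bypassThreshold

println(usesBypassMergeSort(100, mapSideCombine = false))  // true: unsorted bypass path
println(usesBypassMergeSort(300, mapSideCombine = false))  // false: normal sorted path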

The SortShuffleManager sorting threshold is set with the spark.shuffle.sort.bypassMergeThreshold parameter; the default value is 200.

This parameter is set as follows:

val conf = new SparkConf().set("spark.shuffle.sort.bypassMergeThreshold", "400")  //默认值为 200
