Five ways to change the number of RDD partitions in Spark


In Spark, the number of partitions of an RDD can be set when the RDD is created, or changed afterwards by repartitioning operations. The specific methods are as follows:

1. Specify the number of partitions when creating an RDD

1. Specify the number of partitions when creating an RDD with the parallelize method

When you create an RDD from an existing collection with the parallelize method, you can specify the number of partitions by passing it as the second argument (numSlices).
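To make the effect of the partition count concrete, here is a minimal pure-Python sketch that models how a local collection could be cut into a requested number of contiguous slices. The helper name `split_into_partitions` is hypothetical, and the even-split index arithmetic is an assumption about how the split behaves, not Spark's actual code. In PySpark the real call would be `sc.parallelize(data, numSlices=3)`, and `rdd.getNumPartitions()` would report the resulting count.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def split_into_partitions(data: Sequence[T], num_slices: int) -> List[List[T]]:
    """Hypothetical helper (not a Spark API): cut `data` into `num_slices`
    contiguous chunks of near-equal size.

    Assumption: slice i covers indices [i*n // k, (i+1)*n // k), which is an
    even split; real element placement in Spark may differ.
    """
    if num_slices < 1:
        raise ValueError("num_slices must be at least 1")
    n = len(data)
    partitions: List[List[T]] = []
    for i in range(num_slices):
        start = (i * n) // num_slices
        end = ((i + 1) * n) // num_slices
        partitions.append(list(data[start:end]))
    return partitions


# Model of: rdd = sc.parallelize(data, numSlices=3)
data = list(range(1, 11))
partitions = split_into_partitions(data, 3)

# Model of: rdd.getNumPartitions()
print(len(partitions))
# Model of: rdd.glom().collect()
print(partitions)
```

Under this model, asking for more slices than there are elements simply leaves some partitions empty, which is worth keeping in mind when choosing the partition count for a small collection.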


Origin blog.csdn.net/m0_47256162/article/details/132374992