1. Detailed introduction to the countByKey operator in Spark
For the word-count task, we used reduceByKey to aggregate the values that share a key and obtain one result per key. This article introduces another, more convenient operator, countByKey, which directly returns how many values each key has, in the form of a map.
1. Function introduction
In Spark, countByKey is an action operator that works on RDDs of key-value pairs. It counts the number of elements that belong to each key and returns a Map from each key to that count. The values themselves are ignored; only the number of occurrences of each key matters.
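To make the semantics concrete, here is a minimal plain-Python sketch that models what countByKey computes and contrasts it with a reduceByKey-style sum. It does not use Spark: the helper names count_by_key and reduce_by_key_sum and the sample data are hypothetical and exist only for illustration.

```python
from collections import Counter


def count_by_key(pairs):
    """Model of countByKey: how many elements share each key."""
    # Only the key of each (key, value) pair is used; values are ignored.
    return dict(Counter(key for key, _ in pairs))


def reduce_by_key_sum(pairs):
    """Model of reduceByKey with addition: sum the values of each key."""
    totals = {}
    for key, value in pairs:
        totals[key] = totals.get(key, 0) + value
    return totals


# Hypothetical sample data standing in for a key-value pair RDD.
pairs = [("spark", 1), ("hadoop", 1), ("spark", 1), ("flink", 5)]

counts = count_by_key(pairs)      # number of elements per key
sums = reduce_by_key_sum(pairs)   # sum of values per key

print(counts)
print(sums)
```

The key "flink" appears once with the value 5, so its count is 1 while its sum is 5; the two operators agree only when every value is 1, as in word count. In Spark itself the call is simply rdd.countByKey() on a pair RDD. Because it is an action, the whole result map is brought back to the driver, so it is best suited to data with a modest number of distinct keys.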