6 | Read words from a text file and output a list of unique words

Transformation operation

A transformation creates a new RDD (Resilient Distributed Dataset) from an existing one, typically by mapping, filtering, or grouping the elements of the original RDD. Transformations are not executed immediately: they are evaluated lazily and only actually run when an action is triggered. The following are some common transformations:

Transformation       Description
map(func)            Applies the function func to each element of the RDD and returns a new RDD.
filter(func)         Uses the function func to filter the elements of the RDD and returns a new RDD containing only the elements that satisfy the condition.
flatMap(func)        Similar to map, but each input element can be mapped to zero or more output elements.
distinct()           Returns a new RDD containing the unique elements of the RDD.
groupByKey()         Groups the values that share the same key into an iterable.
reduceByKey(func)    Uses the function func to aggregate the values that share the same key.
sortByKey()          Sorts the elements by key.
union
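
As a rough illustration of how these transformations combine for the task in the title, here is a minimal PySpark sketch. It assumes PySpark is installed and runs in local mode; the helper names (split_words, unique_words, unique_words_from_file), the word-splitting rule, and the application name are assumptions made for this example, not part of the original article.

```python
import re


def split_words(line):
    """Lowercase a line and split it into words (letters, digits, apostrophes)."""
    return re.findall(r"[a-z0-9']+", line.lower())


def unique_words(lines_rdd):
    """Chain lazy transformations on an RDD of lines; nothing is computed here."""
    return (
        lines_rdd
        .flatMap(split_words)  # one line -> zero or more words
        .distinct()            # keep each word once
    )


def unique_words_from_file(path):
    """Run the pipeline on a local SparkContext and return a sorted list."""
    # Imported here so split_words can be tried without Spark installed.
    from pyspark import SparkContext

    sc = SparkContext("local[*]", "unique-words")  # hypothetical app name
    try:
        lines = sc.textFile(path)       # lazy: the file is not read yet
        words = unique_words(lines)     # still lazy
        return sorted(words.collect())  # collect() is the action that triggers execution
    finally:
        sc.stop()
```

Calling unique_words_from_file with the path to a text file should return the distinct words in sorted order. Up to the call to collect(), Spark only records the chain of transformations; the sort is done in plain Python on the driver to keep the sketch short.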


Origin blog.csdn.net/weixin_44510615/article/details/132630073