Spark: Core RDD

RDD stands for Resilient Distributed Dataset: a fault-tolerant collection of data split across the nodes of a cluster.

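An RDD has five characteristics:

  1. An RDD is composed of a list of partitions.
  2. An operator (function) is applied to each partition of the RDD.
  3. An RDD has dependencies on other RDDs.
  4. A partitioner applies only to RDDs in key-value (K, V) format (optional).
  5. Each partition has a list of preferred locations for computing it, which supports data locality (optional).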
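
The characteristics above can be pictured with a small toy model. This is only a sketch in plain Python, not Spark's implementation: the class `ToyRDD`, its fields, and the host names are hypothetical and chosen for illustration. In Spark's own `RDD` class the corresponding members are `getPartitions`, `compute`, `getDependencies`, `partitioner`, and `getPreferredLocations`; the last one is the fifth property that Spark documents next to the others.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, Iterator, List, Optional


@dataclass
class ToyRDD:
    """Toy stand-in for an RDD (hypothetical, NOT Spark's real class)."""
    # 1. A list of partitions (plain lists here; empty for derived RDDs).
    partitions: List[list]
    # 2. A function that computes one partition from the parent's partition.
    compute_fn: Optional[Callable[[Iterable], Iterator]] = None
    # 3. Dependencies on parent RDDs (the lineage).
    dependencies: List["ToyRDD"] = field(default_factory=list)
    # 4. Optional partitioner, meaningful only for key-value RDDs.
    partitioner: Optional[Callable[[object], int]] = None
    # 5. Optional preferred locations, one list of host names per partition.
    preferred_locations: List[List[str]] = field(default_factory=list)

    def num_partitions(self) -> int:
        # A derived RDD keeps the partition count of its parent.
        if self.dependencies:
            return self.dependencies[0].num_partitions()
        return len(self.partitions)

    def compute(self, index: int) -> list:
        """Compute one partition, going back to the parent when derived."""
        if not self.dependencies:
            return list(self.partitions[index])
        parent = self.dependencies[0]
        return list(self.compute_fn(parent.compute(index)))

    def map(self, f: Callable) -> "ToyRDD":
        # Lazy: only records the function and the dependency on self.
        return ToyRDD(
            partitions=[],
            compute_fn=lambda part: (f(x) for x in part),
            dependencies=[self],
            preferred_locations=self.preferred_locations,
        )

    def partition_by(self, n: int) -> "ToyRDD":
        # Eagerly redistributes (key, value) pairs by key hash. This toy
        # version drops the lineage, which a real shuffle would keep.
        part = lambda key: hash(key) % n
        buckets: List[list] = [[] for _ in range(n)]
        for key, value in self.collect():
            buckets[part(key)].append((key, value))
        return ToyRDD(partitions=buckets, partitioner=part)

    def collect(self) -> list:
        return [x for i in range(self.num_partitions()) for x in self.compute(i)]


base = ToyRDD(
    partitions=[[1, 2], [3, 4], [5]],
    preferred_locations=[["host-a"], ["host-b"], ["host-a"]],
)
doubled = base.map(lambda x: x * 2)   # nothing is computed yet
result = doubled.collect()            # partitions are computed here

pairs = ToyRDD(partitions=[[(1, "a"), (2, "b")], [(3, "c"), (4, "d")]])
by_key = pairs.partition_by(2)        # integer keys: even -> 0, odd -> 1
```

Because `doubled` stores only a function and a reference to `base`, any single partition can be rebuilt by calling `doubled.compute(i)` again. That recomputation from lineage is the idea behind the "resilient" part of the name.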

Origin www.cnblogs.com/wbyixx/p/11111893.html