RDD principle

 

 

  • Under the hood, the textFile method wraps MapReduce's file-reading mode: before a file is read it is first divided into splits, and the default split size is the size of one block
  • An RDD provides the preferred locations for computing each partition, which reflects data locality. This embodies the big-data idea that "the computation moves, the data does not"
  • K,V-format RDD
    • If the elements stored in an RDD are all tuple (key, value) objects, we call it a K,V-format RDD
  • RDD resilience (fault tolerance)
    • There is no limit on the number or size of partitions, which reflects the elasticity of the RDD
    • RDDs depend on one another, so an RDD can be recomputed from the RDD it depends on
  • RDD distribution
    • An RDD is composed of partitions, and the partitions are distributed across different nodes
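
The points above can be illustrated with a small toy model in plain Python. It is only a sketch and does not use Spark: `BLOCK_SIZE`, `split_ranges`, and `ToyRDD` are hypothetical names made up for this example, and the 128 MiB block size is an assumption (the real value depends on the cluster configuration). The split calculation is also simplified; the real Hadoop input format additionally takes a minimum partition count and a small tolerance on the last split into account.

```python
# Toy model only: none of these names are part of the Spark API.

BLOCK_SIZE = 128 * 1024 * 1024  # assumed block size (128 MiB), cluster-dependent


def split_ranges(file_size, split_size=BLOCK_SIZE):
    """Cut a file of file_size bytes into (offset, length) splits.

    Each split would become one partition of the RDD that is read from the file.
    """
    return [(offset, min(split_size, file_size - offset))
            for offset in range(0, file_size, split_size)]


class ToyRDD:
    """A list of partitions plus a link to the parent it was derived from."""

    def __init__(self, partitions=None, parent=None, fn=None):
        self._partitions = partitions  # materialized data, or None if not computed / lost
        self.parent = parent           # dependency (lineage) on another ToyRDD
        self.fn = fn                   # function applied to each parent element

    def map(self, fn):
        # Only records the dependency; nothing is computed yet.
        return ToyRDD(parent=self, fn=fn)

    def compute(self):
        # If the data is missing, rebuild it partition by partition from the parent.
        if self._partitions is None:
            self._partitions = [[self.fn(x) for x in part]
                                for part in self.parent.compute()]
        return self._partitions

    def lose(self):
        # Simulate losing the computed data, e.g. because a node failed.
        self._partitions = None


# A 300 MiB file is cut into two full splits and one smaller remainder.
splits = split_ranges(300 * 1024 * 1024)

# A K,V-format RDD: every element is a (key, value) tuple, spread over two partitions.
base = ToyRDD(partitions=[[("a", 1), ("b", 2)], [("a", 3)]])
doubled = base.map(lambda kv: (kv[0], kv[1] * 2))

first = doubled.compute()
doubled.lose()               # the computed partitions are gone
second = doubled.compute()   # recomputed from the parent via the recorded dependency
```

In this sketch `first` and `second` hold the same data, because the lost partitions are rebuilt from `base` through the recorded dependency. In a real cluster each partition would live on a different node and only the lost partitions would need to be recomputed.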

 


Origin www.cnblogs.com/xiangyuguan/p/11203150.html