- Under the hood, the `textFile` method wraps MapReduce's file-reading mechanism: before a file is read it is first divided into splits, and the default split size is one block
- An RDD exposes the preferred (best) locations for computing each partition, which reflects data locality. This embodies the big data principle of moving the computation to the data rather than moving the data
- K, V format RDD
- If the elements stored in an RDD are tuples (key, value pairs), we call it a K, V format RDD
- RDD resilience (fault tolerance)
- There is no fixed limit on the number or size of partitions, which reflects the resilience of an RDD
- RDDs depend on one another, so a lost RDD can be recomputed from the RDD it was derived from
- RDD distribution
- An RDD is composed of partitions, and the partitions are distributed across different nodes
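
The notes above on partitions, K, V format, and recomputation from a parent RDD can be illustrated with a small toy model. This is a plain-Python sketch and not Spark's API: the `ToyRDD` class and its `map`, `compute`, `cache`, and `collect` methods are hypothetical names chosen for illustration, and everything runs in a single process, so the distribution of partitions across nodes is only imitated by a list of lists.

```python
from typing import Any, Callable, List, Optional


class ToyRDD:
    """Hypothetical stand-in for an RDD: a list of partitions plus its parent."""

    def __init__(self,
                 partitions: Optional[List[Optional[list]]] = None,
                 parent: Optional["ToyRDD"] = None,
                 fn: Optional[Callable[[Any], Any]] = None) -> None:
        if partitions is None and parent is None:
            raise ValueError("need either source partitions or a parent")
        self.partitions = partitions  # materialized data, or None if not computed
        self.parent = parent          # dependency on the previous RDD
        self.fn = fn                  # per-element function applied to the parent
        self.num_partitions = (len(partitions) if partitions is not None
                               else parent.num_partitions)

    def map(self, fn: Callable[[Any], Any]) -> "ToyRDD":
        # A transformation only records the dependency; nothing is computed yet.
        return ToyRDD(parent=self, fn=fn)

    def compute(self, index: int) -> list:
        """Return one partition, recomputing it from the parent if it is missing."""
        if self.partitions is not None and self.partitions[index] is not None:
            return self.partitions[index]
        if self.parent is None:
            raise RuntimeError("source partition lost and no parent to recompute from")
        return [self.fn(x) for x in self.parent.compute(index)]

    def cache(self) -> "ToyRDD":
        # Materialize every partition so later reads do not recompute.
        self.partitions = [self.compute(i) for i in range(self.num_partitions)]
        return self

    def collect(self) -> list:
        return [x for i in range(self.num_partitions) for x in self.compute(i)]


# Two partitions, as if the source file had been divided into two splits.
words = ToyRDD(partitions=[["a", "b"], ["a", "c"]])

# K, V format: every element is a (key, value) tuple.
pairs = words.map(lambda w: (w, 1)).cache()

# Simulate losing one cached partition, then recover it from the parent RDD.
pairs.partitions[1] = None
recovered = pairs.compute(1)  # [("a", 1), ("c", 1)]
```

Only the lost partition is rebuilt, and it is rebuilt from the corresponding partition of the parent; the other cached partition is left untouched. That is the sense in which the dependency between RDDs provides fault tolerance in this sketch.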