Spark map source code
```scala
/**
 * Return a new RDD by applying a function to all elements of this RDD.
 */
def map[U: ClassTag](f: T => U): RDD[U] = withScope {
  val cleanF = sc.clean(f)
  new MapPartitionsRDD[U, T](this, (context, pid, iter) => iter.map(cleanF))
}
```
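A minimal usage sketch (the RDD name `nums`, the squaring function, and the 4-partition setup are illustrative, not from the source above; it assumes a running `SparkContext` named `sc`) showing that `map` is applied element by element and leaves the partition count unchanged:

```scala
// Build an RDD with 4 partitions and apply a function to every element.
val nums = sc.parallelize(1 to 10, numSlices = 4)
val squared = nums.map(x => x * x)

// map is a narrow transformation: the new RDD keeps the same partition count.
println(nums.getNumPartitions)     // 4
println(squared.getNumPartitions)  // 4
println(squared.collect().mkString(", "))  // 1, 4, 9, ..., 100
```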
Scala map source code
```scala
/** Creates a new iterator that maps all produced values of this iterator
 *  to new values using a transformation function.
 *
 *  @param f  the transformation function
 *  @return   a new iterator which transforms every value produced by this
 *            iterator by applying the function `f` to it.
 *  @note     Reuse: $consumesAndProducesIterator
 */
def map[B](f: A => B): Iterator[B] = new AbstractIterator[B] {
  def hasNext = self.hasNext
  def next() = f(self.next())
}
```

`map` applies the user-supplied function f to every element of the iterator for each of the RDD's original partitions. Under the hood it uses Scala's `Iterator.map`: each call to `next()` on the new iterator pulls one element from the source iterator and applies f to it, so elements are processed one at a time. The result is a new RDD whose number of partitions is the same as the original.
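A plain-Scala sketch of the laziness described above (no Spark needed; the variable names and the printed messages are illustrative): `Iterator.map` does no work up front, and f runs only when `next()` pulls each element.

```scala
val source = Iterator(1, 2, 3)
val mapped = source.map { x =>
  println(s"applying f to $x")   // shows f running lazily, element by element
  x * 10
}

println("nothing computed yet")  // no "applying f" output so far
println(mapped.next())           // prints "applying f to 1", then 10
println(mapped.next())           // prints "applying f to 2", then 20
```

This is the same pattern `MapPartitionsRDD` relies on: the wrapped `iter.map(cleanF)` builds a lazy iterator, and the actual computation happens only when the task iterates over the partition.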