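Spark supports the languages Scala, Python, and Java.
With Spark SQL (including intermediate filtering), performance is the same in all three languages.
Spark is faster than MapReduce, mainly because it can keep intermediate results in memory instead of writing them to disk between stages.
The CPU-to-memory ratio is 1:2 or 1:4.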
RDD: Resilient Distributed Dataset. It has 5 features.
Operations on an RDD:
1. Transformation
2. Action
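A minimal sketch of the two kinds of RDD operations, assuming Spark is on the classpath and the statements are pasted into spark-shell or run as a Scala script. The app name, the local master, and the sample numbers are illustrative assumptions, not part of the notes.

```scala
import org.apache.spark.sql.SparkSession

// Local session for illustration only; app name and master are assumptions.
val spark = SparkSession.builder()
  .appName("rdd-ops-sketch")
  .master("local[*]")
  .getOrCreate()
val sc = spark.sparkContext

// Build a small RDD from an in-memory range.
val numbers = sc.parallelize(1 to 10)

// Transformations: lazy, each returns a new RDD and computes nothing yet.
val evens = numbers.filter(_ % 2 == 0)
val squares = evens.map(n => n * n)

// Actions: trigger the computation and return a value to the driver.
val total = squares.reduce(_ + _)   // 4 + 16 + 36 + 64 + 100 = 220
val howMany = squares.count()       // 5

spark.stop()
```

filter and map only describe the computation; the work happens when reduce or count is called.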
spark.sparkContext returns the SparkContext (referred to as sc below).
API:
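1: sc.textFile(""): loads data from an external source and returns an RDD.
sc.textFile("").cache.count
Remark: cache is lazy. Nothing is read or cached until an action such as count is called, so the cached data only shows up after that call.
sc.textFile("").collect
collect is an action that returns all elements of the RDD to the driver.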
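The calls above can be tried end to end with a sketch like the following, assuming Spark is on the classpath and the statements are pasted into spark-shell or run as a Scala script. The temporary file, its contents, the app name, and the local master are made up for illustration.

```scala
import java.nio.file.Files
import org.apache.spark.sql.SparkSession

// Local session for illustration only; app name and master are assumptions.
val spark = SparkSession.builder()
  .appName("textfile-sketch")
  .master("local[*]")
  .getOrCreate()
val sc = spark.sparkContext

// Write a small temporary file so the example has something to load.
val path = Files.createTempFile("spark-notes", ".txt")
Files.write(path, "first line\nsecond line\nthird line\n".getBytes("UTF-8"))

// textFile is lazy: it returns an RDD[String] without reading the file yet.
// A file: URI is used so the path is resolved on the local filesystem.
val lines = sc.textFile(path.toUri.toString)

// cache only marks the RDD for in-memory storage; still nothing is computed.
val cached = lines.cache()

// count is an action: it reads the file, fills the cache, and returns 3.
val lineCount = cached.count()

// collect is also an action: it returns every line to the driver as an Array.
val allLines = cached.collect()

spark.stop()
```

collect pulls the whole RDD into driver memory, so it is best kept for small results like this one.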