Compared with Hadoop MapReduce, Spark's advantage is that intermediate results can be kept in memory, so they do not have to be written back to HDFS between stages.
Spark replaces MapReduce as the computation engine; it does not replace HDFS as the storage system.
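To make that contrast concrete, here is a plain-Python sketch (illustrative only, not real Spark or MapReduce code; all function names are invented for this example): the MapReduce-style word count materializes each stage's output to disk (a temp file standing in for HDFS) before the next stage reads it back, while the Spark-style version keeps the intermediate pairs in memory.

```python
import json
import os
import tempfile

def mapreduce_style_wordcount(lines):
    """MapReduce-style: each stage writes its output to disk (a temp file
    standing in for HDFS) and the next stage reads it back."""
    # Stage 1 (map): emit (word, 1) pairs and write them to "HDFS".
    fd, path = tempfile.mkstemp(suffix=".json")
    with os.fdopen(fd, "w") as f:
        pairs = [(word, 1) for line in lines for word in line.split()]
        json.dump(pairs, f)
    # Stage 2 (reduce): read the intermediate pairs back from disk, then sum.
    with open(path) as f:
        pairs = json.load(f)
    os.remove(path)
    counts = {}
    for word, n in pairs:
        counts[word] = counts.get(word, 0) + n
    return counts

def spark_style_wordcount(lines):
    """Spark-style: intermediate results stay in memory and flow straight
    into the reduce step; nothing is written to disk between stages."""
    counts = {}
    for line in lines:
        for word in line.split():
            counts[word] = counts.get(word, 0) + 1
    return counts

lines = ["spark is fast", "spark is general"]
# Both styles compute the same answer; only where the intermediate
# data lives (disk vs memory) differs.
assert mapreduce_style_wordcount(lines) == spark_style_wordcount(lines)
print(spark_style_wordcount(lines))  # {'spark': 2, 'is': 2, 'fast': 1, 'general': 1}
```

Avoiding that per-stage disk round trip is exactly the overhead Spark removes for iterative and multi-stage jobs.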
Spark features:
- Fast: in-memory computation is roughly 100x faster than Hadoop MapReduce, and even on disk it can be about 10x faster
- Easy to use: programs can be written in Java, Scala, Python, and other languages
- General: Spark SQL, Spark Streaming for real-time computation, MLlib for machine learning, and GraphX for graph computation
- Runs on a variety of platforms (e.g., standalone, YARN, Mesos)
Workloads Spark handles:
- Batch processing
- Stream processing
- Ad-hoc queries
Spark's speed comes not only from in-memory computation but also from its DAG (directed acyclic graph) execution engine.
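A minimal sketch of the DAG idea in plain Python (a toy stand-in, not Spark's actual API; the `TinyRDD` class is invented for illustration): transformations only record nodes in a graph of operations, and nothing executes until an action such as `collect()` runs the whole chain in one pipelined pass, so intermediate collections are never materialized.

```python
class TinyRDD:
    """Toy stand-in for an RDD: map/filter lazily record operations;
    only an action like collect() triggers execution."""

    def __init__(self, data, ops=()):
        self._data = data
        self._ops = ops  # recorded chain of ("map"/"filter", fn) nodes

    def map(self, fn):
        return TinyRDD(self._data, self._ops + (("map", fn),))

    def filter(self, fn):
        return TinyRDD(self._data, self._ops + (("filter", fn),))

    def collect(self):
        # Action: walk the recorded operations once, pipelining each
        # element through every step without building intermediate lists.
        out = []
        for item in self._data:
            keep = True
            for kind, fn in self._ops:
                if kind == "map":
                    item = fn(item)
                elif not fn(item):  # filter that rejects the element
                    keep = False
                    break
            if keep:
                out.append(item)
        return out

rdd = TinyRDD(range(6)).map(lambda x: x * x).filter(lambda x: x % 2 == 0)
# Nothing has executed yet; collect() runs the whole pipeline in one pass.
print(rdd.collect())  # [0, 4, 16]
```

Real Spark goes further (it groups such pipelined transformations into stages split at shuffle boundaries), but the lazy, one-pass execution shown here is the core of why a DAG engine beats running each step as a separate disk-backed job.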