Spark Software Components Architecture Diagram and Task Scheduler Architecture

Software Components:
Spark runs as a library in your program (1 instance per app)
Runs tasks locally or on cluster
Mesos, YARN or standalone mode
Accesses storage systems via Hadoop InputFormat API
Can use HBase, HDFS, S3
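
The deployment modes listed above are normally selected through the master URL handed to Spark when the application starts. As a rough illustration only, the sketch below builds that string in plain Python. The helper name master_url and the host names are hypothetical; the URL schemes and default ports (7077 for standalone, 5050 for Mesos) are the ones Spark documents.

```python
def master_url(mode, host=None, port=None, cores=None):
    """Return a Spark master URL string for a deployment mode (hypothetical helper)."""
    if mode == "local":
        # local[N] runs N worker threads in-process; local[*] uses all cores.
        return f"local[{cores if cores is not None else '*'}]"
    if mode == "standalone":
        # Spark's own cluster manager; 7077 is its default port.
        return f"spark://{host}:{port or 7077}"
    if mode == "mesos":
        # Mesos master; 5050 is its default port.
        return f"mesos://{host}:{port or 5050}"
    if mode == "yarn":
        # The cluster location comes from the Hadoop configuration, not the URL.
        # Older Spark releases used "yarn-client" / "yarn-cluster" instead.
        return "yarn"
    raise ValueError(f"unknown mode: {mode}")


print(master_url("local", cores=2))
print(master_url("standalone", host="master1"))
```

The resulting string is what would be passed as the master setting of the application, so the same program can move from a laptop to a cluster without code changes.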

Task Scheduler
General task graphs
Automatically pipelines functions
Data locality aware
Partitioning aware to avoid shuffles
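
Two of the scheduler properties above can be mimicked in a few lines of plain Python. This is a toy model, not Spark's API or its actual scheduler: pipeline, hash_partition and join_copartitioned are hypothetical names chosen for illustration. The first function fuses element-wise steps into one lazy pass over a partition (pipelining); the other two show why datasets partitioned the same way can be joined bucket by bucket with no data movement between buckets.

```python
def pipeline(partition, *stages):
    """Fuse element-wise stages into one lazy pass over a partition."""
    it = iter(partition)
    for stage in stages:
        it = stage(it)  # each stage wraps the previous iterator, nothing runs yet
    return it


def hash_partition(pairs, num_partitions):
    """Place (key, value) pairs into buckets by hash(key) % num_partitions."""
    parts = [[] for _ in range(num_partitions)]
    for k, v in pairs:
        parts[hash(k) % num_partitions].append((k, v))
    return parts


def join_copartitioned(left_parts, right_parts):
    """Join datasets that share a partitioning: bucket i only meets bucket i."""
    assert len(left_parts) == len(right_parts)
    out = []
    for lp, rp in zip(left_parts, right_parts):
        index = {}
        for k, v in rp:
            index.setdefault(k, []).append(v)
        out.extend((k, (lv, rv)) for k, lv in lp for rv in index.get(k, []))
    return out


# Pipelining: map and filter run together, one element at a time.
result = list(pipeline(
    range(10),
    lambda it: (x * 2 for x in it),
    lambda it: (x for x in it if x % 4 == 0),
))

# Partition-aware join: both sides use the same partitioner (integer keys, 2 buckets).
users = hash_partition([(1, "ann"), (2, "bob"), (3, "cy")], 2)
orders = hash_partition([(1, "book"), (3, "pen"), (3, "ink")], 2)
joined = sorted(join_copartitioned(users, orders))
print(result, joined)
```

Because matching keys always land in the same bucket index on both sides, no record has to be moved to another bucket before the join, which is the kind of work a shuffle would otherwise do.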

Reposted from coolsunchen.iteye.com/blog/2002207