Submitting Spark jobs in the various deploy modes

Preface

Parts of this article are translated from:

http://spark.apache.org/docs/latest/submitting-applications.html

Submitting Applications

The spark-submit script in Spark's bin directory is used to launch applications on a cluster. It can use all of Spark's supported cluster managers through a uniform interface, so you don't have to configure your application specially for each one.

Bundling Your Application's Dependencies

If your code depends on other projects, you will need to package them alongside your application in order to distribute the code to a Spark cluster. To do this, create an assembly jar (or "uber" jar) containing your code and its dependencies. Both sbt and Maven have assembly plugins. When creating the assembly jar, list Spark and Hadoop as provided dependencies; these need not be bundled since they are provided by the cluster manager at runtime. Once you have an assembled jar, you can call the bin/spark-submit script, passing your jar. For Python, you can use the --py-files argument of spark-submit to add .py, .zip or .egg files to be distributed with your application. If you depend on multiple Python files, we recommend packaging them into a .zip or .egg.
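As a quick illustration of the Python side, the sketch below bundles a hypothetical local helper module into a .zip suitable for --py-files (file and module names are made up); the spark-submit command itself is only echoed, since running it needs a Spark installation:

```shell
# Hypothetical module layout; only the zip step does real work here.
mkdir -p mylibs
printf 'def add(a, b):\n    return a + b\n' > mylibs/util.py

# Package the helpers into a .zip that --py-files can ship to executors.
python3 -m zipfile -c deps.zip mylibs/

# Illustration only: actually submitting requires a Spark install and a main.py.
echo "./bin/spark-submit --master local --py-files deps.zip main.py"
```

The same deps.zip can be reused across submissions, which is why a single archive is usually preferable to listing many individual .py files.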

Launching Applications with spark-submit

Once a user application is bundled, it can be launched using the bin/spark-submit script. This script takes care of setting up the classpath with Spark and its dependencies, and supports the different cluster managers and deploy modes that Spark supports:

./bin/spark-submit \
  --class <main-class> \
  --master <master-url> \
  --deploy-mode <deploy-mode> \
  --conf <key>=<value> \
  ... # other options
  <application-jar> \
  [application-arguments]

Some of the commonly used options above are:

--class: the entry point for your application (e.g. org.apache.spark.examples.SparkPi)

--master: the master URL for the cluster (e.g. spark://23.195.26.187:7077)

--deploy-mode: whether to deploy your driver on the worker nodes (cluster) or locally as an external client (client) (default: client)

--conf: an arbitrary Spark configuration property in key=value format. For values that contain spaces, wrap "key=value" in quotes (as shown).

application-jar: path to a bundled jar including your application and all dependencies. The URL must be globally visible inside your cluster, for instance an hdfs:// path or a file:// path that is present on all nodes.

application-arguments: arguments passed to the main method of your main class, if any
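Filled in with concrete values (a hypothetical standalone master URL, the bundled examples jar, and an arbitrary executor-memory setting), the spark-submit template becomes a command like the one below; it is only echoed here, since executing it requires a live cluster:

```shell
# Hypothetical values for illustration; adjust master URL and jar path to your cluster.
MAIN_CLASS=org.apache.spark.examples.SparkPi
MASTER=spark://192.168.217.201:7077
APP_JAR=examples/jars/spark-examples_2.11-2.4.0.jar

# Echo rather than execute: spark-submit itself needs a running Spark installation.
echo ./bin/spark-submit \
  --class "$MAIN_CLASS" \
  --master "$MASTER" \
  --deploy-mode client \
  --conf "spark.executor.memory=1g" \
  "$APP_JAR" 100
```

The trailing 100 is an application argument (for SparkPi, the number of partitions to sample over).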

A common deployment strategy is to submit your application from a gateway machine that is physically co-located with your worker machines (e.g. the master node in a standalone EC2 cluster). In this setup, client mode is appropriate. In client mode, the driver is launched directly within the spark-submit process, which acts as a client to the cluster. The input and output of the application are attached to the console, so this mode is especially suitable for applications that involve the REPL (e.g. the Spark shell).

Alternatively, if your application is submitted from a machine far from the worker machines (e.g. locally on your laptop), it is common to use cluster mode to minimize network latency between the driver and the executors. Currently, standalone mode does not support cluster mode for Python applications.

For Python applications, simply pass a .py file in place of <application-jar>, and add Python .zip, .egg or .py files to the search path with --py-files.

A few options are specific to the cluster manager in use. For example, with a Spark standalone cluster in cluster deploy mode, you can also specify --supervise to make sure the driver is automatically restarted if it fails with a non-zero exit code. To enumerate all options available to spark-submit, run it with --help.

Running Spark jobs in the various modes

Local mode

[root@hadoop1 spark-2.4.0-bin-hadoop2.7]# ./bin/spark-submit --class org.apache.spark.examples.SparkPi --master local examples/jars/spark-examples_2.11-2.4.0.jar
2019-02-21 22:13:58 WARN  NativeCodeLoader:62 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2019-02-21 22:14:00 INFO  SparkContext:54 - Running Spark version 2.4.0
2019-02-21 22:14:00 INFO  SparkContext:54 - Submitted application: Spark Pi
2019-02-21 22:14:00 INFO  SecurityManager:54 - Changing view acls to: root
2019-02-21 22:14:00 INFO  SecurityManager:54 - Changing modify acls to: root
2019-02-21 22:14:00 INFO  SecurityManager:54 - Changing view acls groups to: 
2019-02-21 22:14:00 INFO  SecurityManager:54 - Changing modify acls groups to: 
2019-02-21 22:14:00 INFO  SecurityManager:54 - SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(root); groups with view permissions: Set(); users  with modify permissions: Set(root); groups with modify permissions: Set()
2019-02-21 22:14:01 INFO  Utils:54 - Successfully started service 'sparkDriver' on port 48468.
2019-02-21 22:14:01 INFO  SparkEnv:54 - Registering MapOutputTracker
2019-02-21 22:14:01 INFO  SparkEnv:54 - Registering BlockManagerMaster
2019-02-21 22:14:01 INFO  BlockManagerMasterEndpoint:54 - Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
2019-02-21 22:14:01 INFO  BlockManagerMasterEndpoint:54 - BlockManagerMasterEndpoint up
2019-02-21 22:14:01 INFO  DiskBlockManager:54 - Created local directory at /tmp/blockmgr-4ddfef66-4732-4271-b029-05332cfa70a9
2019-02-21 22:14:01 INFO  MemoryStore:54 - MemoryStore started with capacity 413.9 MB
2019-02-21 22:14:01 INFO  SparkEnv:54 - Registering OutputCommitCoordinator
2019-02-21 22:14:02 INFO  log:192 - Logging initialized @9713ms
2019-02-21 22:14:02 INFO  Server:351 - jetty-9.3.z-SNAPSHOT, build timestamp: unknown, git hash: unknown
2019-02-21 22:14:02 INFO  Server:419 - Started @9891ms
2019-02-21 22:14:02 INFO  AbstractConnector:278 - Started ServerConnector@40e4ea87{HTTP/1.1,[http/1.1]}{192.168.217.201:4040}
2019-02-21 22:14:02 INFO  Utils:54 - Successfully started service 'SparkUI' on port 4040.
2019-02-21 22:14:02 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@1a38ba58{/jobs,null,AVAILABLE,@Spark}
2019-02-21 22:14:02 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@24b52d3e{/jobs/json,null,AVAILABLE,@Spark}
2019-02-21 22:14:02 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@15deb1dc{/jobs/job,null,AVAILABLE,@Spark}
2019-02-21 22:14:02 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@57a4d5ee{/jobs/job/json,null,AVAILABLE,@Spark}
2019-02-21 22:14:02 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@5af5def9{/stages,null,AVAILABLE,@Spark}
2019-02-21 22:14:02 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@3a45c42a{/stages/json,null,AVAILABLE,@Spark}
2019-02-21 22:14:02 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@36dce7ed{/stages/stage,null,AVAILABLE,@Spark}
2019-02-21 22:14:02 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@27a0a5a2{/stages/stage/json,null,AVAILABLE,@Spark}
2019-02-21 22:14:02 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@7692cd34{/stages/pool,null,AVAILABLE,@Spark}
2019-02-21 22:14:02 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@33aa93c{/stages/pool/json,null,AVAILABLE,@Spark}
2019-02-21 22:14:02 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@32c0915e{/storage,null,AVAILABLE,@Spark}
2019-02-21 22:14:02 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@106faf11{/storage/json,null,AVAILABLE,@Spark}
2019-02-21 22:14:02 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@70f43b45{/storage/rdd,null,AVAILABLE,@Spark}
2019-02-21 22:14:02 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@26d10f2e{/storage/rdd/json,null,AVAILABLE,@Spark}
2019-02-21 22:14:02 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@10ad20cb{/environment,null,AVAILABLE,@Spark}
2019-02-21 22:14:02 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@7dd712e8{/environment/json,null,AVAILABLE,@Spark}
2019-02-21 22:14:02 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@2c282004{/executors,null,AVAILABLE,@Spark}
2019-02-21 22:14:02 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@22ee2d0{/executors/json,null,AVAILABLE,@Spark}
2019-02-21 22:14:02 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@7bfc3126{/executors/threadDump,null,AVAILABLE,@Spark}
2019-02-21 22:14:02 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@3e792ce3{/executors/threadDump/json,null,AVAILABLE,@Spark}
2019-02-21 22:14:02 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@53bc1328{/static,null,AVAILABLE,@Spark}
2019-02-21 22:14:02 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@e041f0c{/,null,AVAILABLE,@Spark}
2019-02-21 22:14:02 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@6a175569{/api,null,AVAILABLE,@Spark}
2019-02-21 22:14:02 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@4102b1b1{/jobs/job/kill,null,AVAILABLE,@Spark}
2019-02-21 22:14:02 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@61a5b4ae{/stages/stage/kill,null,AVAILABLE,@Spark}
2019-02-21 22:14:02 INFO  SparkUI:54 - Bound SparkUI to 192.168.217.201, and started at http://hadoop1.org.cn:4040
2019-02-21 22:14:02 INFO  SparkContext:54 - Added JAR file:/usr/hdp/spark-2.4.0-bin-hadoop2.7/examples/jars/spark-examples_2.11-2.4.0.jar at spark://hadoop1.org.cn:48468/jars/spark-examples_2.11-2.4.0.jar with timestamp 1550758442663
2019-02-21 22:14:02 INFO  Executor:54 - Starting executor ID driver on host localhost
2019-02-21 22:14:03 INFO  Utils:54 - Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 37714.
2019-02-21 22:14:03 INFO  NettyBlockTransferService:54 - Server created on hadoop1.org.cn:37714
2019-02-21 22:14:03 INFO  BlockManager:54 - Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
2019-02-21 22:14:03 INFO  BlockManagerMaster:54 - Registering BlockManager BlockManagerId(driver, hadoop1.org.cn, 37714, None)
2019-02-21 22:14:03 INFO  BlockManagerMasterEndpoint:54 - Registering block manager hadoop1.org.cn:37714 with 413.9 MB RAM, BlockManagerId(driver, hadoop1.org.cn, 37714, None)
2019-02-21 22:14:03 INFO  BlockManagerMaster:54 - Registered BlockManager BlockManagerId(driver, hadoop1.org.cn, 37714, None)
2019-02-21 22:14:03 INFO  BlockManager:54 - Initialized BlockManager: BlockManagerId(driver, hadoop1.org.cn, 37714, None)
2019-02-21 22:14:03 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@61a91912{/metrics/json,null,AVAILABLE,@Spark}
2019-02-21 22:14:05 INFO  SparkContext:54 - Starting job: reduce at SparkPi.scala:38
2019-02-21 22:14:06 INFO  DAGScheduler:54 - Got job 0 (reduce at SparkPi.scala:38) with 2 output partitions
2019-02-21 22:14:06 INFO  DAGScheduler:54 - Final stage: ResultStage 0 (reduce at SparkPi.scala:38)
2019-02-21 22:14:06 INFO  DAGScheduler:54 - Parents of final stage: List()
2019-02-21 22:14:06 INFO  DAGScheduler:54 - Missing parents: List()
2019-02-21 22:14:06 INFO  DAGScheduler:54 - Submitting ResultStage 0 (MapPartitionsRDD[1] at map at SparkPi.scala:34), which has no missing parents
2019-02-21 22:14:06 INFO  MemoryStore:54 - Block broadcast_0 stored as values in memory (estimated size 1936.0 B, free 413.9 MB)
2019-02-21 22:14:07 INFO  MemoryStore:54 - Block broadcast_0_piece0 stored as bytes in memory (estimated size 1256.0 B, free 413.9 MB)
2019-02-21 22:14:07 INFO  BlockManagerInfo:54 - Added broadcast_0_piece0 in memory on hadoop1.org.cn:37714 (size: 1256.0 B, free: 413.9 MB)
2019-02-21 22:14:07 INFO  SparkContext:54 - Created broadcast 0 from broadcast at DAGScheduler.scala:1161
2019-02-21 22:14:07 INFO  DAGScheduler:54 - Submitting 2 missing tasks from ResultStage 0 (MapPartitionsRDD[1] at map at SparkPi.scala:34) (first 15 tasks are for partitions Vector(0, 1))
2019-02-21 22:14:07 INFO  TaskSchedulerImpl:54 - Adding task set 0.0 with 2 tasks
2019-02-21 22:14:07 INFO  TaskSetManager:54 - Starting task 0.0 in stage 0.0 (TID 0, localhost, executor driver, partition 0, PROCESS_LOCAL, 7866 bytes)
2019-02-21 22:14:07 INFO  Executor:54 - Running task 0.0 in stage 0.0 (TID 0)
2019-02-21 22:14:07 INFO  Executor:54 - Fetching spark://hadoop1.org.cn:48468/jars/spark-examples_2.11-2.4.0.jar with timestamp 1550758442663
2019-02-21 22:14:08 INFO  TransportClientFactory:267 - Successfully created connection to hadoop1.org.cn/192.168.217.201:48468 afte(0 ms spent in bootstraps)
2019-02-21 22:14:08 INFO  Utils:54 - Fetching spark://hadoop1.org.cn:48468/jars/spark-examples_2.11-2.4.0.jar to /tmp/spark-e9e2c8b5-0a08-4dda-9d3d-4a4f9671b0d9/userFiles-e2c1980d-6d11-48f1-8422-2b637ce7a1fb/fetchFileTemp701584033085131304.tmp
2019-02-21 22:14:09 INFO  Executor:54 - Adding file:/tmp/spark-e9e2c8b5-0a08-4dda-9d3d-4a4f9671b0d9/userFiles-e2c1980d-6d11-48f1-8422-2b637ce7a1fb/spark-examples_2.11-2.4.0.jar to class loader
2019-02-21 22:14:09 INFO  Executor:54 - Finished task 0.0 in stage 0.0 (TID 0). 867 bytes result sent to driver
2019-02-21 22:14:09 INFO  TaskSetManager:54 - Starting task 1.0 in stage 0.0 (TID 1, localhost, executor driver, partition 1, PROCESS_LOCAL, 7866 bytes)
2019-02-21 22:14:09 INFO  Executor:54 - Running task 1.0 in stage 0.0 (TID 1)
2019-02-21 22:14:09 INFO  Executor:54 - Finished task 1.0 in stage 0.0 (TID 1). 824 bytes result sent to driver
2019-02-21 22:14:09 INFO  TaskSetManager:54 - Finished task 0.0 in stage 0.0 (TID 0) in 2360 ms on localhost (executor driver) (1/2)
2019-02-21 22:14:10 INFO  TaskSetManager:54 - Finished task 1.0 in stage 0.0 (TID 1) in 180 ms on localhost (executor driver) (2/2)
2019-02-21 22:14:10 INFO  TaskSchedulerImpl:54 - Removed TaskSet 0.0, whose tasks have all completed, from pool 
2019-02-21 22:14:10 INFO  DAGScheduler:54 - ResultStage 0 (reduce at SparkPi.scala:38) finished in 3.577 s
2019-02-21 22:14:10 INFO  DAGScheduler:54 - Job 0 finished: reduce at SparkPi.scala:38, took 4.506013 s
Pi is roughly 3.142475712378562
2019-02-21 22:14:10 INFO  AbstractConnector:318 - Stopped Spark@40e4ea87{HTTP/1.1,[http/1.1]}{192.168.217.201:4040}
2019-02-21 22:14:10 INFO  SparkUI:54 - Stopped Spark web UI at http://hadoop1.org.cn:4040
2019-02-21 22:14:10 INFO  MapOutputTrackerMasterEndpoint:54 - MapOutputTrackerMasterEndpoint stopped!
2019-02-21 22:14:10 INFO  MemoryStore:54 - MemoryStore cleared
2019-02-21 22:14:10 INFO  BlockManager:54 - BlockManager stopped
2019-02-21 22:14:10 INFO  BlockManagerMaster:54 - BlockManagerMaster stopped
2019-02-21 22:14:10 INFO  OutputCommitCoordinator$OutputCommitCoordinatorEndpoint:54 - OutputCommitCoordinator stopped!
2019-02-21 22:14:11 INFO  SparkContext:54 - Successfully stopped SparkContext
2019-02-21 22:14:11 INFO  ShutdownHookManager:54 - Shutdown hook called
2019-02-21 22:14:11 INFO  ShutdownHookManager:54 - Deleting directory /tmp/spark-3f8eab55-786c-4663-9f99-ca779610ee0d
2019-02-21 22:14:11 INFO  ShutdownHookManager:54 - Deleting directory /tmp/spark-e9e2c8b5-0a08-4dda-9d3d-4a4f9671b0d9
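Note that the plain local master URL used above runs with a single worker thread. The bracketed variants below control local parallelism; they are values for --master, not standalone commands, so they are only echoed here:

```shell
# Values accepted by --master for local mode; the run above used the first.
LOCAL_ONE="local"       # one worker thread, no parallelism
LOCAL_FOUR="local[4]"   # four worker threads
LOCAL_ALL="local[*]"    # one worker thread per logical core
echo "--master $LOCAL_ONE"
echo "--master $LOCAL_FOUR"
echo "--master $LOCAL_ALL"
```

Quote local[*] on the command line so the shell does not try to glob the asterisk.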

Standalone mode

[root@hadoop1 spark-2.4.0-bin-hadoop2.7]# ./bin/spark-submit --class org.apache.spark.examples.SparkPi --master spark://192.168.217.201:7077 examples/jars/spark-examples_2.11-2.4.0.jar
2019-02-21 22:19:37 WARN  NativeCodeLoader:62 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2019-02-21 22:19:38 INFO  SparkContext:54 - Running Spark version 2.4.0
2019-02-21 22:19:38 INFO  SparkContext:54 - Submitted application: Spark Pi
2019-02-21 22:19:39 INFO  SecurityManager:54 - Changing view acls to: root
2019-02-21 22:19:39 INFO  SecurityManager:54 - Changing modify acls to: root
2019-02-21 22:19:39 INFO  SecurityManager:54 - Changing view acls groups to: 
2019-02-21 22:19:39 INFO  SecurityManager:54 - Changing modify acls groups to: 
2019-02-21 22:19:39 INFO  SecurityManager:54 - SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(root); groups with view permissions: Set(); users  with modify permissions: Set(root); groups with modify permissions: Set()
2019-02-21 22:19:40 INFO  Utils:54 - Successfully started service 'sparkDriver' on port 40178.
2019-02-21 22:19:40 INFO  SparkEnv:54 - Registering MapOutputTracker
2019-02-21 22:19:40 INFO  SparkEnv:54 - Registering BlockManagerMaster
2019-02-21 22:19:40 INFO  BlockManagerMasterEndpoint:54 - Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
2019-02-21 22:19:40 INFO  BlockManagerMasterEndpoint:54 - BlockManagerMasterEndpoint up
2019-02-21 22:19:40 INFO  DiskBlockManager:54 - Created local directory at /tmp/blockmgr-07435023-9ee3-4394-9217-79c2902a5a4d
2019-02-21 22:19:40 INFO  MemoryStore:54 - MemoryStore started with capacity 413.9 MB
2019-02-21 22:19:40 INFO  SparkEnv:54 - Registering OutputCommitCoordinator
2019-02-21 22:19:40 INFO  log:192 - Logging initialized @9124ms
2019-02-21 22:19:40 INFO  Server:351 - jetty-9.3.z-SNAPSHOT, build timestamp: unknown, git hash: unknown
2019-02-21 22:19:40 INFO  Server:419 - Started @9269ms
2019-02-21 22:19:40 INFO  AbstractConnector:278 - Started ServerConnector@3a7b503d{HTTP/1.1,[http/1.1]}{192.168.217.201:4040}
2019-02-21 22:19:40 INFO  Utils:54 - Successfully started service 'SparkUI' on port 4040.
2019-02-21 22:19:40 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@6058e535{/jobs,null,AVAILABLE,@Spark}
2019-02-21 22:19:40 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@6e9c413e{/jobs/json,null,AVAILABLE,@Spark}
2019-02-21 22:19:40 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@57a4d5ee{/jobs/job,null,AVAILABLE,@Spark}
2019-02-21 22:19:40 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@3a45c42a{/jobs/job/json,null,AVAILABLE,@Spark}
2019-02-21 22:19:40 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@36dce7ed{/stages,null,AVAILABLE,@Spark}
2019-02-21 22:19:40 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@47a64f7d{/stages/json,null,AVAILABLE,@Spark}
2019-02-21 22:19:40 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@33d05366{/stages/stage,null,AVAILABLE,@Spark}
2019-02-21 22:19:40 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@33aa93c{/stages/stage/json,null,AVAILABLE,@Spark}
2019-02-21 22:19:40 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@32c0915e{/stages/pool,null,AVAILABLE,@Spark}
2019-02-21 22:19:40 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@106faf11{/stages/pool/json,null,AVAILABLE,@Spark}
2019-02-21 22:19:40 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@70f43b45{/storage,null,AVAILABLE,@Spark}
2019-02-21 22:19:40 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@26d10f2e{/storage/json,null,AVAILABLE,@Spark}
2019-02-21 22:19:40 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@10ad20cb{/storage/rdd,null,AVAILABLE,@Spark}
2019-02-21 22:19:40 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@7dd712e8{/storage/rdd/json,null,AVAILABLE,@Spark}
2019-02-21 22:19:40 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@2c282004{/environment,null,AVAILABLE,@Spark}
2019-02-21 22:19:40 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@22ee2d0{/environment/json,null,AVAILABLE,@Spark}
2019-02-21 22:19:40 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@7bfc3126{/executors,null,AVAILABLE,@Spark}
2019-02-21 22:19:40 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@3e792ce3{/executors/json,null,AVAILABLE,@Spark}
2019-02-21 22:19:40 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@53bc1328{/executors/threadDump,null,AVAILABLE,@Spark}
2019-02-21 22:19:40 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@26f143ed{/executors/threadDump/json,null,AVAILABLE,@Spark}
2019-02-21 22:19:40 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@3c1e3314{/static,null,AVAILABLE,@Spark}
2019-02-21 22:19:40 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@11963225{/,null,AVAILABLE,@Spark}
2019-02-21 22:19:40 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@3f3c966c{/api,null,AVAILABLE,@Spark}
2019-02-21 22:19:40 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@3a71c100{/jobs/job/kill,null,AVAILABLE,@Spark}
2019-02-21 22:19:40 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@5b69fd74{/stages/stage/kill,null,AVAILABLE,@Spark}
2019-02-21 22:19:41 INFO  SparkUI:54 - Bound SparkUI to 192.168.217.201, and started at http://hadoop1.org.cn:4040
2019-02-21 22:19:41 INFO  SparkContext:54 - Added JAR file:/usr/hdp/spark-2.4.0-bin-hadoop2.7/examples/jars/spark-examples_2.11-2.4.0.jar at spark://hadoop1.org.cn:40178/jars/spark-examples_2.11-2.4.0.jar with timestamp 1550758781088
2019-02-21 22:19:41 INFO  StandaloneAppClient$ClientEndpoint:54 - Connecting to master spark://192.168.217.201:7077...
2019-02-21 22:19:41 INFO  TransportClientFactory:267 - Successfully created connection to /192.168.217.201:7077 after 83 ms (0 ms spent in bootstraps)
2019-02-21 22:19:42 INFO  StandaloneSchedulerBackend:54 - Connected to Spark cluster with app ID app-20190221221942-0000
2019-02-21 22:19:42 INFO  Utils:54 - Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 55988.
2019-02-21 22:19:42 INFO  NettyBlockTransferService:54 - Server created on hadoop1.org.cn:55988
2019-02-21 22:19:42 INFO  BlockManager:54 - Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
2019-02-21 22:19:42 INFO  StandaloneAppClient$ClientEndpoint:54 - Executor added: app-20190221221942-0000/0 on worker-2019022122063.217.202-51708 (192.168.217.202:51708) with 2 core(s)
2019-02-21 22:19:42 INFO  StandaloneSchedulerBackend:54 - Granted executor ID app-20190221221942-0000/0 on hostPort 192.168.217.202:51708 with 2 core(s), 1024.0 MB RAM
2019-02-21 22:19:42 INFO  StandaloneAppClient$ClientEndpoint:54 - Executor added: app-20190221221942-0000/1 on worker-2019022122063.217.203-54960 (192.168.217.203:54960) with 2 core(s)
2019-02-21 22:19:42 INFO  StandaloneSchedulerBackend:54 - Granted executor ID app-20190221221942-0000/1 on hostPort 192.168.217.203:54960 with 2 core(s), 1024.0 MB RAM
2019-02-21 22:19:42 INFO  StandaloneAppClient$ClientEndpoint:54 - Executor added: app-20190221221942-0000/2 on worker-2019022122063.217.201-45088 (192.168.217.201:45088) with 2 core(s)
2019-02-21 22:19:42 INFO  StandaloneSchedulerBackend:54 - Granted executor ID app-20190221221942-0000/2 on hostPort 192.168.217.201:45088 with 2 core(s), 1024.0 MB RAM
2019-02-21 22:19:42 INFO  StandaloneAppClient$ClientEndpoint:54 - Executor updated: app-20190221221942-0000/0 is now RUNNING
2019-02-21 22:19:42 INFO  StandaloneAppClient$ClientEndpoint:54 - Executor updated: app-20190221221942-0000/1 is now RUNNING
2019-02-21 22:19:42 INFO  BlockManagerMaster:54 - Registering BlockManager BlockManagerId(driver, hadoop1.org.cn, 55988, None)
2019-02-21 22:19:43 INFO  BlockManagerMasterEndpoint:54 - Registering block manager hadoop1.org.cn:55988 with 413.9 MB RAM, BlockManagerId(driver, hadoop1.org.cn, 55988, None)
2019-02-21 22:19:43 INFO  StandaloneAppClient$ClientEndpoint:54 - Executor updated: app-20190221221942-0000/2 is now RUNNING
2019-02-21 22:19:43 INFO  BlockManagerMaster:54 - Registered BlockManager BlockManagerId(driver, hadoop1.org.cn, 55988, None)
2019-02-21 22:19:43 INFO  BlockManager:54 - Initialized BlockManager: BlockManagerId(driver, hadoop1.org.cn, 55988, None)
2019-02-21 22:19:44 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@726a17c4{/metrics/json,null,AVAILABLE,@Spark}
2019-02-21 22:19:45 INFO  StandaloneSchedulerBackend:54 - SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.0
2019-02-21 22:19:48 INFO  CoarseGrainedSchedulerBackend$DriverEndpoint:54 - Registered executor NettyRpcEndpointRef(spark-client:// (192.168.217.202:38810) with ID 0
2019-02-21 22:19:49 INFO  CoarseGrainedSchedulerBackend$DriverEndpoint:54 - Registered executor NettyRpcEndpointRef(spark-client:// (192.168.217.203:47346) with ID 1
2019-02-21 22:19:49 INFO  BlockManagerMasterEndpoint:54 - Registering block manager 192.168.217.202:50169 with 413.9 MB RAM, BlockManagerId(0, 192.168.217.202, 50169, None)
2019-02-21 22:19:50 INFO  BlockManagerMasterEndpoint:54 - Registering block manager 192.168.217.203:47358 with 413.9 MB RAM, BlockManagerId(1, 192.168.217.203, 47358, None)
2019-02-21 22:19:55 INFO  SparkContext:54 - Starting job: reduce at SparkPi.scala:38
2019-02-21 22:19:56 INFO  DAGScheduler:54 - Got job 0 (reduce at SparkPi.scala:38) with 2 output partitions
2019-02-21 22:19:56 INFO  DAGScheduler:54 - Final stage: ResultStage 0 (reduce at SparkPi.scala:38)
2019-02-21 22:19:56 INFO  DAGScheduler:54 - Parents of final stage: List()
2019-02-21 22:19:56 INFO  DAGScheduler:54 - Missing parents: List()
2019-02-21 22:19:56 INFO  DAGScheduler:54 - Submitting ResultStage 0 (MapPartitionsRDD[1] at map at SparkPi.scala:34), which has no missing parents
2019-02-21 22:20:03 INFO  MemoryStore:54 - Block broadcast_0 stored as values in memory (estimated size 1936.0 B, free 413.9 MB)
2019-02-21 22:20:04 INFO  MemoryStore:54 - Block broadcast_0_piece0 stored as bytes in memory (estimated size 1256.0 B, free 413.9 MB)
2019-02-21 22:20:04 INFO  BlockManagerInfo:54 - Added broadcast_0_piece0 in memory on hadoop1.org.cn:55988 (size: 1256.0 B, free: 413.9 MB)
2019-02-21 22:20:05 INFO  SparkContext:54 - Created broadcast 0 from broadcast at DAGScheduler.scala:1161
2019-02-21 22:20:05 INFO  DAGScheduler:54 - Submitting 2 missing tasks from ResultStage 0 (MapPartitionsRDD[1] at map at SparkPi.scala:34) (first 15 tasks are for partitions Vector(0, 1))
2019-02-21 22:20:05 INFO  TaskSchedulerImpl:54 - Adding task set 0.0 with 2 tasks
2019-02-21 22:20:07 INFO  TaskSetManager:54 - Starting task 0.0 in stage 0.0 (TID 0, 192.168.217.202, executor 0, partition 0, PROCESS_LOCAL, 7870 bytes)
2019-02-21 22:20:07 INFO  TaskSetManager:54 - Starting task 1.0 in stage 0.0 (TID 1, 192.168.217.203, executor 1, partition 1, PROCESS_LOCAL, 7870 bytes)
2019-02-21 22:20:12 INFO  BlockManagerInfo:54 - Added broadcast_0_piece0 in memory on 192.168.217.203:47358 (size: 1256.0 B, free: 
2019-02-21 22:20:12 INFO  BlockManagerInfo:54 - Added broadcast_0_piece0 in memory on 192.168.217.202:50169 (size: 1256.0 B, free: 
2019-02-21 22:20:15 INFO  TaskSetManager:54 - Finished task 0.0 in stage 0.0 (TID 0) in 8180 ms on 192.168.217.202 (executor 0) (1/2)
2019-02-21 22:20:15 INFO  TaskSetManager:54 - Finished task 1.0 in stage 0.0 (TID 1) in 8179 ms on 192.168.217.203 (executor 1) (2/2)
2019-02-21 22:20:15 INFO  DAGScheduler:54 - ResultStage 0 (reduce at SparkPi.scala:38) finished in 16.998 s
2019-02-21 22:20:15 INFO  TaskSchedulerImpl:54 - Removed TaskSet 0.0, whose tasks have all completed, from pool 
2019-02-21 22:20:15 INFO  DAGScheduler:54 - Job 0 finished: reduce at SparkPi.scala:38, took 20.610491 s
Pi is roughly 3.1427357136785683
2019-02-21 22:20:16 INFO  AbstractConnector:318 - Stopped Spark@3a7b503d{HTTP/1.1,[http/1.1]}{192.168.217.201:4040}
2019-02-21 22:20:16 INFO  SparkUI:54 - Stopped Spark web UI at http://hadoop1.org.cn:4040
2019-02-21 22:20:17 INFO  StandaloneSchedulerBackend:54 - Shutting down all executors
2019-02-21 22:20:17 INFO  CoarseGrainedSchedulerBackend$DriverEndpoint:54 - Asking each executor to shut down
2019-02-21 22:20:18 INFO  StandaloneAppClient$ClientEndpoint:54 - Executor updated: app-20190221221942-0000/0 is now EXITED (Command exited with code 0)
2019-02-21 22:20:18 INFO  StandaloneSchedulerBackend:54 - Executor app-20190221221942-0000/0 removed: Command exited with code 0
2019-02-21 22:20:18 INFO  StandaloneAppClient$ClientEndpoint:54 - Executor added: app-20190221221942-0000/3 on worker-2019022122063.217.202-51708 (192.168.217.202:51708) with 2 core(s)
2019-02-21 22:20:18 INFO  StandaloneSchedulerBackend:54 - Granted executor ID app-20190221221942-0000/3 on hostPort 192.168.217.202:51708 with 2 core(s), 1024.0 MB RAM
2019-02-21 22:20:18 INFO  StandaloneAppClient$ClientEndpoint:54 - Executor updated: app-20190221221942-0000/3 is now RUNNING
2019-02-21 22:20:18 INFO  MapOutputTrackerMasterEndpoint:54 - MapOutputTrackerMasterEndpoint stopped!
2019-02-21 22:20:19 INFO  MemoryStore:54 - MemoryStore cleared
2019-02-21 22:20:19 INFO  BlockManager:54 - BlockManager stopped
2019-02-21 22:20:19 INFO  BlockManagerMaster:54 - BlockManagerMaster stopped
2019-02-21 22:20:19 INFO  OutputCommitCoordinator$OutputCommitCoordinatorEndpoint:54 - OutputCommitCoordinator stopped!
2019-02-21 22:20:19 INFO  SparkContext:54 - Successfully stopped SparkContext
2019-02-21 22:20:20 INFO  ShutdownHookManager:54 - Shutdown hook called
2019-02-21 22:20:20 INFO  ShutdownHookManager:54 - Deleting directory /tmp/spark-e9155121-edfe-4f64-b917-be3f9f62220a
2019-02-21 22:20:20 INFO  ShutdownHookManager:54 - Deleting directory /tmp/spark-90e9b57b-c3d4-4572-8eec-906f121d6b98
[root@hadoop1 spark-2.4.0-bin-hadoop2.7]# 

yarn-cluster mode

In yarn-cluster mode, the Spark job is handed over to YARN, which runs it for you. You therefore need to add export HADOOP_CONF_DIR=/usr/hdp/hadoop-2.8.3/etc/hadoop to the spark-env.sh file, and then submit the job:
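A minimal sketch of that configuration step follows. For demonstration it writes into a scratch conf/ directory; on a real install this file is $SPARK_HOME/conf/spark-env.sh, and the Hadoop path is the one used by this article's cluster (adjust to yours):

```shell
# Scratch directory standing in for $SPARK_HOME/conf.
mkdir -p conf
# Append the export so spark-submit --master yarn can find the YARN/HDFS client configs.
echo 'export HADOOP_CONF_DIR=/usr/hdp/hadoop-2.8.3/etc/hadoop' >> conf/spark-env.sh
# Verify the line landed in the file.
grep HADOOP_CONF_DIR conf/spark-env.sh
```

Without HADOOP_CONF_DIR (or YARN_CONF_DIR), spark-submit cannot locate the ResourceManager and the yarn master URL will fail.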

[root@hadoop1 spark-2.4.0-bin-hadoop2.7]# ./bin/spark-submit --class org.apache.spark.examples.SparkPi --master yarn --deploy-mode cluster examples/jars/spark-examples_2.11-2.4.0.jar 1000
2019-02-22 00:07:32 WARN  NativeCodeLoader:62 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2019-02-22 00:07:34 INFO  RMProxy:98 - Connecting to ResourceManager at hadoop1/192.168.217.201:8032
2019-02-22 00:07:36 INFO  Client:54 - Requesting a new application from cluster with 2 NodeManagers
2019-02-22 00:07:36 INFO  Client:54 - Verifying our application has not requested more than the maximum memory capability of the cluster (2048 MB per container)
2019-02-22 00:07:36 INFO  Client:54 - Will allocate AM container, with 1408 MB memory including 384 MB overhead
2019-02-22 00:07:36 INFO  Client:54 - Setting up container launch context for our AM
2019-02-22 00:07:37 INFO  Client:54 - Setting up the launch environment for our AM container
2019-02-22 00:07:37 INFO  Client:54 - Preparing resources for our AM container
2019-02-22 00:07:38 WARN  Client:66 - Neither spark.yarn.jars nor spark.yarn.archive is set, falling back to uploading libraries under SPARK_HOME.
2019-02-22 00:07:59 INFO  Client:54 - Uploading resource file:/tmp/spark-2b75f51c-ce24-4767-aa38-6d3262b1c7cb/__spark_libs__7092311691544510332.zip -> hdfs://hadoop1:9000/user/root/.sparkStaging/application_1550757972410_0001/__spark_libs__7092311691544510332.zip
2019-02-22 00:08:26 INFO  Client:54 - Uploading resource file:/usr/hdp/spark-2.4.0-bin-hadoop2.7/examples/jars/spark-examples_2.11-2.4.0.jar -> hdfs://hadoop1:9000/user/root/.sparkStaging/application_1550757972410_0001/spark-examples_2.11-2.4.0.jar
2019-02-22 00:08:27 INFO  Client:54 - Uploading resource file:/tmp/spark-2b75f51c-ce24-4767-aa38-6d3262b1c7cb/__spark_conf__4473735302996115715.zip -> hdfs://hadoop1:9000/user/root/.sparkStaging/application_1550757972410_0001/__spark_conf__.zip
2019-02-22 00:08:28 INFO  SecurityManager:54 - Changing view acls to: root
2019-02-22 00:08:28 INFO  SecurityManager:54 - Changing modify acls to: root
2019-02-22 00:08:28 INFO  SecurityManager:54 - Changing view acls groups to: 
2019-02-22 00:08:28 INFO  SecurityManager:54 - Changing modify acls groups to: 
2019-02-22 00:08:28 INFO  SecurityManager:54 - SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(root); groups with view permissions: Set(); users  with modify permissions: Set(root); groups with modify permissions: Set()
2019-02-22 00:08:31 INFO  Client:54 - Submitting application application_1550757972410_0001 to ResourceManager
2019-02-22 00:08:34 INFO  YarnClientImpl:273 - Submitted application application_1550757972410_0001
2019-02-22 00:08:36 INFO  Client:54 - Application report for application_1550757972410_0001 (state: ACCEPTED)
2019-02-22 00:08:36 INFO  Client:54 - 
     client token: N/A
     diagnostics: [Fri Feb 22 00:08:35 +0800 2019] Scheduler has assigned a container for AM, waiting for AM container to be launched
     ApplicationMaster host: N/A
     ApplicationMaster RPC port: -1
     queue: default
     start time: 1550765313392
     final status: UNDEFINED
     tracking URL: http://hadoop1:8088/proxy/application_1550757972410_0001/
     user: root
2019-02-22 00:08:37 INFO  Client:54 - Application report for application_1550757972410_0001 (state: ACCEPTED)
...(the same "Application report ... (state: ACCEPTED)" line repeats once per second until the application starts; repeats omitted)...
2019-02-22 00:09:34 INFO  Client:54 - Application report for application_1550757972410_0001 (state: RUNNING)
2019-02-22 00:09:34 INFO  Client:54 - 
     client token: N/A
     diagnostics: N/A
     ApplicationMaster host: hadoop2.org.cn
     ApplicationMaster RPC port: 43210
     queue: default
     start time: 1550765313392
     final status: UNDEFINED
     tracking URL: http://hadoop1:8088/proxy/application_1550757972410_0001/
     user: root
2019-02-22 00:09:35 INFO  Client:54 - Application report for application_1550757972410_0001 (state: RUNNING)
2019-02-22 00:09:36 INFO  Client:54 - Application report for application_1550757972410_0001 (state: RUNNING)
2019-02-22 00:09:37 INFO  Client:54 - Application report for application_1550757972410_0001 (state: RUNNING)
2019-02-22 00:09:38 INFO  Client:54 - Application report for application_1550757972410_0001 (state: RUNNING)
2019-02-22 00:09:39 INFO  Client:54 - Application report for application_1550757972410_0001 (state: RUNNING)
2019-02-22 00:09:40 INFO  Client:54 - Application report for application_1550757972410_0001 (state: RUNNING)
2019-02-22 00:09:41 INFO  Client:54 - Application report for application_1550757972410_0001 (state: RUNNING)
2019-02-22 00:09:42 INFO  Client:54 - Application report for application_1550757972410_0001 (state: RUNNING)
2019-02-22 00:09:43 INFO  Client:54 - Application report for application_1550757972410_0001 (state: RUNNING)
2019-02-22 00:09:44 INFO  Client:54 - Application report for application_1550757972410_0001 (state: RUNNING)
2019-02-22 00:09:45 INFO  Client:54 - Application report for application_1550757972410_0001 (state: RUNNING)
2019-02-22 00:09:46 INFO  Client:54 - Application report for application_1550757972410_0001 (state: RUNNING)
2019-02-22 00:09:47 INFO  Client:54 - Application report for application_1550757972410_0001 (state: RUNNING)
2019-02-22 00:09:48 INFO  Client:54 - Application report for application_1550757972410_0001 (state: RUNNING)
2019-02-22 00:09:49 INFO  Client:54 - Application report for application_1550757972410_0001 (state: RUNNING)
2019-02-22 00:09:50 INFO  Client:54 - Application report for application_1550757972410_0001 (state: RUNNING)
2019-02-22 00:09:51 INFO  Client:54 - Application report for application_1550757972410_0001 (state: RUNNING)
2019-02-22 00:09:52 INFO  Client:54 - Application report for application_1550757972410_0001 (state: RUNNING)
2019-02-22 00:09:53 INFO  Client:54 - Application report for application_1550757972410_0001 (state: RUNNING)
2019-02-22 00:09:54 INFO  Client:54 - Application report for application_1550757972410_0001 (state: RUNNING)
2019-02-22 00:09:55 INFO  Client:54 - Application report for application_1550757972410_0001 (state: RUNNING)
2019-02-22 00:09:56 INFO  Client:54 - Application report for application_1550757972410_0001 (state: RUNNING)
2019-02-22 00:09:57 INFO  Client:54 - Application report for application_1550757972410_0001 (state: RUNNING)
2019-02-22 00:09:58 INFO  Client:54 - Application report for application_1550757972410_0001 (state: RUNNING)
2019-02-22 00:09:59 INFO  Client:54 - Application report for application_1550757972410_0001 (state: RUNNING)
2019-02-22 00:10:00 INFO  Client:54 - Application report for application_1550757972410_0001 (state: RUNNING)
2019-02-22 00:10:01 INFO  Client:54 - Application report for application_1550757972410_0001 (state: RUNNING)
2019-02-22 00:10:02 INFO  Client:54 - Application report for application_1550757972410_0001 (state: RUNNING)
2019-02-22 00:10:03 INFO  Client:54 - Application report for application_1550757972410_0001 (state: RUNNING)
2019-02-22 00:10:04 INFO  Client:54 - Application report for application_1550757972410_0001 (state: RUNNING)
2019-02-22 00:10:05 INFO  Client:54 - Application report for application_1550757972410_0001 (state: RUNNING)
2019-02-22 00:10:06 INFO  Client:54 - Application report for application_1550757972410_0001 (state: RUNNING)
2019-02-22 00:10:07 INFO  Client:54 - Application report for application_1550757972410_0001 (state: RUNNING)
2019-02-22 00:10:08 INFO  Client:54 - Application report for application_1550757972410_0001 (state: RUNNING)
2019-02-22 00:10:09 INFO  Client:54 - Application report for application_1550757972410_0001 (state: RUNNING)
2019-02-22 00:10:10 INFO  Client:54 - Application report for application_1550757972410_0001 (state: RUNNING)
2019-02-22 00:10:11 INFO  Client:54 - Application report for application_1550757972410_0001 (state: RUNNING)
2019-02-22 00:10:12 INFO  Client:54 - Application report for application_1550757972410_0001 (state: RUNNING)
2019-02-22 00:10:13 INFO  Client:54 - Application report for application_1550757972410_0001 (state: RUNNING)
2019-02-22 00:10:14 INFO  Client:54 - Application report for application_1550757972410_0001 (state: RUNNING)
2019-02-22 00:10:15 INFO  Client:54 - Application report for application_1550757972410_0001 (state: RUNNING)
2019-02-22 00:10:16 INFO  Client:54 - Application report for application_1550757972410_0001 (state: RUNNING)
2019-02-22 00:10:17 INFO  Client:54 - Application report for application_1550757972410_0001 (state: RUNNING)
2019-02-22 00:10:18 INFO  Client:54 - Application report for application_1550757972410_0001 (state: RUNNING)
2019-02-22 00:10:19 INFO  Client:54 - Application report for application_1550757972410_0001 (state: RUNNING)
2019-02-22 00:10:20 INFO  Client:54 - Application report for application_1550757972410_0001 (state: RUNNING)
2019-02-22 00:10:21 INFO  Client:54 - Application report for application_1550757972410_0001 (state: RUNNING)
2019-02-22 00:10:22 INFO  Client:54 - Application report for application_1550757972410_0001 (state: RUNNING)
2019-02-22 00:10:23 INFO  Client:54 - Application report for application_1550757972410_0001 (state: RUNNING)
2019-02-22 00:10:24 INFO  Client:54 - Application report for application_1550757972410_0001 (state: RUNNING)
2019-02-22 00:10:25 INFO  Client:54 - Application report for application_1550757972410_0001 (state: RUNNING)
2019-02-22 00:10:26 INFO  Client:54 - Application report for application_1550757972410_0001 (state: RUNNING)
2019-02-22 00:10:27 INFO  Client:54 - Application report for application_1550757972410_0001 (state: RUNNING)
2019-02-22 00:10:28 INFO  Client:54 - Application report for application_1550757972410_0001 (state: RUNNING)
2019-02-22 00:10:29 INFO  Client:54 - Application report for application_1550757972410_0001 (state: RUNNING)
2019-02-22 00:10:30 INFO  Client:54 - Application report for application_1550757972410_0001 (state: RUNNING)
2019-02-22 00:10:31 INFO  Client:54 - Application report for application_1550757972410_0001 (state: RUNNING)
2019-02-22 00:10:32 INFO  Client:54 - Application report for application_1550757972410_0001 (state: RUNNING)
2019-02-22 00:10:33 INFO  Client:54 - Application report for application_1550757972410_0001 (state: RUNNING)
2019-02-22 00:10:35 INFO  Client:54 - Application report for application_1550757972410_0001 (state: RUNNING)
2019-02-22 00:10:36 INFO  Client:54 - Application report for application_1550757972410_0001 (state: RUNNING)
2019-02-22 00:10:37 INFO  Client:54 - Application report for application_1550757972410_0001 (state: RUNNING)
2019-02-22 00:10:38 INFO  Client:54 - Application report for application_1550757972410_0001 (state: RUNNING)
2019-02-22 00:10:39 INFO  Client:54 - Application report for application_1550757972410_0001 (state: RUNNING)
2019-02-22 00:10:40 INFO  Client:54 - Application report for application_1550757972410_0001 (state: RUNNING)
2019-02-22 00:10:41 INFO  Client:54 - Application report for application_1550757972410_0001 (state: RUNNING)
2019-02-22 00:10:42 INFO  Client:54 - Application report for application_1550757972410_0001 (state: RUNNING)
2019-02-22 00:10:43 INFO  Client:54 - Application report for application_1550757972410_0001 (state: FINISHED)
2019-02-22 00:10:43 INFO  Client:54 - 
     client token: N/A
     diagnostics: N/A
     ApplicationMaster host: hadoop2.org.cn
     ApplicationMaster RPC port: 43210
     queue: default
     start time: 1550765313392
     final status: SUCCEEDED
     tracking URL: http://hadoop1:8088/proxy/application_1550757972410_0001/
     user: root
2019-02-22 00:10:44 INFO  ShutdownHookManager:54 - Shutdown hook called
2019-02-22 00:10:44 INFO  ShutdownHookManager:54 - Deleting directory /tmp/spark-afb05931-e273-4e9f-b38a-d3ca234dfb34
2019-02-22 00:10:44 INFO  ShutdownHookManager:54 - Deleting directory /tmp/spark-2b75f51c-ce24-4767-aa38-6d3262b1c7cb
[root@hadoop1 spark-2.4.0-bin-hadoop2.7]# 

The corresponding Python and Kubernetes examples run the same way, so they are not repeated here.

yarn-client mode

[root@hadoop1 spark-2.4.0-bin-hadoop2.7]# ./bin/spark-submit --class org.apache.spark.examples.SparkPi --master yarn-client examples/jars/spark-examples_2.11-2.4.0.jar
Warning: Master yarn-client is deprecated since 2.0. Please use master "yarn" with specified deploy mode instead.
2019-02-22 00:31:15 WARN  NativeCodeLoader:62 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2019-02-22 00:31:16 INFO  SparkContext:54 - Running Spark version 2.4.0
2019-02-22 00:31:16 INFO  SparkContext:54 - Submitted application: Spark Pi
2019-02-22 00:31:16 INFO  SecurityManager:54 - Changing view acls to: root
2019-02-22 00:31:16 INFO  SecurityManager:54 - Changing modify acls to: root
2019-02-22 00:31:16 INFO  SecurityManager:54 - Changing view acls groups to: 
2019-02-22 00:31:16 INFO  SecurityManager:54 - Changing modify acls groups to: 
2019-02-22 00:31:16 INFO  SecurityManager:54 - SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(root); groups with view permissions: Set(); users  with modify permissions: Set(root); groups with modify permissions: Set()
2019-02-22 00:31:18 INFO  Utils:54 - Successfully started service 'sparkDriver' on port 34169.
2019-02-22 00:31:18 INFO  SparkEnv:54 - Registering MapOutputTracker
2019-02-22 00:31:18 INFO  SparkEnv:54 - Registering BlockManagerMaster
2019-02-22 00:31:18 INFO  BlockManagerMasterEndpoint:54 - Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
2019-02-22 00:31:18 INFO  BlockManagerMasterEndpoint:54 - BlockManagerMasterEndpoint up
2019-02-22 00:31:18 INFO  DiskBlockManager:54 - Created local directory at /tmp/blockmgr-f9baa979-e964-46a9-b034-475ba5148562
2019-02-22 00:31:18 INFO  MemoryStore:54 - MemoryStore started with capacity 413.9 MB
2019-02-22 00:31:18 INFO  SparkEnv:54 - Registering OutputCommitCoordinator
2019-02-22 00:31:18 INFO  log:192 - Logging initialized @9175ms
2019-02-22 00:31:19 INFO  Server:351 - jetty-9.3.z-SNAPSHOT, build timestamp: unknown, git hash: unknown
2019-02-22 00:31:19 INFO  Server:419 - Started @9359ms
2019-02-22 00:31:19 INFO  AbstractConnector:278 - Started ServerConnector@47a64f7d{HTTP/1.1,[http/1.1]}{192.168.217.201:4040}
2019-02-22 00:31:19 INFO  Utils:54 - Successfully started service 'SparkUI' on port 4040.
2019-02-22 00:31:19 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@49ef32e0{/jobs,null,AVAILABLE,@Spark}
2019-02-22 00:31:19 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@3be8821f{/jobs/json,null,AVAILABLE,@Spark}
2019-02-22 00:31:19 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@64b31700{/jobs/job,null,AVAILABLE,@Spark}
2019-02-22 00:31:19 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@bae47a0{/jobs/job/json,null,AVAILABLE,@Spark}
2019-02-22 00:31:19 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@74a9c4b0{/stages,null,AVAILABLE,@Spark}
2019-02-22 00:31:19 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@85ec632{/stages/json,null,AVAILABLE,@Spark}
2019-02-22 00:31:19 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@1c05a54d{/stages/stage,null,AVAILABLE,@Spark}
2019-02-22 00:31:19 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@214894fc{/stages/stage/json,null,AVAILABLE,@Spark}
2019-02-22 00:31:19 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@10567255{/stages/pool,null,AVAILABLE,@Spark}
2019-02-22 00:31:19 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@e362c57{/stages/pool/json,null,AVAILABLE,@Spark}
2019-02-22 00:31:19 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@1c4ee95c{/storage,null,AVAILABLE,@Spark}
2019-02-22 00:31:19 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@79c4715d{/storage/json,null,AVAILABLE,@Spark}
2019-02-22 00:31:19 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@5aa360ea{/storage/rdd,null,AVAILABLE,@Spark}
2019-02-22 00:31:19 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@6548bb7d{/storage/rdd/json,null,AVAILABLE,@Spark}
2019-02-22 00:31:19 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@e27ba81{/environment,null,AVAILABLE,@Spark}
2019-02-22 00:31:19 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@54336c81{/environment/json,null,AVAILABLE,@Spark}
2019-02-22 00:31:19 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@1556f2dd{/executors,null,AVAILABLE,@Spark}
2019-02-22 00:31:19 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@35e52059{/executors/json,null,AVAILABLE,@Spark}
2019-02-22 00:31:19 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@62577d6{/executors/threadDump,null,AVAILABLE,@Spark}
2019-02-22 00:31:19 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@49bd54f7{/executors/threadDump/json,null,AVAILABLE,@Spark}
2019-02-22 00:31:19 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@6b5f8707{/static,null,AVAILABLE,@Spark}
2019-02-22 00:31:19 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@17ae98d7{/,null,AVAILABLE,@Spark}
2019-02-22 00:31:19 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@59221b97{/api,null,AVAILABLE,@Spark}
2019-02-22 00:31:19 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@704b2127{/jobs/job/kill,null,AVAILABLE,@Spark}
2019-02-22 00:31:19 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@3ee39da0{/stages/stage/kill,null,AVAILABLE,@Spark}
2019-02-22 00:31:19 INFO  SparkUI:54 - Bound SparkUI to 192.168.217.201, and started at http://hadoop1.org.cn:4040
2019-02-22 00:31:19 INFO  SparkContext:54 - Added JAR file:/usr/hdp/spark-2.4.0-bin-hadoop2.7/examples/jars/spark-examples_2.11-2.4.0.jar at spark://hadoop1.org.cn:34169/jars/spark-examples_2.11-2.4.0.jar with timestamp 1550766679506
2019-02-22 00:31:22 INFO  RMProxy:98 - Connecting to ResourceManager at hadoop1/192.168.217.201:8032
2019-02-22 00:31:22 INFO  Client:54 - Requesting a new application from cluster with 2 NodeManagers
2019-02-22 00:31:22 INFO  Client:54 - Verifying our application has not requested more than the maximum memory capability of the cluster (2048 MB per container)
2019-02-22 00:31:22 INFO  Client:54 - Will allocate AM container, with 896 MB memory including 384 MB overhead
2019-02-22 00:31:22 INFO  Client:54 - Setting up container launch context for our AM
2019-02-22 00:31:22 INFO  Client:54 - Setting up the launch environment for our AM container
2019-02-22 00:31:22 INFO  Client:54 - Preparing resources for our AM container
2019-02-22 00:31:22 WARN  Client:66 - Neither spark.yarn.jars nor spark.yarn.archive is set, falling back to uploading libraries under SPARK_HOME.
2019-02-22 00:31:38 INFO  Client:54 - Uploading resource file:/tmp/spark-bc395b60-843f-4e24-841c-1fb09330b89f/__spark_libs__4384247224971462772.zip -> hdfs://hadoop1:9000/user/root/.sparkStaging/application_1550757972410_0006/__spark_libs__4384247224971462772.zip
2019-02-22 00:31:53 INFO  Client:54 - Uploading resource file:/tmp/spark-bc395b60-843f-4e24-841c-1fb09330b89f/__spark_conf__7312670304741942310.zip -> hdfs://hadoop1:9000/user/root/.sparkStaging/application_1550757972410_0006/__spark_conf__.zip
2019-02-22 00:31:53 INFO  SecurityManager:54 - Changing view acls to: root
2019-02-22 00:31:53 INFO  SecurityManager:54 - Changing modify acls to: root
2019-02-22 00:31:53 INFO  SecurityManager:54 - Changing view acls groups to: 
2019-02-22 00:31:53 INFO  SecurityManager:54 - Changing modify acls groups to: 
2019-02-22 00:31:53 INFO  SecurityManager:54 - SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(root); groups with view permissions: Set(); users  with modify permissions: Set(root); groups with modify permissions: Set()
2019-02-22 00:31:56 INFO  Client:54 - Submitting application application_1550757972410_0006 to ResourceManager
2019-02-22 00:31:56 INFO  YarnClientImpl:273 - Submitted application application_1550757972410_0006
2019-02-22 00:31:56 INFO  SchedulerExtensionServices:54 - Starting Yarn extension services with app application_1550757972410_0006 and attemptId None
2019-02-22 00:31:57 INFO  Client:54 - Application report for application_1550757972410_0006 (state: ACCEPTED)
2019-02-22 00:31:57 INFO  Client:54 - 
     client token: N/A
     diagnostics: AM container is launched, waiting for AM container to Register with RM
     ApplicationMaster host: N/A
     ApplicationMaster RPC port: -1
     queue: default
     start time: 1550766716248
     final status: UNDEFINED
     tracking URL: http://hadoop1:8088/proxy/application_1550757972410_0006/
     user: root
...(the same "Application report ... (state: ACCEPTED)" line repeats once per second until the application starts; repeats omitted)...
2019-02-22 00:32:31 INFO  Client:54 - Application report for application_1550757972410_0006 (state: RUNNING)
2019-02-22 00:32:31 INFO  Client:54 - 
     client token: N/A
     diagnostics: N/A
     ApplicationMaster host: 192.168.217.203
     ApplicationMaster RPC port: -1
     queue: default
     start time: 1550766716248
     final status: UNDEFINED
     tracking URL: http://hadoop1:8088/proxy/application_1550757972410_0006/
     user: root
2019-02-22 00:32:31 INFO  YarnClientSchedulerBackend:54 - Application application_1550757972410_0006 has started running.
2019-02-22 00:32:31 INFO  Utils:54 - Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 51287.
2019-02-22 00:32:31 INFO  NettyBlockTransferService:54 - Server created on hadoop1.org.cn:51287
2019-02-22 00:32:31 INFO  BlockManager:54 - Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
2019-02-22 00:32:31 INFO  BlockManagerMaster:54 - Registering BlockManager BlockManagerId(driver, hadoop1.org.cn, 51287, None)
2019-02-22 00:32:31 INFO  BlockManagerMasterEndpoint:54 - Registering block manager hadoop1.org.cn:51287 with 413.9 MB RAM, BlockManagerId(driver, hadoop1.org.cn, 51287, None)
2019-02-22 00:32:31 INFO  BlockManagerMaster:54 - Registered BlockManager BlockManagerId(driver, hadoop1.org.cn, 51287, None)
2019-02-22 00:32:31 INFO  BlockManager:54 - Initialized BlockManager: BlockManagerId(driver, hadoop1.org.cn, 51287, None)
2019-02-22 00:32:32 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@1788cb61{/metrics/json,null,AVAILABLE,@Spark}
2019-02-22 00:32:32 INFO  YarnClientSchedulerBackend:54 - Add WebUI Filter. org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter, Map(PROXY_HOSTS -> hadoop1, PROXY_URI_BASES -> http://hadoop1:8088/proxy/application_1550757972410_0006), /proxy/application_1550757972410_0006
2019-02-22 00:32:32 INFO  JettyUtils:54 - Adding filter org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter to /jobs, /jobs/json, /jobs/job, /jobs/job/json, /stages, /stages/json, /stages/stage, /stages/stage/json, /stages/pool, /stages/pool/json, /storage, /storage/json, /storage/rdd, /storage/rdd/json, /environment, /environment/json, /executors, /executors/json, /executors/threadDump, /executors/threadDump/json, /static, /, /api, /jobs/job/kill, /stages/stage/kill, /metrics/json.
2019-02-22 00:32:32 INFO  YarnClientSchedulerBackend:54 - SchedulerBackend is ready for scheduling beginning after waiting maxRegisteredResourcesWaitingTime: 30000(ms)
2019-02-22 00:32:33 INFO  YarnSchedulerBackend$YarnSchedulerEndpoint:54 - ApplicationMaster registered as NettyRpcEndpointRef(spark-client://YarnAM)
2019-02-22 00:32:35 INFO  SparkContext:54 - Starting job: reduce at SparkPi.scala:38
2019-02-22 00:32:37 INFO  DAGScheduler:54 - Got job 0 (reduce at SparkPi.scala:38) with 2 output partitions
2019-02-22 00:32:37 INFO  DAGScheduler:54 - Final stage: ResultStage 0 (reduce at SparkPi.scala:38)
2019-02-22 00:32:37 INFO  DAGScheduler:54 - Parents of final stage: List()
2019-02-22 00:32:37 INFO  DAGScheduler:54 - Missing parents: List()
2019-02-22 00:32:37 INFO  DAGScheduler:54 - Submitting ResultStage 0 (MapPartitionsRDD[1] at map at SparkPi.scala:34), which has no missing parents
2019-02-22 00:32:47 INFO  MemoryStore:54 - Block broadcast_0 stored as values in memory (estimated size 1936.0 B, free 413.9 MB)
2019-02-22 00:32:48 INFO  MemoryStore:54 - Block broadcast_0_piece0 stored as bytes in memory (estimated size 1256.0 B, free 413.9 MB)
2019-02-22 00:32:48 INFO  BlockManagerInfo:54 - Added broadcast_0_piece0 in memory on hadoop1.org.cn:51287 (size: 1256.0 B, free: 413.9 MB)
2019-02-22 00:32:49 INFO  SparkContext:54 - Created broadcast 0 from broadcast at DAGScheduler.scala:1161
2019-02-22 00:32:49 INFO  DAGScheduler:54 - Submitting 2 missing tasks from ResultStage 0 (MapPartitionsRDD[1] at map at SparkPi.scala:34) (first 15 tasks are for partitions Vector(0, 1))
2019-02-22 00:32:49 INFO  YarnScheduler:54 - Adding task set 0.0 with 2 tasks
2019-02-22 00:33:04 WARN  YarnScheduler:66 - Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
2019-02-22 00:33:10 INFO  YarnSchedulerBackend$YarnDriverEndpoint:54 - Registered executor NettyRpcEndpointRef(spark-client://Executor) (192.168.217.202:43729) with ID 1
2019-02-22 00:33:11 INFO  TaskSetManager:54 - Starting task 0.0 in stage 0.0 (TID 0, hadoop2.org.cn, executor 1, partition 0, PROCESS_LOCAL, 7877 bytes)
2019-02-22 00:33:11 INFO  BlockManagerMasterEndpoint:54 - Registering block manager hadoop2.org.cn:33875 with 413.9 MB RAM, BlockManagerId(1, hadoop2.org.cn, 33875, None)
2019-02-22 00:33:12 INFO  BlockManagerInfo:54 - Added broadcast_0_piece0 in memory on hadoop2.org.cn:33875 (size: 1256.0 B, free: 413.9 MB)
2019-02-22 00:33:13 INFO  TaskSetManager:54 - Starting task 1.0 in stage 0.0 (TID 1, hadoop2.org.cn, executor 1, partition 1, PROCESS_LOCAL, 7877 bytes)
2019-02-22 00:33:13 INFO  TaskSetManager:54 - Finished task 0.0 in stage 0.0 (TID 0) in 2285 ms on hadoop2.org.cn (executor 1) (1/2)
2019-02-22 00:33:13 INFO  TaskSetManager:54 - Finished task 1.0 in stage 0.0 (TID 1) in 132 ms on hadoop2.org.cn (executor 1) (2/2)
2019-02-22 00:33:13 INFO  DAGScheduler:54 - ResultStage 0 (reduce at SparkPi.scala:38) finished in 34.651 s
2019-02-22 00:33:13 INFO  YarnScheduler:54 - Removed TaskSet 0.0, whose tasks have all completed, from pool 
2019-02-22 00:33:13 INFO  DAGScheduler:54 - Job 0 finished: reduce at SparkPi.scala:38, took 37.594449 s
Pi is roughly 3.14281571407857
2019-02-22 00:33:13 INFO  AbstractConnector:318 - Stopped Spark@47a64f7d{HTTP/1.1,[http/1.1]}{192.168.217.201:4040}
2019-02-22 00:33:13 INFO  SparkUI:54 - Stopped Spark web UI at http://hadoop1.org.cn:4040
2019-02-22 00:33:13 INFO  YarnClientSchedulerBackend:54 - Interrupting monitor thread
2019-02-22 00:33:13 INFO  YarnClientSchedulerBackend:54 - Shutting down all executors
2019-02-22 00:33:13 INFO  YarnSchedulerBackend$YarnDriverEndpoint:54 - Asking each executor to shut down
2019-02-22 00:33:14 INFO  SchedulerExtensionServices:54 - Stopping SchedulerExtensionServices
(serviceOption=None,
 services=List(),
 started=false)
2019-02-22 00:33:14 INFO  YarnClientSchedulerBackend:54 - Stopped
2019-02-22 00:33:14 INFO  MapOutputTrackerMasterEndpoint:54 - MapOutputTrackerMasterEndpoint stopped!
2019-02-22 00:33:14 INFO  MemoryStore:54 - MemoryStore cleared
2019-02-22 00:33:14 INFO  BlockManager:54 - BlockManager stopped
2019-02-22 00:33:14 INFO  BlockManagerMaster:54 - BlockManagerMaster stopped
2019-02-22 00:33:14 INFO  OutputCommitCoordinator$OutputCommitCoordinatorEndpoint:54 - OutputCommitCoordinator stopped!
2019-02-22 00:33:14 INFO  SparkContext:54 - Successfully stopped SparkContext
2019-02-22 00:33:14 INFO  ShutdownHookManager:54 - Shutdown hook called
2019-02-22 00:33:14 INFO  ShutdownHookManager:54 - Deleting directory /tmp/spark-ea671cae-988b-4f5b-a85f-184ed2dba58d
2019-02-22 00:33:15 INFO  ShutdownHookManager:54 - Deleting directory /tmp/spark-bc395b60-843f-4e24-841c-1fb09330b89f

Master URLs

The master URL passed to Spark can be in one of the following formats:

  • local: Run Spark locally with one worker thread (i.e. no parallelism at all).
  • local[K]: Run Spark locally with K worker threads (ideally, set this to the number of cores on your machine).
  • local[K,F]: Run Spark locally with K worker threads and F maxFailures (see spark.task.maxFailures for an explanation of this variable).
  • local[*]: Run Spark locally with as many worker threads as logical cores on your machine.
  • local[*,F]: Run Spark locally with as many worker threads as logical cores on your machine, and F maxFailures.
  • spark://HOST:PORT: Connect to the given Spark standalone cluster master. The port must be whichever one your master is configured to use (7077 by default).
  • spark://HOST1:PORT1,HOST2:PORT2: Connect to the given Spark standalone cluster with standby masters with Zookeeper. The list must contain all the master hosts in the high-availability cluster set up with Zookeeper. The port must be whichever each master is configured to use (7077 by default).
  • mesos://HOST:PORT: Connect to the given Mesos cluster. The port must be whichever one you have configured to use (5050 by default). Or, for a Mesos cluster using ZooKeeper, use mesos://zk://.... To submit with --deploy-mode cluster, the HOST:PORT should be configured to connect to the MesosClusterDispatcher.
  • yarn: Connect to a YARN cluster in client or cluster mode depending on the value of --deploy-mode. The cluster location will be found based on the HADOOP_CONF_DIR or YARN_CONF_DIR variable.
  • k8s://HOST:PORT: Connect to a Kubernetes cluster in cluster mode. Client mode is currently unsupported and will be supported in a future release. The HOST and PORT refer to the [Kubernetes API server](https://kubernetes.io/docs/reference/generated/kube-apiserver/). It connects using TLS by default. To force it to use an unsecured connection, you can use k8s://http://HOST:PORT.
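Putting the formats above together, the commands below sketch how the SparkPi example shipped with Spark 2.4.0 would be submitted under several different master URLs. The host names are placeholders, and each command is echoed rather than executed, so the shapes can be compared without a live cluster:

```shell
# Common part of every invocation: the example class and jar from the Spark
# 2.4.0 distribution (host names below are placeholders, not real machines).
SUBMIT="./bin/spark-submit --class org.apache.spark.examples.SparkPi"
JAR="examples/jars/spark-examples_2.11-2.4.0.jar"

echo "$SUBMIT --master local[4] $JAR"                    # 4 local worker threads
echo "$SUBMIT --master local[4,2] $JAR"                  # 4 threads, maxFailures=2
echo "$SUBMIT --master spark://master:7077 $JAR"         # standalone cluster master
echo "$SUBMIT --master yarn --deploy-mode cluster $JAR"  # YARN, cluster deploy mode
echo "$SUBMIT --master mesos://zk://zk1:2181/mesos $JAR" # Mesos via ZooKeeper
```

Note that only the --master (and, for YARN, --deploy-mode) arguments change; the class, jar, and application arguments stay the same across cluster managers.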

Loading Configuration from a File

The spark-submit script can load default Spark configuration values from a properties file and pass them on to your application. By default, it will read options from conf/spark-defaults.conf in the Spark directory. For more detail, see the section on loading default configurations.
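As a sketch, a minimal conf/spark-defaults.conf might look like the following (the values are illustrative, not recommendations):

```
spark.master                     spark://master:7077
spark.submit.deployMode          client
spark.executor.memory            2g
spark.serializer                 org.apache.spark.serializer.KryoSerializer
```

With spark.master set in this file, the --master flag can be omitted from the spark-submit command line entirely.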

Loading default Spark configurations this way can obviate the need for certain flags to spark-submit. For instance, if the spark.master property is set, you can safely omit the --master flag from spark-submit. In general, configuration values explicitly set on a SparkConf take the highest precedence, then flags passed to spark-submit, then values in the defaults file.

If you are ever unclear where configuration options are coming from, you can print out fine-grained debugging information by running spark-submit with the --verbose option.

Advanced Dependency Management

When using spark-submit, the application jar along with any jars included with the --jars option will be automatically transferred to the cluster. URLs supplied after --jars must be separated by commas. That list is included in the driver and executor classpaths. Directory expansion does not work with --jars.

Spark uses the following URL schemes to allow different strategies for disseminating jars:

file: - Absolute paths and file:/ URIs are served by the driver's HTTP file server, and every executor pulls the file from the driver HTTP server.

hdfs:, http:, https:, ftp: - These pull down files and JARs from the URI as expected.

local: - A URI starting with local:/ is expected to exist as a local file on each worker node. This means that no network IO will be incurred, and works well for large files/JARs that are pushed to each worker, or shared via NFS, GlusterFS, etc.
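A single --jars list can mix the schemes above. The snippet below is a sketch (all paths, host names, and class names are placeholders); the command is printed rather than run so it can be inspected without a cluster:

```shell
# Placeholder paths illustrating the three families of schemes:
#   file://  - served out of the driver's HTTP file server
#   hdfs://  - pulled from the URI directly by each executor
#   local:/  - expected to already exist on every worker node
JARS="file:///opt/libs/util.jar,hdfs://namenode:8020/libs/common.jar,local:/opt/shared/native-deps.jar"

echo ./bin/spark-submit \
  --class com.example.Main \
  --master spark://master:7077 \
  --jars "$JARS" \
  app.jar
```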

Note that JARs and files are copied to the working directory for each SparkContext on the executor nodes. This can use up a significant amount of space over time and will need to be cleaned up. With YARN, cleanup is handled automatically; with Spark standalone, automatic cleanup can be configured with the spark.worker.cleanup.appDataTtl property.

Users may also include any other dependencies by supplying a comma-delimited list of Maven coordinates with --packages. All transitive dependencies will be handled when using this command. Additional repositories (or resolvers in SBT) can be added in a comma-delimited fashion with the flag --repositories. (Note that credentials for password-protected repositories can be supplied in some cases in the repository URI, such as in https://user:password@host/.... Be careful when supplying credentials this way.) These commands can be used with pyspark, spark-shell, and spark-submit to include Spark Packages.
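For example, the Kafka connector published for Spark 2.4.0 can be pulled in by its Maven coordinates; the repository URL and application names below are placeholders, and the command is printed rather than run:

```shell
# Maven coordinates in group:artifact:version form.
PACKAGES="org.apache.spark:spark-sql-kafka-0-10_2.11:2.4.0"

# --repositories is only needed for artifacts outside Maven Central
# (repo.example.com is a placeholder).
echo ./bin/spark-submit \
  --packages "$PACKAGES" \
  --repositories https://repo.example.com/maven \
  --class com.example.Main \
  app.jar
```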

For Python, the equivalent --py-files option can be used to distribute .egg, .zip and .py libraries to executors.

More Information

Once you have deployed your application, the cluster mode overview describes the components involved in distributed execution, and how to monitor and debug applications.


Reposted from www.cnblogs.com/hzhiping/p/10412923.html