Spark 2.3.1 Installation Process

1. Install Scala

Download Scala and extract it to:

/usr/local/scala
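
For example, downloading and unpacking Scala 2.12.7 (the download URL is shown for illustration; adjust it to wherever you fetch the tarball from):

wget https://downloads.lightbend.com/scala/2.12.7/scala-2.12.7.tgz
mkdir -p /usr/local/scala
tar -zxvf scala-2.12.7.tgz -C /usr/local/scala/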

Add the installation path to the /etc/profile file:

vim /etc/profile

Add the following:

export SCALA_HOME=/usr/local/scala/scala-2.12.7
export PATH=$PATH:$SCALA_HOME/bin

Reload the file so the changes take effect:

source /etc/profile

Installation done; verify that it succeeded:

scala -version
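
If everything is in place, the output should look roughly like this (the exact copyright line depends on the build):

Scala code runner version 2.12.7 -- Copyright 2002-2018, LAMP/EPFL and Lightbend, Inc.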

Scala must be installed and configured on every node; you can distribute it to the other nodes together with Spark once Spark is installed.

2. Install Spark

Download and extract Spark to:

/usr/local/spark
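
For example, fetching the prebuilt Hadoop 2.7 package from the Apache archive (URL shown for illustration):

wget https://archive.apache.org/dist/spark/spark-2.3.1/spark-2.3.1-bin-hadoop2.7.tgz
mkdir -p /usr/local/spark
tar -zxvf spark-2.3.1-bin-hadoop2.7.tgz -C /usr/local/spark/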

Edit the /etc/profile file and add:

export SPARK_HOME=/usr/local/spark/spark-2.3.1-bin-hadoop2.7
export PATH=$PATH:$SPARK_HOME/bin
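
Then reload the profile so the new variables take effect:

source /etc/profile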

Configure the files under the conf directory.
Enter the directory /usr/local/spark/spark-2.3.1-bin-hadoop2.7/conf:

cd /usr/local/spark/spark-2.3.1-bin-hadoop2.7/conf

Create the spark-env.sh file from the template:

cp spark-env.sh.template spark-env.sh

Edit the spark-env.sh file:

vim spark-env.sh

Add the following:

export JAVA_HOME=/usr/local/jdk-11                            # JDK path (JDK 11 here later caused the startup failure noted below)
export HADOOP_HOME=/usr/local/hadoop/hadoop-2.8.5             # Hadoop installation
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop                # Hadoop configuration directory
export SCALA_HOME=/usr/local/scala/scala-2.12.7               # Scala installation
export SPARK_HOME=/usr/local/spark/spark-2.3.1-bin-hadoop2.7  # Spark installation
export SPARK_MASTER_IP=master                                 # hostname of the master node
export SPARK_EXECUTOR_MEMORY=1G                               # memory allocated per executor

Create the slaves file from the template:

cp slaves.template slaves

Edit the slaves file: delete its contents and replace them with the worker hostnames, one per line:

slaver1
slaver2
slaver3
slaver4
slaver5

Configuration is complete. Distribute Spark to the other nodes and set up the /etc/profile file on each of them:

scp -r /usr/local/spark slaver1:/usr/local/
scp -r /usr/local/spark slaver2:/usr/local/
scp -r /usr/local/spark slaver3:/usr/local/
scp -r /usr/local/spark slaver4:/usr/local/
scp -r /usr/local/spark slaver5:/usr/local/
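
Scala and the profile changes have to reach every node too; a compact sketch, assuming the same directory layout on all nodes:

for i in 1 2 3 4 5; do
  scp -r /usr/local/scala slaver$i:/usr/local/
  scp /etc/profile slaver$i:/etc/profile
done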

Note: a small problem I ran into. Spark failed to start, with the following error:

[root@slaver1 logs]# cat spark-root-org.apache.spark.deploy.worker.Worker-1-slaver1.out
Spark Command: /usr/local/jdk-11/bin/java -cp /usr/local/spark/spark-2.3.1-bin-hadoop2.7/conf/:/usr/local/spark/spark-2.3.1-bin-hadoop2.7/jars/*:/usr/local/hadoop/hadoop-2.8.5/etc/hadoop/ -Xmx1g org.apache.spark.deploy.worker.Worker --webui-port 8081 spark://master:7077
========================================
2018-10-08 19:22:49 INFO  Worker:2611 - Started daemon with process name: 8029@slaver1
2018-10-08 19:22:49 INFO  SignalUtils:54 - Registered signal handler for TERM
2018-10-08 19:22:49 INFO  SignalUtils:54 - Registered signal handler for HUP
2018-10-08 19:22:49 INFO  SignalUtils:54 - Registered signal handler for INT
2018-10-08 19:22:50 ERROR SparkUncaughtExceptionHandler:91 - Uncaught exception in thread Thread[main,5,main]
java.lang.ExceptionInInitializerError
	at org.apache.hadoop.util.StringUtils.<clinit>(StringUtils.java:80)
	at org.apache.hadoop.security.SecurityUtil.getAuthenticationMethod(SecurityUtil.java:611)
	at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:273)
	at org.apache.hadoop.security.UserGroupInformation.ensureInitialized(UserGroupInformation.java:261)
	at org.apache.hadoop.security.UserGroupInformation.loginUserFromSubject(UserGroupInformation.java:791)
	at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:761)
	at org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:634)
	at org.apache.spark.util.Utils$$anonfun$getCurrentUserName$1.apply(Utils.scala:2467)
	at org.apache.spark.util.Utils$$anonfun$getCurrentUserName$1.apply(Utils.scala:2467)
	at scala.Option.getOrElse(Option.scala:121)
	at org.apache.spark.util.Utils$.getCurrentUserName(Utils.scala:2467)
	at org.apache.spark.SecurityManager.<init>(SecurityManager.scala:220)
	at org.apache.spark.deploy.worker.Worker$.startRpcEnvAndEndpoint(Worker.scala:784)
	at org.apache.spark.deploy.worker.Worker$.main(Worker.scala:755)
	at org.apache.spark.deploy.worker.Worker.main(Worker.scala)
Caused by: java.lang.StringIndexOutOfBoundsException: begin 0, end 3, length 2
	at java.base/java.lang.String.checkBoundsBeginEnd(String.java:3319)
	at java.base/java.lang.String.substring(String.java:1874)
	at org.apache.hadoop.util.Shell.<clinit>(Shell.java:52)
	... 15 more
2018-10-08 19:22:50 INFO  ShutdownHookManager:54 - Shutdown hook called

Fix: reinstall the JDK. The root cause is visible in the stack trace: org.apache.hadoop.util.Shell in Hadoop 2.x parses the java.version system property assuming a "1.x"-style value, but under JDK 11 the property is just "11" (length 2), so substring(0, 3) throws StringIndexOutOfBoundsException. Spark 2.3.1 on Hadoop 2.8.5 needs Java 8, so install JDK 8 and point JAVA_HOME at it.
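
A minimal sketch of the change, assuming JDK 8 is unpacked at /usr/local/jdk1.8.0_181 (the exact directory name depends on your JDK build); update this in both /etc/profile and spark-env.sh:

export JAVA_HOME=/usr/local/jdk1.8.0_181
java -version    # should now report a 1.8.x version, e.g. java version "1.8.0_181"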

Installation complete.
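
To start the cluster, run the standalone launch script on the master, then check the daemons with jps (the JDK's process lister):

cd /usr/local/spark/spark-2.3.1-bin-hadoop2.7
./sbin/start-all.sh
jps    # the master should show a Master process; each slaver should show a Worker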

3. Testing

  • Access the web UI
    If you are worried that port 8080 might already be in use, you can change the UI port:
cd /usr/local/spark/spark-2.3.1-bin-hadoop2.7/sbin
vim start-master.sh

Find where start-master.sh sets the default value of SPARK_MASTER_WEBUI_PORT (8080) and change it to 8888.
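
The relevant lines in the Spark 2.3.1 script look roughly like this after the edit:

if [ "$SPARK_MASTER_WEBUI_PORT" = "" ]; then
  SPARK_MASTER_WEBUI_PORT=8888    # default was 8080
fi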
Then visit the master machine in a browser. In my Spark cluster the master's IP address is 192.168.144.130, so with port 8888 the URL is: http://192.168.144.130:8888/

  • Run the SparkPi example program shipped with Spark, which computes an approximation of π
cd /usr/local/spark/spark-2.3.1-bin-hadoop2.7
./bin/spark-submit --class org.apache.spark.examples.SparkPi --master local examples/jars/spark-examples_2.11-2.3.1.jar

The result:

[root@slaver1 spark-2.3.1-bin-hadoop2.7]# ./bin/spark-submit  --class  org.apache.spark.examples.SparkPi  --master local   examples/jars/spark-examples_2.11-2.3.1.jar 
2018-10-08 19:52:00 WARN  NativeCodeLoader:62 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2018-10-08 19:52:01 INFO  SparkContext:54 - Running Spark version 2.3.1
2018-10-08 19:52:01 INFO  SparkContext:54 - Submitted application: Spark Pi
2018-10-08 19:52:01 INFO  SecurityManager:54 - Changing view acls to: root
2018-10-08 19:52:01 INFO  SecurityManager:54 - Changing modify acls to: root
2018-10-08 19:52:01 INFO  SecurityManager:54 - Changing view acls groups to: 
2018-10-08 19:52:01 INFO  SecurityManager:54 - Changing modify acls groups to: 
2018-10-08 19:52:01 INFO  SecurityManager:54 - SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(root); groups with view permissions: Set(); users  with modify permissions: Set(root); groups with modify permissions: Set()
2018-10-08 19:52:02 INFO  Utils:54 - Successfully started service 'sparkDriver' on port 35012.
2018-10-08 19:52:02 INFO  SparkEnv:54 - Registering MapOutputTracker
2018-10-08 19:52:03 INFO  SparkEnv:54 - Registering BlockManagerMaster
2018-10-08 19:52:03 INFO  BlockManagerMasterEndpoint:54 - Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
2018-10-08 19:52:03 INFO  BlockManagerMasterEndpoint:54 - BlockManagerMasterEndpoint up
2018-10-08 19:52:03 INFO  DiskBlockManager:54 - Created local directory at /tmp/blockmgr-6e399656-ed6d-40aa-9c10-0259128b94e4
2018-10-08 19:52:03 INFO  MemoryStore:54 - MemoryStore started with capacity 366.3 MB
2018-10-08 19:52:03 INFO  SparkEnv:54 - Registering OutputCommitCoordinator
2018-10-08 19:52:03 INFO  log:192 - Logging initialized @5541ms
2018-10-08 19:52:03 INFO  Server:346 - jetty-9.3.z-SNAPSHOT
2018-10-08 19:52:04 INFO  Server:414 - Started @5888ms
2018-10-08 19:52:04 INFO  AbstractConnector:278 - Started ServerConnector@63fd4873{HTTP/1.1,[http/1.1]}{0.0.0.0:4040}
2018-10-08 19:52:04 INFO  Utils:54 - Successfully started service 'SparkUI' on port 4040.
2018-10-08 19:52:04 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@5528a42c{/jobs,null,AVAILABLE,@Spark}
2018-10-08 19:52:04 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@41c07648{/jobs/json,null,AVAILABLE,@Spark}
2018-10-08 19:52:04 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@1fe8d51b{/jobs/job,null,AVAILABLE,@Spark}
2018-10-08 19:52:04 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@22680f52{/jobs/job/json,null,AVAILABLE,@Spark}
2018-10-08 19:52:04 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@60d84f61{/stages,null,AVAILABLE,@Spark}
2018-10-08 19:52:04 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@39c11e6c{/stages/json,null,AVAILABLE,@Spark}
2018-10-08 19:52:04 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@324dcd31{/stages/stage,null,AVAILABLE,@Spark}
2018-10-08 19:52:04 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@433ffad1{/stages/stage/json,null,AVAILABLE,@Spark}
2018-10-08 19:52:04 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@1fc793c2{/stages/pool,null,AVAILABLE,@Spark}
2018-10-08 19:52:04 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@2575f671{/stages/pool/json,null,AVAILABLE,@Spark}
2018-10-08 19:52:04 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@329a1243{/storage,null,AVAILABLE,@Spark}
2018-10-08 19:52:04 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@ecf9fb3{/storage/json,null,AVAILABLE,@Spark}
2018-10-08 19:52:04 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@2d35442b{/storage/rdd,null,AVAILABLE,@Spark}
2018-10-08 19:52:04 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@27f9e982{/storage/rdd/json,null,AVAILABLE,@Spark}
2018-10-08 19:52:04 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@4593ff34{/environment,null,AVAILABLE,@Spark}
2018-10-08 19:52:04 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@37d3d232{/environment/json,null,AVAILABLE,@Spark}
2018-10-08 19:52:04 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@30c0ccff{/executors,null,AVAILABLE,@Spark}
2018-10-08 19:52:04 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@581d969c{/executors/json,null,AVAILABLE,@Spark}
2018-10-08 19:52:04 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@22db8f4{/executors/threadDump,null,AVAILABLE,@Spark}
2018-10-08 19:52:04 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@2b46a8c1{/executors/threadDump/json,null,AVAILABLE,@Spark}
2018-10-08 19:52:04 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@1d572e62{/static,null,AVAILABLE,@Spark}
2018-10-08 19:52:04 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@2d8f2f3a{/,null,AVAILABLE,@Spark}
2018-10-08 19:52:04 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@2024293c{/api,null,AVAILABLE,@Spark}
2018-10-08 19:52:04 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@5949eba8{/jobs/job/kill,null,AVAILABLE,@Spark}
2018-10-08 19:52:04 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@6e0ff644{/stages/stage/kill,null,AVAILABLE,@Spark}
2018-10-08 19:52:04 INFO  SparkUI:54 - Bound SparkUI to 0.0.0.0, and started at http://slaver1:4040
2018-10-08 19:52:04 INFO  SparkContext:54 - Added JAR file:/usr/local/spark/spark-2.3.1-bin-hadoop2.7/examples/jars/spark-examples_2.11-2.3.1.jar at spark://slaver1:35012/jars/spark-examples_2.11-2.3.1.jar with timestamp 1538999524828
2018-10-08 19:52:05 INFO  Executor:54 - Starting executor ID driver on host localhost
2018-10-08 19:52:05 INFO  Utils:54 - Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 41601.
2018-10-08 19:52:05 INFO  NettyBlockTransferService:54 - Server created on slaver1:41601
2018-10-08 19:52:05 INFO  BlockManager:54 - Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
2018-10-08 19:52:05 INFO  BlockManagerMaster:54 - Registering BlockManager BlockManagerId(driver, slaver1, 41601, None)
2018-10-08 19:52:05 INFO  BlockManagerMasterEndpoint:54 - Registering block manager slaver1:41601 with 366.3 MB RAM, BlockManagerId(driver, slaver1, 41601, None)
2018-10-08 19:52:05 INFO  BlockManagerMaster:54 - Registered BlockManager BlockManagerId(driver, slaver1, 41601, None)
2018-10-08 19:52:05 INFO  BlockManager:54 - Initialized BlockManager: BlockManagerId(driver, slaver1, 41601, None)
2018-10-08 19:52:06 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@6090f3ca{/metrics/json,null,AVAILABLE,@Spark}
2018-10-08 19:52:07 INFO  SparkContext:54 - Starting job: reduce at SparkPi.scala:38
2018-10-08 19:52:07 INFO  DAGScheduler:54 - Got job 0 (reduce at SparkPi.scala:38) with 2 output partitions
2018-10-08 19:52:07 INFO  DAGScheduler:54 - Final stage: ResultStage 0 (reduce at SparkPi.scala:38)
2018-10-08 19:52:07 INFO  DAGScheduler:54 - Parents of final stage: List()
2018-10-08 19:52:07 INFO  DAGScheduler:54 - Missing parents: List()
2018-10-08 19:52:07 INFO  DAGScheduler:54 - Submitting ResultStage 0 (MapPartitionsRDD[1] at map at SparkPi.scala:34), which has no missing parents
2018-10-08 19:52:09 INFO  MemoryStore:54 - Block broadcast_0 stored as values in memory (estimated size 1832.0 B, free 366.3 MB)
2018-10-08 19:52:09 INFO  MemoryStore:54 - Block broadcast_0_piece0 stored as bytes in memory (estimated size 1181.0 B, free 366.3 MB)
2018-10-08 19:52:09 INFO  BlockManagerInfo:54 - Added broadcast_0_piece0 in memory on slaver1:41601 (size: 1181.0 B, free: 366.3 MB)
2018-10-08 19:52:09 INFO  SparkContext:54 - Created broadcast 0 from broadcast at DAGScheduler.scala:1039
2018-10-08 19:52:09 INFO  DAGScheduler:54 - Submitting 2 missing tasks from ResultStage 0 (MapPartitionsRDD[1] at map at SparkPi.scala:34) (first 15 tasks are for partitions Vector(0, 1))
2018-10-08 19:52:09 INFO  TaskSchedulerImpl:54 - Adding task set 0.0 with 2 tasks
2018-10-08 19:52:09 INFO  TaskSetManager:54 - Starting task 0.0 in stage 0.0 (TID 0, localhost, executor driver, partition 0, PROCESS_LOCAL, 7853 bytes)
2018-10-08 19:52:09 INFO  Executor:54 - Running task 0.0 in stage 0.0 (TID 0)
2018-10-08 19:52:09 INFO  Executor:54 - Fetching spark://slaver1:35012/jars/spark-examples_2.11-2.3.1.jar with timestamp 1538999524828
2018-10-08 19:52:10 INFO  TransportClientFactory:267 - Successfully created connection to slaver1/192.168.144.131:35012 after 141 ms (0 ms spent in bootstraps)
2018-10-08 19:52:10 INFO  Utils:54 - Fetching spark://slaver1:35012/jars/spark-examples_2.11-2.3.1.jar to /tmp/spark-c0f10a32-518d-466e-81e7-6a287f0a86ba/userFiles-5881d268-d800-4d3a-9b90-d892ceb9aaa/fetchFileTemp909421448294962008.tmp
2018-10-08 19:52:10 INFO  Executor:54 - Adding file:/tmp/spark-c0f10a32-518d-466e-81e7-6a287f0a86ba/userFiles-5881d268-d800-4d3a-9b90-d8929ceb9aaa/spark-examples_2.11-2.3.1.jar to class loader
2018-10-08 19:52:10 INFO  Executor:54 - Finished task 0.0 in stage 0.0 (TID 0). 824 bytes result sent to driver
2018-10-08 19:52:10 INFO  TaskSetManager:54 - Starting task 1.0 in stage 0.0 (TID 1, localhost, executor driver, partition 1, PROCESS_LOCAL, 7853 bytes)
2018-10-08 19:52:10 INFO  Executor:54 - Running task 1.0 in stage 0.0 (TID 1)
2018-10-08 19:52:10 INFO  TaskSetManager:54 - Finished task 0.0 in stage 0.0 (TID 0) in 1143 ms on localhost (executor driver) (1/2)
2018-10-08 19:52:10 INFO  Executor:54 - Finished task 1.0 in stage 0.0 (TID 1). 824 bytes result sent to driver
2018-10-08 19:52:10 INFO  TaskSetManager:54 - Finished task 1.0 in stage 0.0 (TID 1) in 84 ms on localhost (executor driver) (2/2)
2018-10-08 19:52:10 INFO  TaskSchedulerImpl:54 - Removed TaskSet 0.0, whose tasks have all completed, from pool 
2018-10-08 19:52:10 INFO  DAGScheduler:54 - ResultStage 0 (reduce at SparkPi.scala:38) finished in 2.866 s
2018-10-08 19:52:10 INFO  DAGScheduler:54 - Job 0 finished: reduce at SparkPi.scala:38, took 3.180690 s
Pi is roughly 3.133915669578348
2018-10-08 19:52:11 INFO  AbstractConnector:318 - Stopped Spark@63fd4873{HTTP/1.1,[http/1.1]}{0.0.0.0:4040}
2018-10-08 19:52:11 INFO  SparkUI:54 - Stopped Spark web UI at http://slaver1:4040
2018-10-08 19:52:11 INFO  MapOutputTrackerMasterEndpoint:54 - MapOutputTrackerMasterEndpoint stopped!
2018-10-08 19:52:11 INFO  MemoryStore:54 - MemoryStore cleared
2018-10-08 19:52:11 INFO  BlockManager:54 - BlockManager stopped
2018-10-08 19:52:11 INFO  BlockManagerMaster:54 - BlockManagerMaster stopped
2018-10-08 19:52:11 INFO  OutputCommitCoordinator$OutputCommitCoordinatorEndpoint:54 - OutputCommitCoordinator stopped!
2018-10-08 19:52:11 INFO  SparkContext:54 - Successfully stopped SparkContext
2018-10-08 19:52:11 INFO  ShutdownHookManager:54 - Shutdown hook called
2018-10-08 19:52:11 INFO  ShutdownHookManager:54 - Deleting directory /tmp/spark-c0f10a32-518d-466e-81e7-6a287f0a86ba
2018-10-08 19:52:11 INFO  ShutdownHookManager:54 - Deleting directory /tmp/spark-24283d1b-7be7-4383-b6a8-8f58f8abed50
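
The run above uses --master local, which exercises only the driver machine. To submit the same example to the standalone cluster instead, point --master at the master URL configured earlier:

cd /usr/local/spark/spark-2.3.1-bin-hadoop2.7
./bin/spark-submit --class org.apache.spark.examples.SparkPi --master spark://master:7077 examples/jars/spark-examples_2.11-2.3.1.jar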


Reposted from blog.csdn.net/Source_00/article/details/82972499