Copyright notice: This is an original article by the author and may not be reproduced without permission. https://blog.csdn.net/u012292754/article/details/83614821
1 Local mode
No cluster setup is needed; Spark runs directly in a single JVM.
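For example, the Spark shell can be started with a local master (a minimal sketch; the path assumes the installation directory used throughout this post):

```shell
# Run Spark in a single JVM, no cluster required.
# local[2] uses two worker threads; local[*] would use all available cores.
/home/hadoop/apps/spark-2.1.3-bin-2.6.0-cdh5.7.0/bin/spark-shell --master local[2]
```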
2 Standalone mode
The architecture is similar to Hadoop/HDFS: a single Master coordinates the cluster and Workers run the executors. Configuration lives in:
/home/hadoop/apps/spark-2.1.3-bin-2.6.0-cdh5.7.0/conf
2.1 spark-env.sh
Create the file from the shipped template first (cp spark-env.sh.template spark-env.sh), then add:

SPARK_MASTER_HOST=node1      # hostname the Master binds to
SPARK_WORKER_CORES=1         # CPU cores each Worker may offer to executors
SPARK_WORKER_MEMORY=1g       # memory each Worker may offer to executors
SPARK_WORKER_INSTANCES=1     # Worker processes per node (deprecated; see the warning in section 3.1)
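Note that start-all.sh reads conf/slaves to decide where to start Workers; with the default entry (localhost) the single Worker runs on node1, which is why the startup log below prints "localhost: starting ... Worker". To run Workers on additional machines, list one hostname per line (a sketch for this single-node setup):

```shell
# conf/slaves — one Worker host per line (default: localhost)
node1
```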
2.2 Starting the cluster
[hadoop@node1 ~]$ /home/hadoop/apps/spark-2.1.3-bin-2.6.0-cdh5.7.0/sbin/start-all.sh
starting org.apache.spark.deploy.master.Master, logging to /home/hadoop/apps/spark-2.1.3-bin-2.6.0-cdh5.7.0/logs/spark-hadoop-org.apache.spark.deploy.master.Master-1-node1.out
localhost: starting org.apache.spark.deploy.worker.Worker, logging to /home/hadoop/apps/spark-2.1.3-bin-2.6.0-cdh5.7.0/logs/spark-hadoop-org.apache.spark.deploy.worker.Worker-1-node1.out
[hadoop@node1 ~]$ jps
1442 Master
1596 Jps
1534 Worker
[hadoop@node1 ~]$
[hadoop@node1 ~]$ cat /home/hadoop/apps/spark-2.1.3-bin-2.6.0-cdh5.7.0/logs/spark-hadoop-org.apache.spark.deploy.master.Master-1-node1.out
Spark Command: /usr/apps/jdk1.8.0_181-amd64/bin/java -cp /home/hadoop/apps/spark-2.1.3-bin-2.6.0-cdh5.7.0/conf/:/home/hadoop/apps/spark-2.1.3-bin-2.6.0-cdh5.7.0/jars/* -Xmx1g org.apache.spark.deploy.master.Master --host node1 --port 7077 --webui-port 8080
========================================
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
18/11/01 09:09:30 INFO Master: Started daemon with process name: 1442@node1
18/11/01 09:09:30 INFO SignalUtils: Registered signal handler for TERM
18/11/01 09:09:30 INFO SignalUtils: Registered signal handler for HUP
18/11/01 09:09:30 INFO SignalUtils: Registered signal handler for INT
18/11/01 09:09:31 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
18/11/01 09:09:31 INFO SecurityManager: Changing view acls to: hadoop
18/11/01 09:09:31 INFO SecurityManager: Changing modify acls to: hadoop
18/11/01 09:09:31 INFO SecurityManager: Changing view acls groups to:
18/11/01 09:09:31 INFO SecurityManager: Changing modify acls groups to:
18/11/01 09:09:31 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(hadoop); groups with view permissions: Set(); users with modify permissions: Set(hadoop); groups with modify permissions: Set()
18/11/01 09:09:31 INFO Utils: Successfully started service 'sparkMaster' on port 7077.
18/11/01 09:09:32 INFO Master: Starting Spark master at spark://node1:7077
18/11/01 09:09:32 INFO Master: Running Spark version 2.1.3
18/11/01 09:09:32 INFO Utils: Successfully started service 'MasterUI' on port 8080.
18/11/01 09:09:32 INFO MasterWebUI: Bound MasterWebUI to 0.0.0.0, and started at http://192.168.30.131:8080
18/11/01 09:09:32 INFO Utils: Successfully started service on port 6066.
18/11/01 09:09:32 INFO StandaloneRestServer: Started REST server for submitting applications on port 6066
18/11/01 09:09:33 INFO Master: I have been elected leader! New state: ALIVE
18/11/01 09:09:36 INFO Master: Registering worker 192.168.30.131:32865 with 1 cores, 1024.0 MB RAM
Open http://node1:8080/ in a browser to view the Master web UI.
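Before moving on to WordCount, the cluster can be smoke-tested by submitting the bundled SparkPi example (a sketch; the exact jar name under examples/jars may differ slightly in this CDH build):

```shell
# Submit the SparkPi example to the standalone master started above
/home/hadoop/apps/spark-2.1.3-bin-2.6.0-cdh5.7.0/bin/spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --master spark://node1:7077 \
  /home/hadoop/apps/spark-2.1.3-bin-2.6.0-cdh5.7.0/examples/jars/spark-examples_2.11-2.1.3.jar \
  100
```

If the cluster is healthy, the driver prints an estimate of Pi and the application appears as completed in the Master web UI.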
3 WordCount
3.1 Starting the Spark shell
[hadoop@node1 spark-2.1.3-bin-2.6.0-cdh5.7.0]$ ./bin/spark-shell --master spark://node1:7077
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
18/11/01 09:36:38 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
18/11/01 09:36:38 WARN SparkConf:
SPARK_WORKER_INSTANCES was detected (set to '1').
This is deprecated in Spark 1.0+.
Please instead use:
- ./spark-submit with --num-executors to specify the number of executors
- Or set SPARK_EXECUTOR_INSTANCES
- spark.executor.instances to configure the number of instances in the spark config.
Spark context Web UI available at http://192.168.30.131:4040
Spark context available as 'sc' (master = spark://node1:7077, app id = app-20181101093639-0000).
Spark session available as 'spark'.
Welcome to
____ __
/ __/__ ___ _____/ /__
_\ \/ _ \/ _ `/ __/ '_/
/___/ .__/\_,_/_/ /_/\_\ version 2.1.3
/_/
Using Scala version 2.11.8 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_181)
Type in expressions to have them evaluated.
Type :help for more information.
scala>
scala> val file = sc.textFile("hdfs://node1:8020/words.txt")
file: org.apache.spark.rdd.RDD[String] = hdfs://node1:8020/words.txt MapPartitionsRDD[5] at textFile at <console>:24
scala> val res = file.flatMap(_.split(" ")).map((_,1)).reduceByKey(_+_)
res: org.apache.spark.rdd.RDD[(String, Int)] = ShuffledRDD[8] at reduceByKey at <console>:26
scala> res.collect
res0: Array[(String, Int)] = Array((tom,3), (hello,2), (world,1), (jack,2), (mary,2))
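A natural follow-up in the same shell session is to sort the counts and persist them back to HDFS (a sketch; sortBy and saveAsTextFile are standard RDD methods, and the output path is hypothetical and must not already exist):

```scala
// Sort by count in descending order, then write the result to HDFS.
// Each output line is a (word, count) tuple, one file per partition.
val sorted = res.sortBy(_._2, ascending = false)
sorted.saveAsTextFile("hdfs://node1:8020/wordcount-output")
```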