Introduction: deployment modes. Applications written with the Spark framework can run in local mode (Local Mode), in cluster mode (Cluster Mode), or on a cloud service (Cloud), which is convenient for development, testing, and production deployment.
- 1. Local mode: essentially a single-machine, multi-threaded way of running.
Local Mode starts one JVM process and runs Task tasks inside it; the number of Tasks running in parallel depends on the number of CPU cores assigned. In a Spark application, each Task needs 1 CPU core to run.
1) Method 1: local
means allocate 1 CPU core to run with
2) Method 2: local[K]
means allocate K CPU cores and run up to K Tasks at the same time (e.g. local[2] for 2 cores)
3) Method 3: local[*]
means use all the CPU cores of the physical machine
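With local[*], Spark uses the machine's CPU core count as the degree of parallelism. As a quick sketch (assuming a Linux host where the `nproc` utility is available), you can check what that number would be before choosing a master setting:

```shell
# Count the machine's CPU cores -- the parallelism local[*] would use
CORES=$(nproc)
echo "local[*] would run up to $CORES tasks in parallel"
```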
Installation of Spark local mode:
1. Upload the installation package and unzip the installation package:
# Unpack the package
tar -zxvf spark-2.4.5-bin-hadoop2.7.tgz
# Create a symlink to make later upgrades easier
ln -s /export/server/spark-2.4.5-bin-hadoop2.7 /export/server/spark
# If you hit permission problems, changing ownership to root is convenient while learning;
# in production, use the user and permissions assigned by your ops team
chown -R root /export/server/spark-2.4.5-bin-hadoop2.7
chgrp -R root /export/server/spark-2.4.5-bin-hadoop2.7
# chgrp changes the group ownership of the files to root
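The `ln -s` step above simply puts a version-independent name in front of the versioned install directory, so later upgrades only need the link repointed. A minimal sketch of the same pattern in a scratch directory (the /tmp paths here are made up for illustration, not part of the install):

```shell
# Simulate the install layout in a scratch directory
mkdir -p /tmp/spark-demo/spark-2.4.5-bin-hadoop2.7
# Point a stable name at the versioned directory, as in the install step
# (-s symbolic, -f replace an existing link, -n don't follow an existing link)
ln -sfn /tmp/spark-demo/spark-2.4.5-bin-hadoop2.7 /tmp/spark-demo/spark
# The stable name now resolves to the versioned directory
readlink /tmp/spark-demo/spark
```

Upgrading would then mean unpacking a new version next to the old one and repointing the single symlink, leaving scripts that reference the stable path untouched.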
Directory introduction:
2. Run Spark shell
Works out of the box.
Start spark-shell directly from the bin directory:
# Enter the Spark installation directory
cd /export/server/spark
## Run spark-shell directly; it defaults to local[*]
bin/spark-shell
## or
bin/spark-shell --master local[2]
3. spark-shell notes
1. Running ./spark-shell directly
starts it in local mode and launches a SparkSubmit process on the local machine.
2. You can also pass the --master parameter, e.g.:
spark-shell --master local[N] simulates N threads locally to run the current job
spark-shell --master local[*] uses all available resources on the current machine
3. With no parameters, the default is
spark-shell --master local[*]
4. Later, --master can also point at a cluster address to submit the job to a cluster, e.g.
./spark-shell --master spark://node01:7077,node02:7077
5. To exit spark-shell,
use :quit
- Monitoring page: http://node1:4040/jobs/