Real-time news analysis system: Spark 2.x cluster operation modes

1. Introduction to the operating modes

Spark supports several operating modes:

1) Local

2) Standalone

3) YARN

4) Mesos

For local development, download and install IDEA; free installation guides are easy to find via Baidu.

2. Spark Standalone mode configuration and testing

1) JDK 1.8 already installed

2) Scala 2.11.8 already installed

3) Hadoop 2.5.0 already installed

4) Install Spark Standalone

a) Configure the slaves file

vi slaves

bigdata-pro01.kfk.com

bigdata-pro02.kfk.com

bigdata-pro03.kfk.com

b) Configure spark-env.sh

vi spark-env.sh

export JAVA_HOME=/opt/modules/jdk1.8.0_11

export SCALA_HOME=/opt/modules/scala-2.11.8

 

SPARK_CONF_DIR=/opt/modules/spark-2.2.0-bin/conf

SPARK_MASTER_HOST=bigdata-pro02.kfk.com

SPARK_MASTER_PORT=7077

SPARK_MASTER_WEBUI_PORT=8080

SPARK_WORKER_CORES=1

SPARK_WORKER_MEMORY=1g

SPARK_WORKER_PORT=7078

SPARK_WORKER_WEBUI_PORT=8081

c) Distribute the configured Spark directory to the other nodes and adjust any node-specific settings

scp -r spark-2.2.0-bin bigdata-pro01.kfk.com:/opt/modules/

scp -r spark-2.2.0-bin bigdata-pro03.kfk.com:/opt/modules/

d) Start Spark

sbin/start-all.sh

e) Client test

bin/spark-shell --master spark://bigdata-pro02.kfk.com:7077
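Once the shell connects to the master, a quick sanity check confirms that jobs actually run on the workers. This is a minimal sketch typed at the REPL prompt (the data is arbitrary; `sc` is the SparkContext the shell provides automatically):

```scala
// Distribute the numbers 1..100 across the workers and sum them.
// A correct result shows the executors are up and reachable.
val total = sc.parallelize(1 to 100).reduce(_ + _)
// total: Int = 5050
```

If this hangs or fails, check the master web UI (port 8080 as configured above) to verify that all three workers have registered.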

f) Cluster run

bin/spark-submit --master spark://bigdata-pro02.kfk.com:7077 --deploy-mode cluster /opt/jars/sparkStu.jar hdfs://bigdata-pro01.kfk.com:9000/user/data/stu.txt hdfs://bigdata-pro01.kfk.com:9000/user/data/output
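The submitted jar takes an HDFS input path and an output path as its two arguments. As a hedged illustration only (the real contents of sparkStu.jar are not shown in this post), its main object might be a word-count-style job like this sketch; the object name and logic are assumptions:

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical sketch of the job packaged as sparkStu.jar.
// args(0) = HDFS input path, args(1) = HDFS output directory.
object SparkStu {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder.appName("SparkStu").getOrCreate()
    val sc = spark.sparkContext

    sc.textFile(args(0))                 // read lines from HDFS
      .flatMap(_.split("\\s+"))          // split each line into words
      .map(word => (word, 1))            // pair each word with a count of 1
      .reduceByKey(_ + _)                // sum the counts per word
      .saveAsTextFile(args(1))           // write results back to HDFS

    spark.stop()
  }
}
```

With --deploy-mode cluster, the driver itself runs on a worker node, so the output directory must be on HDFS (as in the command above), not on the client's local disk.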

3. Spark on YARN mode configuration and testing

1) Make sure the JDK version set in the Hadoop configuration files matches the currently installed JDK
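Concretely, the JAVA_HOME declared in hadoop-env.sh should point at the same JDK as spark-env.sh does. A config fragment, reusing the path from the spark-env.sh shown earlier (adjust to your own install location):

```shell
# In ${HADOOP_HOME}/etc/hadoop/hadoop-env.sh — keep JAVA_HOME in sync
# with the JDK that Spark uses (same value as in spark-env.sh above):
export JAVA_HOME=/opt/modules/jdk1.8.0_11
```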

2) Submitting jobs in Spark on YARN mode

bin/spark-submit --class com.spark.test.Test --master yarn --deploy-mode cluster /opt/jars/sparkStu.jar hdfs://bigdata

Origin www.cnblogs.com/misliu/p/11482388.html