(D) Deploying Spark: building a big data cluster with Docker

Contents

  • Spark deployment

Prerequisites

  • ZooKeeper is running normally
  • The JAVA_HOME environment variable is set
  • The HADOOP_HOME environment variable is set

Installation package

Download from Weiyun | in the tar package directory

  • Spark 2.4.4

First, prepare the environment

Copy the archive into the Docker container

docker cp spark-2.4.4-bin-hadoop2.7.tar.gz cluster-master:/root/tar

Extract the archive (inside the cluster-master container, from /root/tar)

mkdir -p /opt/spark
tar -xzvf spark-2.4.4-bin-hadoop2.7.tar.gz -C /opt/spark

Second, the configuration files

spark-env.sh

SPARK_LOCAL_DIRS=/opt/spark/spark-2.4.4-bin-hadoop2.7
HADOOP_CONF_DIR=/opt/hadoop/hadoop-2.7.7/etc/hadoop
YARN_CONF_DIR=/opt/hadoop/hadoop-2.7.7/etc/hadoop
JAVA_HOME=/opt/jdk/jdk1.8.0_221
export SPARK_MASTER_HOST=cluster-master
export SPARK_DAEMON_JAVA_OPTS="
-Dspark.deploy.recoveryMode=ZOOKEEPER
-Dspark.deploy.zookeeper.url=172.15.0.2:2181
-Dspark.deploy.zookeeper.dir=/sparkmaster"

slaves

cluster-slave1
cluster-slave2
cluster-slave3
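
The post does not show how the Spark directory reaches the workers. The following is a minimal sketch, assuming passwordless SSH as root from cluster-master and the same install path on every node (/opt/spark, inferred from the spark-env.sh above). The function name print_sync_commands is hypothetical, and it only prints the commands so they can be reviewed before being run:

```shell
#!/bin/sh
# Hypothetical helper: print one copy command per worker listed in a slaves file.
# Assumption: the Spark install lives at the same path on every node.
print_sync_commands() {
  slaves_file=$1
  spark_dir=$2
  parent_dir=$(dirname "$spark_dir")
  # Skip comment lines and blank lines, keep one hostname per line.
  grep -v -e '^[[:space:]]*#' -e '^[[:space:]]*$' "$slaves_file" |
  while read -r host; do
    # Create the parent directory first so scp -r keeps the directory name.
    printf 'ssh root@%s mkdir -p %s && scp -r %s root@%s:%s\n' \
      "$host" "$parent_dir" "$spark_dir" "$host" "$parent_dir"
  done
}
```

Calling print_sync_commands conf/slaves /opt/spark/spark-2.4.4-bin-hadoop2.7 from the Spark directory would list one line per worker; piping that output to sh would execute it.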

spark-defaults.conf

spark.eventLog.enabled          true
spark.eventLog.dir              hdfs://jinbill/spark/eventLog
spark.history.fs.logDirectory   hdfs://jinbill/spark/eventLog
spark.eventLog.compress         true
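
Spark does not create the event log directory on its own, so with spark.eventLog.enabled set to true the directory has to exist in HDFS before the first application runs. A small sketch, assuming the whitespace-separated key/value layout shown above; the function names event_log_dir and print_mkdir_command are hypothetical, and the command is printed rather than executed:

```shell
#!/bin/sh
# Hypothetical helper: read spark.eventLog.dir from a spark-defaults.conf file.
# Assumption: keys and values are separated by whitespace, as in the file above.
event_log_dir() {
  awk '$1 == "spark.eventLog.dir" { print $2 }' "$1"
}

# Print (rather than run) the command that creates the directory in HDFS.
print_mkdir_command() {
  printf 'hdfs dfs -mkdir -p %s\n' "$(event_log_dir "$1")"
}
```

With the configuration above, print_mkdir_command conf/spark-defaults.conf prints an hdfs dfs -mkdir -p command for hdfs://jinbill/spark/eventLog, which can then be run on any node with a working Hadoop client.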

Third, start the cluster

start-all.sh
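
Hadoop also ships a script named start-all.sh, so calling Spark's copy by full path avoids running the wrong one when both sbin directories are on PATH. Spark's start-all.sh starts the master and the workers listed in conf/slaves; the history server has its own script. A sketch, assuming the install path from the spark-env.sh above; the run wrapper and the DRY_RUN variable are hypothetical:

```shell
#!/bin/sh
# Assumed install path, inferred from the spark-env.sh shown earlier.
SPARK_HOME=${SPARK_HOME:-/opt/spark/spark-2.4.4-bin-hadoop2.7}

# Hypothetical wrapper: with DRY_RUN=1 the command is only printed.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

start_spark() {
  # Master plus the workers listed in conf/slaves.
  run "$SPARK_HOME/sbin/start-all.sh"
  # History server, which reads spark.history.fs.logDirectory.
  run "$SPARK_HOME/sbin/start-history-server.sh"
}
```

Running DRY_RUN=1 start_spark shows the two commands without executing them. Since ZooKeeper recovery is configured, a standby master could additionally be started on a second node with sbin/start-master.sh.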

Fourth, the web UI

The containers are on a different network segment from the client machine, so a static route has to be added before the web UIs can be reached.

  1. Open cmd as administrator
  2. Run: route add 172.15.0.0 mask 255.255.0.0 192.168.11.38 -p

Spark Master access address
Spark Slave1 access address
Spark Slave2 access address
Spark Slave3 access address
Spark history server access address
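
The link targets of the addresses above did not survive. As a rough guide, assuming Spark's default ports were not overridden (8080 for the master UI, 8081 for each worker UI, 18080 for the history server) and that the history server runs on cluster-master, the URLs would look like the output of this sketch. The function name print_ui_urls is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: print web UI URLs, assuming Spark's default ports.
# First argument is the master host, the remaining arguments are worker hosts.
print_ui_urls() {
  master=$1
  shift
  printf 'Spark Master:  http://%s:8080\n' "$master"
  for worker in "$@"; do
    printf 'Spark Worker:  http://%s:8081\n' "$worker"
  done
  # Assumption: the history server was started on the master node.
  printf 'Spark History: http://%s:18080\n' "$master"
}

print_ui_urls cluster-master cluster-slave1 cluster-slave2 cluster-slave3
```

From the Windows machine the container hostnames may not resolve, in which case the container IP addresses in 172.15.0.0/16 would be used instead. Spark also moves to the next free port when a default port is already taken, so the startup logs are the authoritative source.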

Origin www.cnblogs.com/njpkhuan/p/11611951.html