To make it easier for users to submit and manage Spark tasks, we researched the options and settled on the more reliable Livy. The figure below roughly compares several different submission platforms.
This post builds a standalone Spark + Livy + Hadoop environment on a Mac and shows how to submit a task and run it. The goal is only to get the framework up and running; tuning the configuration parameters of each component for better performance is left for you to research on your own.
One. Spark Installation
Go to http://spark.apache.org/downloads.html, download the installation package, and then follow the steps below.
1. After the download completes, extract the archive by running the following command in the download directory:
tar zxvf spark-2.1.0-hadoop2.7.tgz
2. Configure Spark environment variables
On a Mac, environment variables are usually configured in /etc/profile. Open the profile file and add the following:
#SPARK VARIABLES START
export SPARK_HOME=/usr/local/spark-2.1.0-hadoop2.7
export PATH=${PATH}:${SPARK_HOME}/bin
#SPARK VARIABLES END
3. Configure the Java environment
This is also configured in /etc/profile; it assumes the JDK and Scala are already installed. Add the Java installation directory:
export JAVA_HOME=/Library/Java/JavaVirtualMachines/jdk1.8.0_111.jdk/Contents/Home
When the setup is complete, save and exit, then run source /etc/profile so that the environment variables take effect.
4. Run sbin/start-all.sh to start Spark
5. Test
Open a terminal and type pyspark; if the following screen appears, the installation was successful.
Two. Livy Installation
1. Download the installation package from https://www.apache.org/dyn/closer.lua/incubator/livy/0.6.0-incubating/apache-livy-0.6.0-incubating-bin.zip.
2. Extract the installation package and edit livy.conf, adding the parameters highlighted in the red box.
Then edit livy-env.sh, adding the Spark installation directory.
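For reference, a minimal sketch of these two files for a local setup is shown below. The host, port, master, and paths are assumptions taken from Livy's defaults and this article's install locations; adjust them to your environment.

```
# conf/livy.conf
livy.server.host = 0.0.0.0
livy.server.port = 8998
livy.spark.master = local[*]

# conf/livy-env.sh
export SPARK_HOME=/usr/local/spark-2.1.0-hadoop2.7
```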
3. Finally, run bin/livy-server start to start Livy
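Once started, Livy exposes a REST API (port 8998 by default). As a quick sanity check, the sketch below builds the POST /sessions request that asks Livy for a new interactive PySpark session; the host and port are assumptions, and the actual request is only sent when the script is run against a live server.

```python
import json
import urllib.request

LIVY_URL = "http://localhost:8998"  # assumed default Livy host/port


def build_session_request(kind="pyspark"):
    """Build (but do not send) the POST /sessions request that asks
    Livy to start a new interactive session of the given kind."""
    payload = json.dumps({"kind": kind}).encode("utf-8")
    return urllib.request.Request(
        url=f"{LIVY_URL}/sessions",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    req = build_session_request()
    # Requires a running Livy server:
    with urllib.request.urlopen(req) as resp:
        print(json.load(resp))
```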
Three. Hadoop Installation
1. Download the package from https://mirrors.tuna.tsinghua.edu.cn/apache/hadoop/common/ and extract it
2. Modify the configuration files. Run vim core-site.xml and edit it, then modify hdfs-site.xml as well.
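For a pseudo-distributed single-node setup, a minimal version of these two edits typically looks like the following; the NameNode address and the replication factor of 1 are assumptions for a local machine, not values from the article.

```xml
<!-- core-site.xml -->
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>

<!-- hdfs-site.xml -->
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
```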
3. Set environment variables
export HADOOP_HOME=/User/deploy/software/hadoop/hadoop-2.8.5
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
export CLASSPATH=$JAVA_HOME/lib:$JRE_HOME/lib:$CLASSPATH
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib:$HADOOP_COMMON_LIB_NATIVE_DIR"
4. Format the NameNode with hdfs namenode -format
5. Run sbin/start-all.sh to start Hadoop; if the following screen appears, the installation succeeded
YARN is installed as part of Hadoop; access YARN to confirm that it was installed successfully.
6. If the DataNode does not start, copy the clusterID from the NameNode's current/VERSION file into the DataNode's VERSION file.
Four. Submitting Tasks from Code
With the steps above, the base environment is ready; the next step is to develop an interface that submits tasks. Part of the code is shown in the screenshot below:
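As a sketch of what such an interface can look like, the function below submits an application through Livy's POST /batches endpoint, which runs a job much like spark-submit does. The Livy address, the HDFS path of the application file, and the helper names are assumptions for illustration, not the code from the screenshot.

```python
import json
import urllib.request

LIVY_URL = "http://localhost:8998"  # assumed default Livy address


def build_batch_payload(file, class_name=None, args=None):
    """Build the JSON body for Livy's POST /batches endpoint."""
    payload = {"file": file}
    if class_name:
        payload["className"] = class_name
    if args:
        payload["args"] = list(args)
    return payload


def submit_batch(payload):
    """POST the payload to a running Livy server and return its reply."""
    req = urllib.request.Request(
        url=f"{LIVY_URL}/batches",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Hypothetical application path on HDFS:
    body = build_batch_payload("hdfs:///jobs/wordcount.py", args=["hdfs:///input"])
    print(submit_batch(body))  # requires a running Livy server
```

Polling GET /batches/{id} afterwards reports the job's state, so the same interface can also track and manage submitted tasks.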
Five. Follow the Official Account for the Source Code