Installing Spark on Ubuntu

First install the JDK:
sudo apt-get update
sudo apt-get install default-jre
sudo apt-get install openjdk-7-jdk
Then run java -version to verify the installation.
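If you need the exact JVM path for the JAVA_HOME setting used later, one way to locate it (assuming the OpenJDK package installed above) is:
readlink -f "$(which java)"
# typically prints something like /usr/lib/jvm/java-7-openjdk-i386/jre/bin/java;
# JAVA_HOME is then the directory above jre/bin, e.g. /usr/lib/jvm/java-7-openjdk-i386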
Download Spark:
The latest version is spark-2.0.2-bin-hadoop2.7.tgz.
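To fetch it from the command line, the release should still be available from the Apache archive (URL assumed; an official mirror also works):
wget https://archive.apache.org/dist/spark/spark-2.0.2/spark-2.0.2-bin-hadoop2.7.tgz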
Then extract it: tar -xvf spark-2.0.2-bin-hadoop2.7.tgz
Move it to /opt (writing there usually requires root): sudo mv spark-2.0.2-bin-hadoop2.7/ /opt
Set the environment variable (single quotes keep $PATH from being expanded at write time):
echo 'export PATH=/opt/spark-2.0.2-bin-hadoop2.7/bin:$PATH' >> ~/.bashrc
source ~/.bashrc
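A quick sanity check that the Spark binaries are now on the PATH (the --version flag prints the Spark version banner):
which spark-shell
spark-submit --version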
Modify the Spark configuration files:
cd /opt/spark-2.0.2-bin-hadoop2.7/conf/
cp log4j.properties.template log4j.properties
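Optionally, you can reduce console noise here; the template ships with log4j.rootCategory=INFO, console, and a common tweak is to lower it to WARN:
sed -i 's/rootCategory=INFO/rootCategory=WARN/' log4j.properties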
cp spark-env.sh.template spark-env.sh
Edit spark-env.sh:
sudo vim spark-env.sh
Add the following lines:
export SPARK_HOME=/opt/spark-2.0.2-bin-hadoop2.7/
export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-i386
export SPARK_MASTER_IP=192.168.18.130
export SPARK_WORKER_MEMORY=256m
I tested this in a virtual machine, so the worker memory is set to 256m; adjust the values and paths to match your own environment.
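With spark-env.sh in place, a standalone master and worker can be started with the scripts under sbin/ (7077 is the default master port, 192.168.18.130 is the SPARK_MASTER_IP set above, and the master's web UI defaults to port 8080):
cd /opt/spark-2.0.2-bin-hadoop2.7/
./sbin/start-master.sh
./sbin/start-slave.sh spark://192.168.18.130:7077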
For Python, copy the pyspark package from Spark's python directory into your interpreter's site-packages:
cp -r /Path_spark/python/pyspark /your_python_Lib_path/site-packages
Note that pyspark also depends on the bundled py4j module (shipped under /Path_spark/python/lib/), which must be importable as well.
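A minimal smoke test from the shell, assuming SPARK_HOME is exported in the current shell and py4j is importable:
export SPARK_HOME=/opt/spark-2.0.2-bin-hadoop2.7
python -c "from pyspark import SparkContext; sc = SparkContext('local', 'test'); print(sc.parallelize(range(100)).sum()); sc.stop()"
# should print 4950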
Note:
Environment variables set in ~/.bashrc take effect only when you run Spark programs from a terminal shell, because .bashrc is only read for interactive shells.
To make them visible everywhere for the current user (including IDEs such as PyCharm), set them in /etc/environment, which is a plain KEY=value file (no export keyword), for example:
SPARK_HOME=/opt/spark
PATH="$SPARK_HOME/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games"

Reposted from strayly.iteye.com/blog/2339717