Installing and configuring PySpark on Windows

A reliable tutorial for installing PySpark on Windows: "Installing pyspark under Win10".
After installing and configuring everything per that tutorial, typing pyspark in a cmd window launches it. But typing pyspark in PyCharm's terminal fails:

c:\Windows\System32>pyspark
Java not found and JAVA_HOME environment variable is not set.
Install Java and set JAVA_HOME to point to the Java installation directory.

(to be resolved)
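The error means the shell PyCharm spawns cannot find Java. A minimal stdlib-only sketch (the lookup order mirrors what Spark's launch scripts do; the helper name is my own) to verify Java is reachable before starting PySpark:

```python
import os
import shutil


def java_available():
    """Return a path to the java executable, or None if not found.

    First honour JAVA_HOME (as the pyspark launch script does),
    then fall back to searching PATH.
    """
    java_home = os.environ.get("JAVA_HOME")
    if java_home:
        candidate = os.path.join(java_home, "bin", "java.exe")
        if os.path.isfile(candidate):
            return candidate
    # Fall back to whatever "java" resolves to on PATH.
    return shutil.which("java")


if java_available() is None:
    print("Java not found -- set JAVA_HOME before starting pyspark")
```

If this prints the warning inside PyCharm's terminal but not in a plain cmd window, the two shells have different environments, which is exactly the symptom above.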

Workaround: add the following at the top of the script:

import sys
import os
os.environ['SPARK_HOME'] = "E:/1/spark-2.3.0-bin-hadoop2.7/spark-2.3.0-bin-hadoop2.7"
os.environ['HADOOP_HOME'] = "E:/1/hadoop-2.7.1/hadoop-2.7.1"
sys.path.append("C:/Users/tsl/AppData/Local/Programs/Python/Python35/Lib")
sys.path.append("C:/Users/tsl/AppData/Local/Programs/Python/Python35")
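What those lines accomplish can be sketched as a small helper (stdlib only; this is essentially the mechanism the findspark library uses, and the function name is my own). Given a Spark distribution directory, it exposes the bundled pyspark sources and the py4j zip to the interpreter:

```python
import glob
import os
import sys


def add_spark_to_path(spark_home):
    """Point the interpreter at SPARK_HOME's bundled Python sources.

    In a standard Spark layout, the pyspark package lives under
    SPARK_HOME/python, and its py4j dependency ships as a zip under
    SPARK_HOME/python/lib.
    """
    os.environ["SPARK_HOME"] = spark_home
    python_dir = os.path.join(spark_home, "python")
    sys.path.insert(0, python_dir)
    # Spark 2.3.0 bundles e.g. py4j-0.10.6-src.zip; match any version.
    for zip_path in glob.glob(os.path.join(python_dir, "lib", "py4j-*-src.zip")):
        sys.path.insert(0, zip_path)
    return python_dir
```

Usage would be add_spark_to_path("E:/1/spark-2.3.0-bin-hadoop2.7/spark-2.3.0-bin-hadoop2.7") followed by import pyspark.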



Writing the environment variables like this also works:
import sys
import os
os.environ['SPARK_HOME'] = "C:/Users/tsl/AppData/Local/Programs/Python/Python35/Lib/site-packages/pyspark"
os.environ['HADOOP_HOME'] = "E:/1/hadoop-2.7.1/hadoop-2.7.1/bin/winutils/bin"
sys.path.append("C:/Users/tsl/AppData/Local/Programs/Python/Python35/Lib")
sys.path.append("C:/Users/tsl/AppData/Local/Programs/Python/Python35")
>>> import pyspark
>>> pyspark
<module 'pyspark' from 'C:\\Users\\tsl\\AppData\\Local\\Programs\\Python\\Python35\\lib\\site-packages\\pyspark\\__init__.py'>
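After the sys.path tweaks, you can confirm which pyspark the interpreter would pick up without actually importing it (stdlib only; the helper name is my own):

```python
import importlib.util


def locate_module(name):
    """Return the file a module would be loaded from, or None if the
    module is not importable on the current sys.path."""
    spec = importlib.util.find_spec(name)
    return spec.origin if spec else None


print(locate_module("pyspark"))
```

If this prints None, the environment-variable block above did not take effect; otherwise it should print the __init__.py path shown in the REPL output.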

This route (start python first, then bring up Spark from inside it) works, but it is not as convenient as launching pyspark directly! (to be resolved)
Note: os.environ['SPARK_HOME'] = "E:/1/spark-2.3.0-bin-hadoop2.7/spark-2.3.0-bin-hadoop2.7" is the path that was finally confirmed to work. Before it was set correctly, the error was:

    raise Exception("Java gateway process exited before sending the driver its port number")
Exception: Java gateway process exited before sending the driver its port number

If you instead use os.environ['SPARK_HOME'] = "C:/Users/tsl/AppData/Local/Programs/Python/Python35/Lib/site-packages/pyspark", the error becomes:
FileNotFoundError: [WinError 2] The system cannot find the file specified.
Also note, from the tutorial on configuring PyCharm under Windows: open C:\Windows\System32\cmd.exe with administrator rights, run E:\1\hadoop-2.7.1\hadoop-2.7.1\bin\winutils\bin\winutils.exe chmod 777 C:\tmp\hive, exit, then open cmd again as administrator and type pyspark. The difference between the two windows is whether they were opened as administrator.
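The winutils step can be wrapped in Python so it fails loudly when something is off (a sketch; the winutils path is the one from above, the function name is my own, and the command still needs an administrator shell to succeed):

```python
import os
import subprocess


def fix_hive_tmp_permissions(winutils_exe, hive_tmp=r"C:\tmp\hive"):
    r"""Run `winutils.exe chmod 777 C:\tmp\hive`, the step the tutorial
    performs from an administrator cmd window.

    Returns True on success, False if winutils.exe is missing or the
    command fails (e.g. the shell lacks administrator rights).
    """
    if not os.path.isfile(winutils_exe):
        return False
    result = subprocess.run([winutils_exe, "chmod", "777", hive_tmp])
    return result.returncode == 0
```

Usage: fix_hive_tmp_permissions(r"E:\1\hadoop-2.7.1\hadoop-2.7.1\bin\winutils\bin\winutils.exe").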

Reposted from blog.csdn.net/sinat_26566137/article/details/80205938