Ubuntu 16.04 + PyCharm + Spark: Runtime Environment Setup

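0. Install PyCharm and Spark

      Download PyCharm: http://www.jetbrains.com/pycharm/

      Download Spark: http://spark.apache.org/

       PS: the system needs a Java environment before PyCharm is installed. Spark also requires Java, since it runs on the JVM.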
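
One way to confirm the Java prerequisite before going further is to look for the java executable on PATH. The sketch below assumes Python 3; the helper names find_java and java_version_text are made up for this illustration and are not part of PyCharm or Spark.

```python
import shutil
import subprocess


def find_java():
    """Return the full path of the `java` executable on PATH, or None."""
    return shutil.which("java")


def java_version_text(java_path):
    """Run `java -version` and return what it prints (normally on stderr)."""
    result = subprocess.run(
        [java_path, "-version"],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    return (result.stderr or result.stdout).strip()


if __name__ == "__main__":
    java = find_java()
    if java is None:
        print("java not found on PATH; install a JDK or JRE first")
    else:
        print(java_version_text(java))
```

If the script reports that java is missing, install a JDK or JRE through the system package manager before continuing.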

1. Install py4j

       $ sudo pip install py4j
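
To check that the install reached the interpreter PyCharm will use, a short import probe can help. This is a minimal sketch assuming Python 3.4 or later; missing_modules is a hypothetical helper name, and it only handles top-level module names.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(["py4j"])
    if missing:
        print("Not importable: %s" % ", ".join(missing))
    else:
        print("py4j is importable")
```

Run it with the same interpreter that is selected as the project interpreter in PyCharm, since pip may have installed py4j for a different Python.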

2. Configure PyCharm

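       In Run/Debug Configurations, configure the run settings as shown in the screenshot below.

       [Screenshot: PyCharm Run/Debug Configurations dialog]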
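
The exact values entered in Run/Debug Configurations are not spelled out in the text. A common approach, assumed here rather than taken from the original, is to set SPARK_HOME to the unpacked Spark directory and to put Spark's python directory and its bundled py4j zip on the Python path. The sketch below does the equivalent from inside a script; configure_spark_paths is a hypothetical helper, and any path passed to it is a placeholder for your own installation.

```python
import glob
import os
import sys


def configure_spark_paths(spark_home):
    """Expose a Spark installation to the current Python process.

    spark_home is the path to an unpacked Spark distribution (an assumption:
    adjust it to wherever Spark was extracted).
    Returns the entries that were newly added to sys.path.
    """
    os.environ["SPARK_HOME"] = spark_home
    python_dir = os.path.join(spark_home, "python")
    # Spark ships py4j as a zip under python/lib; the version in the name varies.
    py4j_zips = sorted(glob.glob(os.path.join(python_dir, "lib", "py4j-*-src.zip")))
    added = []
    for entry in [python_dir] + py4j_zips:
        if entry not in sys.path:
            sys.path.insert(0, entry)
            added.append(entry)
    return added
```

Calling configure_spark_paths with your Spark directory before the line `from pyspark import SparkContext` should have a similar effect to defining SPARK_HOME and PYTHONPATH as environment variables in the run configuration.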

 



You can now run PySpark programs from within PyCharm.

A quick test:


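from pyspark import SparkContext

# Entry point to Spark for this script
sc = SparkContext()

# The path is relative to the working directory; cache() keeps the RDD around because it is scanned twice
logData = sc.textFile("README.md").cache()

# Count the lines that contain the letter 'a' and the letter 'b'
numAs = logData.filter(lambda s: 'a' in s).count()
numBs = logData.filter(lambda s: 'b' in s).count()

print("Lines with a: %i, lines with b: %i" % (numAs, numBs))

# Release the resources held by the context
sc.stop()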
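
To see what the two counts mean without starting Spark, the same filter-and-count logic can be applied to an in-memory list. This is only an illustration in plain Python: the sample lines are made up, and count_lines_containing is a hypothetical helper, not a PySpark function.

```python
def count_lines_containing(lines, needle):
    """Count the lines that contain the given substring."""
    return sum(1 for line in lines if needle in line)


# Made-up sample lines standing in for the contents of README.md.
sample_lines = ["Apache Spark", "big data", "fast"]

num_as = count_lines_containing(sample_lines, "a")
num_bs = count_lines_containing(sample_lines, "b")

print("Lines with a: %i, lines with b: %i" % (num_as, num_bs))
# prints "Lines with a: 3, lines with b: 1"
```

The PySpark version does the same matching per line, but distributes the work over the partitions of the RDD.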

Result:

[Screenshot: output of the run]




Reposted from blog.csdn.net/u012269327/article/details/72982598