Configuring pyspark in Jupyter Notebook

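We usually deploy the Spark environment on a remote server and install Python and Jupyter Notebook there. We then connect to the remote server's address from a local browser and develop from the local machine.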

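After installing the tools above, run the command jupyter notebook --allow-root on the server to start the notebook service. Then open a browser locally and enter the server address, for example http://127.0.0.1:8888, where 8888 is the port chosen when the notebook server was configured. If the server is remote, substitute its actual IP address or hostname for 127.0.0.1.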
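The port and listening address mentioned above are typically set in the notebook configuration file. The fragment below is a minimal sketch, assuming the classic Notebook server and a config file at ~/.jupyter/jupyter_notebook_config.py (which jupyter notebook --generate-config creates); the specific values are illustrative and not taken from the original setup. Newer Jupyter Server based releases read the same settings under c.ServerApp instead of c.NotebookApp.

```python
# ~/.jupyter/jupyter_notebook_config.py (assumed location)
# `get_config` is injected by Jupyter when it loads this file;
# the fragment is not meant to be run as a standalone script.
c = get_config()  # noqa: F821

c.NotebookApp.ip = '0.0.0.0'        # listen on all interfaces so a remote browser can connect
c.NotebookApp.port = 8888           # the port that appears in the browser URL
c.NotebookApp.open_browser = False  # a headless server has no browser to open
```

The same settings can also be passed on the command line, for example jupyter notebook --ip=0.0.0.0 --port=8888 --no-browser --allow-root.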

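Start programming locally
Remember to install the findspark package (for example with pip install findspark), which makes it easy to configure pyspark:
import findspark
# Replace the placeholder path with your Spark installation directory.
# edit_rc=True also appends the settings to your shell startup file so they persist.
findspark.init('hdp/2.5.0.0/spark installation directory', edit_rc=True)

import pyspark

# Create the SparkContext used by the rest of the notebook.
sc = pyspark.SparkContext(appName='wordsCount')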
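
Once sc exists, a word count (as the appName 'wordsCount' suggests) might look like the sketch below. The helper names tokenize and count_words and the sample sentence are assumptions made for illustration; they do not come from the original post. The function takes the SparkContext as an argument, so it relies on the sc created in the notebook.

```python
def tokenize(line):
    """Lower-case a line and split it on whitespace."""
    return line.lower().split()


def count_words(sc, lines):
    """Count word occurrences in `lines` using an existing SparkContext `sc`."""
    pairs = (
        sc.parallelize(lines)              # distribute the input lines as an RDD
        .flatMap(tokenize)                 # one record per word
        .map(lambda word: (word, 1))       # pair each word with a count of 1
        .reduceByKey(lambda a, b: a + b)   # sum the counts per word
    )
    return dict(pairs.collect())


# In a notebook cell, with the `sc` created above (sample input is hypothetical):
# count_words(sc, ["to be or not to be"])
# expected counts: to=2, be=2, or=1, not=1
```

Collecting into a dict is reasonable only for small results; for a large corpus the RDD would normally be saved or sampled rather than collected to the driver.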

To draw plots and display them inline in the notebook,
run the command: %matplotlib inline
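
As a rough illustration of inline plotting, the sketch below draws a bar chart of word counts. The names top_n and plot_top_words and the sample counts are hypothetical; matplotlib is assumed to be installed on the server, and %matplotlib inline is assumed to have been run in an earlier cell.

```python
def top_n(counts, n=10):
    """Return the n most frequent (word, count) pairs, ties broken alphabetically."""
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:n]


def plot_top_words(counts, n=10):
    """Draw a bar chart of the n most frequent words and return the figure."""
    # Imported here so that top_n stays usable without matplotlib installed.
    import matplotlib.pyplot as plt

    pairs = top_n(counts, n)
    words = [word for word, _ in pairs]
    freqs = [freq for _, freq in pairs]

    fig, ax = plt.subplots()
    ax.bar(words, freqs)
    ax.set_xlabel("word")
    ax.set_ylabel("count")
    ax.set_title("Most frequent words")
    return fig


# In a notebook cell (made-up sample data):
# plot_top_words({"to": 2, "be": 2, "or": 1, "not": 1}, n=3)
```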
