Python syntax - PySpark in practice (basics)

Demonstration of obtaining PySpark's execution environment entry object: SparkContext

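"""
Demonstrates obtaining PySpark's execution environment entry object: SparkContext,
and using the SparkContext object to get the current PySpark version.
"""

# Import the required classes
from pyspark import SparkConf, SparkContext

# Create a SparkConf object (the commented lines are the equivalent step-by-step form)
# conf = SparkConf()
# conf.setMaster("local[*]")
# conf.setAppName("test_name")
conf = SparkConf().setMaster("local[*]").setAppName("test_spark_app")

# Create a SparkContext object from the SparkConf object
sc = SparkContext(conf=conf)

# Print the PySpark version
print(sc.version)

# Stop the SparkContext object (ends the PySpark program)
sc.stop()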

Problem encountered on the first run: RuntimeError: Java gateway process exited before sending its port number, raised while creating the PySpark execution environment entry object
Reason: no Java JDK was installed
Solution: download the JDK from the official website, install and configure it, then restart PyCharm. After that, the program runs normally.
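
Before creating the SparkContext, it can help to check whether a Java runtime is visible to the Python process at all. The following is a minimal sketch using only the standard library. The helper name `check_java` is hypothetical, and it assumes that PySpark finds Java through the `JAVA_HOME` environment variable or a `java` executable on `PATH`. It does not verify that the installed Java version is one that your Spark release supports.

```python
import os
import shutil
import subprocess


def check_java():
    """Report whether a Java runtime is visible to this Python process.

    Hypothetical helper for troubleshooting; not part of PySpark.
    """
    java_path = shutil.which("java")         # None if `java` is not on PATH
    java_home = os.environ.get("JAVA_HOME")  # None if the variable is unset
    version = None
    if java_path is not None:
        try:
            # `java -version` usually writes its banner to stderr.
            result = subprocess.run(
                [java_path, "-version"],
                capture_output=True,
                text=True,
                timeout=30,
            )
            banner = (result.stderr or result.stdout).strip()
            version = banner.splitlines()[0] if banner else None
        except (OSError, subprocess.SubprocessError):
            version = None
    return {"java_path": java_path, "java_home": java_home, "version": version}


if __name__ == "__main__":
    info = check_java()
    if info["java_path"] is None and info["java_home"] is None:
        print("No Java found: install a JDK and set JAVA_HOME or PATH.")
    else:
        print(info)
```

If both `java_path` and `java_home` come back as `None`, the missing JDK described above is the likely cause. If you have just installed the JDK, restart the IDE so that it picks up the updated environment variables.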

Reference content:
RuntimeError: Java gateway process exited before sending its port number occurs when Python builds the PySpark execution environment entry
Dark Horse Programmer-Python Basics

Origin blog.csdn.net/qq_45833373/article/details/131254493