A first Spark program

1. Installation

This guide assumes a Linux environment, using PySpark and Jupyter Notebook as the interactive tools.
See the Spark Getting Started guide for installation details.
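
A minimal sketch of wiring PySpark into a Jupyter session, assuming pyspark, findspark, and jupyter were installed with pip (the use of findspark is an assumption of this sketch, not something prescribed by the original post):

# Assumes: pip install pyspark findspark jupyter
import findspark
findspark.init()                  # locate the Spark installation and put it on sys.path

from pyspark import SparkContext
sc = SparkContext.getOrCreate()   # `sc` can then be used exactly as in the program below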

2. The first program

Estimate pi with a Monte Carlo simulation: sample random points in the unit square and count the fraction that falls inside the quarter circle of radius 1; that fraction approximates pi/4.

import random
from pyspark import SparkContext

# In a notebook started through the pyspark shell, `sc` already exists;
# getOrCreate() reuses it, or builds a local context otherwise.
sc = SparkContext.getOrCreate()

num_samples = 100000000

def inside(p):
    # Ignore the element p; draw a random point in the unit square and
    # test whether it falls inside the quarter circle of radius 1.
    x, y = random.random(), random.random()
    return x * x + y * y < 1

# Count how many of the num_samples points land inside the quarter circle.
count = sc.parallelize(range(0, num_samples)).filter(inside).count()

# The inside/total ratio approximates pi/4.
pi = 4 * count / num_samples
print(pi)

sc.stop()

Output:

3.1417056
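
The same estimate can also be packaged as a standalone script and launched with spark-submit instead of a notebook. A minimal sketch using the SparkSession API; the file name, app name, and local[*] master setting are assumptions, not details from the original post:

# pi_estimate.py  --  run with: spark-submit pi_estimate.py
import random
from pyspark.sql import SparkSession

spark = SparkSession.builder.master("local[*]").appName("PiEstimate").getOrCreate()
sc = spark.sparkContext

num_samples = 1000000  # smaller sample size for a quick local run

def inside(p):
    x, y = random.random(), random.random()
    return x * x + y * y < 1

count = sc.parallelize(range(num_samples)).filter(inside).count()
print(4 * count / num_samples)

spark.stop()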

Reference:

  1. How to install PySpark and Jupyter Notebook in 3 Minutes

Origin: blog.csdn.net/rosefun96/article/details/105490482