Machine Learning Section: WordCount in Python [Python Code]

Source data (saved as the file ./wc):

hello spark
hello python
hello java
hello word
hello spark
hello python
hello java
hello word
hello spark
hello python
hello java
hello word
hello spark
hello python
hello java
hello word
hello spark
hello python
hello java
hello word

Python code:

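# coding: utf-8
from pyspark.conf import SparkConf
from pyspark.context import SparkContext


def show(one):
    # Print a single (word, count) pair
    print(one)


if __name__ == '__main__':
    # Run locally under the application name "test"
    conf = SparkConf()
    conf.setAppName("test")
    conf.setMaster("local")
    sc = SparkContext(conf=conf)
    # Read the input file; each element of the RDD is one line of text
    lines = sc.textFile("./wc")
    # Split every line on spaces and flatten the result into one RDD of words
    words = lines.flatMap(lambda line: line.split(" "))
    # Pair each word with an initial count of 1
    pairwords = words.map(lambda word: (word, 1))
    # Sum the counts for each distinct word
    result = pairwords.reduceByKey(lambda v1, v2: v1 + v2)
    # Print every (word, count) pair
    result.foreach(show)
    # Release the Spark resources once the job is done
    sc.stop()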
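To check what the job should print without starting Spark, the same three steps (flatMap, map, reduceByKey) can be mimicked in plain Python. This is only a rough sketch: the function name word_count and the in-memory sample list are assumptions made for illustration, not part of the original post, and the code runs on a single machine rather than on an RDD.

```python
from collections import defaultdict
from itertools import chain


def word_count(lines):
    """Mimic flatMap -> map -> reduceByKey on a plain list of strings."""
    # flatMap: split every line on spaces and flatten into one stream of words
    words = chain.from_iterable(line.split(" ") for line in lines)
    # map: pair each word with the count 1
    pairs = ((word, 1) for word in words)
    # reduceByKey: add up the counts that share the same word
    totals = defaultdict(int)
    for word, count in pairs:
        totals[word] += count
    return dict(totals)


# The four sample lines from the source data, repeated five times
sample = ["hello spark", "hello python", "hello java", "hello word"] * 5
counts = word_count(sample)
for pair in counts.items():
    print(pair)
```

On the sample data, "hello" is counted 20 times and "spark", "python", "java" and "word" 5 times each. The Spark job should report the same totals, although the order in which foreach prints the pairs is not guaranteed.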

Reposted from blog.csdn.net/wyqwilliam/article/details/81636856