Spark - Random Quick Notes

1. Reading data from an HDFS file

    val hdfs = org.apache.hadoop.fs.FileSystem.get(new java.net.URI("hdfs://hacluster"),
        new org.apache.hadoop.conf.Configuration())
    // The path must be a quoted string, wrapped in a Hadoop Path
    val fSDataInputStream1 = hdfs.open(new org.apache.hadoop.fs.Path("hdfs://hacluster/A/B/test.txt"))
    // InputStreamReader (not the abstract InputStream) bridges the byte stream to a Reader
    val bufferedReader1 = new java.io.BufferedReader(new java.io.InputStreamReader(fSDataInputStream1))
    val line = bufferedReader1.readLine()
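The snippet above reads only one line. A fuller sketch of the same idea, reading every line until end of stream and closing resources (the cluster URI and file path are the ones from the snippet; this assumes the Hadoop client jars are on the classpath):

```scala
import java.io.{BufferedReader, InputStreamReader}
import java.net.URI
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

object HdfsReadExample {
  def main(args: Array[String]): Unit = {
    val hdfs = FileSystem.get(new URI("hdfs://hacluster"), new Configuration())
    val in = hdfs.open(new Path("hdfs://hacluster/A/B/test.txt"))
    val reader = new BufferedReader(new InputStreamReader(in))
    try {
      // readLine() returns null at end of stream, so takeWhile stops there
      Iterator.continually(reader.readLine()).takeWhile(_ != null).foreach(println)
    } finally {
      reader.close() // also closes the underlying FSDataInputStream
    }
  }
}
```

The `Iterator.continually(...).takeWhile(_ != null)` idiom is a common Scala replacement for a Java-style `while ((line = reader.readLine()) != null)` loop.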

2. Defining a function that creates the StreamingContext (ssc)

     val sc = SparkContext.getOrCreate()
     def funCreateStreamingContext(): StreamingContext = {
          val newSsc = new StreamingContext(sc, Seconds(60))
          println("Creating new StreamingContext")
          // vCheckPoint should hold the same checkpoint directory
          // that is later passed to getActiveOrCreate
          newSsc.checkpoint(vCheckPoint)
          newSsc
     }

3. Creating the ssc

    val checkPointPath = "hdfs://hacluster/A/B/checkPointPath"
    val ssc = StreamingContext.getActiveOrCreate(checkPointPath, funCreateStreamingContext)
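Putting sections 2 and 3 together, a minimal runnable sketch might look like the following. Note that `getActiveOrCreate` returns the currently active context if one exists, otherwise recovers from the checkpoint directory, and only calls the factory function if neither applies; the `ssc.start()` / `awaitTermination()` calls and the comment about where DStreams belong are my additions, not from the original note:

```scala
import org.apache.spark.SparkContext
import org.apache.spark.streaming.{Seconds, StreamingContext}

object StreamingApp {
  val checkPointPath = "hdfs://hacluster/A/B/checkPointPath"

  def funCreateStreamingContext(): StreamingContext = {
    val sc = SparkContext.getOrCreate()
    val newSsc = new StreamingContext(sc, Seconds(60)) // 60-second batch interval
    println("Creating new StreamingContext")
    newSsc.checkpoint(checkPointPath)
    // DStream definitions should go here, inside the factory,
    // so they are rebuilt when the context is recovered from a checkpoint
    newSsc
  }

  def main(args: Array[String]): Unit = {
    // Active context > checkpoint recovery > factory function, in that order
    val ssc = StreamingContext.getActiveOrCreate(checkPointPath, funCreateStreamingContext)
    ssc.start()
    ssc.awaitTermination()
  }
}
```

Defining the DStream logic inside the factory matters: on restart after a failure, Spark rebuilds the context from checkpoint data, and code placed outside the factory would not be re-applied.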


Reposted from blog.csdn.net/u012761191/article/details/81222247