Pitfalls of Spark Streaming with Flume polling (pull-based approach)

1. Flume configuration

# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# source
a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 9999

# Describe the sink
a1.sinks.k1.type = org.apache.spark.streaming.flume.sink.SparkSink
# IP of the machine running the Flume agent
a1.sinks.k1.hostname = 192.168.25.145
a1.sinks.k1.port = 8888

# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100
# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1

2. Preparing the JARs

See the official documentation: http://spark.apache.org/docs/latest/streaming-flume-integration.html

JAR versions used on the Flume side in this test:

spark-streaming-flume-sink_2.11-2.2.0.jar
scala-library-2.11.8.jar
commons-lang3-3.5.jar

Download these JARs and put them in the lib directory of the Flume installation, ./flume/lib/.

JAR used on the Spark Streaming side:

spark-streaming-flume-assembly_2.11-2.2.0.jar


Start the agent to test:

hadoop@1:/usr/local/flume$ bin/flume-ng agent -c conf -f conf/flume-spark.conf -n a1 -Dflume.root.logger=DEBUG,console
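
Once the agent is up, the netcat source configured above (localhost:9999) accepts newline-terminated lines over plain TCP. Below is a minimal sketch of a sender for feeding it test data. The object name `NetcatSender` and the sample lines are made up for illustration; any tool that writes lines to the port would do the same job.

```scala
import java.io.{OutputStreamWriter, PrintWriter}
import java.net.Socket
import java.nio.charset.StandardCharsets

// Hypothetical helper, not part of Flume or Spark.
object NetcatSender {
  /** Open a TCP connection and write each line followed by '\n'. */
  def send(host: String, port: Int, lines: Seq[String]): Unit = {
    val socket = new Socket(host, port)
    try {
      val out = new PrintWriter(
        new OutputStreamWriter(socket.getOutputStream, StandardCharsets.UTF_8))
      lines.foreach(line => out.print(line + "\n"))
      out.flush()
    } finally {
      socket.close()
    }
  }

  def main(args: Array[String]): Unit =
    // Host and port taken from the a1.sources.r1 settings above.
    send("localhost", 9999, Seq("hello world", "hello flume"))
}
```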

Error encountered:

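Unsupported major.minor version 52.0
This is a JDK version mismatch: class files with major version 52 need Java 8, but Flume was started on an older JVM. Point JAVA_HOME in flume-env.sh at the matching JDK.
Class file major versions:
JDK 8 = 52
JDK 7 = 51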
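
To see which Java release a given class file was compiled for, the major version can be read straight from the file header, which is laid out as a 4-byte magic number, a 2-byte minor version and a 2-byte major version. This is a small sketch; the object name `ClassFileVersion` is made up for illustration, and the `major - 44` mapping is assumed to be applied only to Java 5 and later.

```scala
import java.io.{DataInputStream, FileInputStream, InputStream}

// Hypothetical helper for inspecting a .class file header.
object ClassFileVersion {
  /** Read the major version from the start of a class file stream. */
  def majorVersion(in: InputStream): Int = {
    val data = new DataInputStream(in)
    require(data.readInt() == 0xCAFEBABE, "not a class file")
    data.readUnsignedShort() // minor version, not needed here
    data.readUnsignedShort() // major version
  }

  /** Map a major version to its Java release, e.g. 51 to 7 and 52 to 8. */
  def javaRelease(major: Int): Int = major - 44

  def main(args: Array[String]): Unit = {
    // args(0): path to a .class file extracted from the JAR in question
    val in = new FileInputStream(args(0))
    try {
      val major = majorVersion(in)
      println(s"major version $major, needs Java ${javaRelease(major)} or newer")
    } finally {
      in.close()
    }
  }
}
```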

package com.imooc.spark

import org.apache.spark.SparkConf
import org.apache.spark.streaming.flume.FlumeUtils
import org.apache.spark.streaming.{Seconds, StreamingContext}

/**
  * Second way to integrate Spark Streaming with Flume: the pull-based (polling) approach
  */
object FlumePullWordCount {

  def main(args: Array[String]): Unit = {

    if(args.length != 2) {
      System.err.println("Usage: FlumePullWordCount <hostname> <port>")
      System.exit(1)
    }

    val Array(hostname, port) = args

    // Master and app name are expected from spark-submit; uncomment the setters to run locally.
    val sparkConf = new SparkConf() //.setMaster("local[2]").setAppName("FlumePullWordCount")
    val ssc = new StreamingContext(sparkConf, Seconds(5))

    // Pull events from the Flume SparkSink listening on hostname:port
    val flumeStream = FlumeUtils.createPollingStream(ssc, hostname, port.toInt)

    flumeStream.map(x=> new String(x.event.getBody.array()).trim)
      .flatMap(_.split(" ")).map((_,1)).reduceByKey(_+_).print()

    ssc.start()
    ssc.awaitTermination()
  }
}
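
A note on decoding the event body: `ByteBuffer.array()` returns the buffer's whole backing array regardless of position and limit, and `new String(bytes)` uses the platform default charset. The listing above works for this simple test, but a slightly more defensive variant could look like the sketch below. `BodyDecoding.decodeBody` is a hypothetical helper, and UTF-8 is an assumption about what the sender writes.

```scala
import java.nio.ByteBuffer
import java.nio.charset.StandardCharsets

// Hypothetical helper, not part of the Spark or Flume API.
object BodyDecoding {
  /** Decode only the readable bytes of `body` as UTF-8 and trim whitespace. */
  def decodeBody(body: ByteBuffer): String = {
    val view = body.duplicate()            // own position/limit, shared content
    val bytes = new Array[Byte](view.remaining())
    view.get(bytes)                        // advances the duplicate, not `body`
    new String(bytes, StandardCharsets.UTF_8).trim
  }
}
```

In the job above this would replace the first `map` step, for example `flumeStream.map(x => BodyDecoding.decodeBody(x.event.getBody))`.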

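Errors
> 1. java.lang.IllegalStateException: begin() called when transaction is OPEN!
Fix:
    Flume's lib directory contains an extra scala-library version; delete every copy except the one in use (scala-library-2.11.8.jar here).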
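
A quick way to spot the duplicates is to list every scala-library JAR in Flume's lib directory. This is a minimal sketch; the object name `ScalaLibraryCheck` is made up for illustration, and the default path assumes the /usr/local/flume installation seen in the shell prompt earlier.

```scala
import java.io.File

// Hypothetical helper for checking Flume's lib directory.
object ScalaLibraryCheck {
  /** Sorted names of all scala-library JARs directly inside `libDir`. */
  def scalaLibraryJars(libDir: File): List[String] = {
    // File.list() returns null when the path is not a readable directory.
    val names = Option(libDir.list()).getOrElse(Array.empty[String])
    names.filter(n => n.startsWith("scala-library") && n.endsWith(".jar")).sorted.toList
  }

  def main(args: Array[String]): Unit = {
    val dir = new File(args.headOption.getOrElse("/usr/local/flume/lib"))
    val jars = scalaLibraryJars(dir)
    jars.foreach(println)
    if (jars.size > 1)
      println(s"Found ${jars.size} scala-library JARs; keep only the version in use.")
  }
}
```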

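> 2. no further information (Flume streaming)
Fix:
    This is a connection problem on the Flume side: the Flume agent hit an error and is not running normally, so Spark cannot connect to it.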
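
Before digging into the Spark side, it can help to check whether anything is listening on the sink's address at all. Below is a minimal sketch of a TCP probe. The object name `SinkProbe` and the 2-second timeout are assumptions for illustration, and a successful connect only shows that the port is open, not that the SparkSink is healthy.

```scala
import java.io.IOException
import java.net.{InetSocketAddress, Socket}

// Hypothetical helper for a plain TCP reachability check.
object SinkProbe {
  /** True if a TCP connection to host:port succeeds within timeoutMs. */
  def canConnect(host: String, port: Int, timeoutMs: Int = 2000): Boolean = {
    val socket = new Socket()
    try {
      socket.connect(new InetSocketAddress(host, port), timeoutMs)
      true
    } catch {
      case _: IOException => false // refused, timed out, or unknown host
    } finally {
      socket.close()
    }
  }

  def main(args: Array[String]): Unit =
    // Address taken from the a1.sinks.k1 settings in the Flume configuration.
    println(canConnect("192.168.25.145", 8888))
}
```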



Reposted from blog.csdn.net/Emperor_CJ/article/details/80263670