Kafka notes

1. Check version compatibility between Kafka and CDH
  https://www.cloudera.com/documentation/enterprise/release-notes/topics/rn_consolidated_pcm.html#pcm_kafka
  
2. Download the CSD package KAFKA-1.2.0.jar
   http://archive.cloudera.com/csds/kafka/
   and upload it to the /opt/cloudera/csd directory
   
3. Download KAFKA-3.0.0-1.3.0.0.p0.40-el6.parcel, KAFKA-3.0.0-1.3.0.0.p0.40-el6.parcel.sha1 and manifest.json
  http://archive.cloudera.com/kafka/parcels/latest/
  and upload them to the /opt/cloudera/parcel-repo directory
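
Cloudera Manager validates the parcel against the checksum file; you can pre-check it by hand before distributing (paths as above; note that some Cloudera Manager versions expect the checksum file to be renamed from .sha1 to .sha):
cd /opt/cloudera/parcel-repo
sha1sum KAFKA-3.0.0-1.3.0.0.p0.40-el6.parcel
# compare the printed hash with the contents of the .sha1 file
cat KAFKA-3.0.0-1.3.0.0.p0.40-el6.parcel.sha1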

## Create a topic (--replication-factor sets the replica count)
kafka-topics --create --topic kafka-test --zookeeper host0:2181,host2:2181,host7:2181 --partitions 1 --replication-factor 1
# List topics
kafka-topics --list --zookeeper host0:2181,host2:2181,host7:2181
# Describe a topic
kafka-topics --zookeeper host0:2181,host2:2181,host7:2181 --topic "direct_test" --describe
# Start a Kafka console producer
kafka-console-producer --broker-list host0:9092,host2:9092,host7:9092 --topic direct_test
## Start a console consumer
kafka-console-consumer --zookeeper host0:2181,host2:2181,host7:2181 --topic direct_test
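# To replay the topic from the earliest offset instead of reading only new messages, add --from-beginning:
kafka-console-consumer --zookeeper host0:2181,host2:2181,host7:2181 --topic direct_test --from-beginning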
## Delete a topic
kafka-topics --delete --zookeeper host0:2181,host2:2181,host7:2181 --topic direct_test
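# Deletion only completes if the brokers run with delete.topic.enable=true; otherwise the topic is merely marked for deletion. Topics pending deletion show up under the /admin/delete_topics znode (ZooKeeper CLI):
ls /admin/delete_topics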
## Change the number of partitions of a topic (partitions can only be increased)
kafka-topics --alter --zookeeper host0:2181,host2:2181,host7:2181  --partitions 5 --topic flink_hbase
# Note: kafka-topics --alter cannot change the replication factor; use kafka-reassign-partitions instead, as sketched below.
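A minimal sketch of raising the replication factor of flink_hbase to 3 with kafka-reassign-partitions; the single partition and the broker ids 0, 2 and 7 are assumptions, substitute the cluster's real ids:
cat > raise-rf.json <<'EOF'
{"version":1,"partitions":[{"topic":"flink_hbase","partition":0,"replicas":[0,2,7]}]}
EOF
kafka-reassign-partitions --zookeeper host0:2181,host2:2181,host7:2181 --reassignment-json-file raise-rf.json --execute
# re-run with --verify to check completion
kafka-reassign-partitions --zookeeper host0:2181,host2:2181,host7:2181 --reassignment-json-file raise-rf.json --verify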
## Query the earliest offset of each partition (--time -2):
kafka-run-class kafka.tools.GetOffsetShell --broker-list host0:9092,host2:9092,host7:9092 --topic flink_hbase --time -2
## Query the latest offset of each partition (--time -1):
kafka-run-class kafka.tools.GetOffsetShell --broker-list host0:9092,host2:9092,host7:9092 --topic flink_hbase --time -1
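# GetOffsetShell prints one topic:partition:offset line per partition, e.g. flink_hbase:0:42 (illustrative values).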
## Show the consumed offset and lag of each partition for a consumer group
kafka-run-class kafka.tools.ConsumerOffsetChecker --group client_kafka-test_0 --topic kafka-test --zookeeper host0:2181,host2:2181,host7:2181
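# ConsumerOffsetChecker was deprecated and later removed; on newer Kafka releases the equivalent check is kafka-consumer-groups (whether the --zookeeper option is available depends on the installed version):
kafka-consumer-groups --zookeeper host0:2181,host2:2181,host7:2181 --describe --group client_kafka-test_0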
## Delete a consumer group (ZooKeeper CLI commands)
ls  /consumers
rmr  /consumers/group1
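The two commands above and the set/get commands below run inside the ZooKeeper CLI, which on a CDH host can be opened with:
zookeeper-client -server host0:2181,host2:2181,host7:2181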

Set the offset of consumer group DynamicRangeGroup, topic DynamicRange, partition 0 to 1288 with the following command:
set /consumers/DynamicRangeGroup/offsets/DynamicRange/0 1288
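Stop the group's consumers before changing the offset, otherwise they will overwrite it; verify the new value with get:
get /consumers/DynamicRangeGroup/offsets/DynamicRange/0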

spark-shell --master local[2] --jars /root/tanj/kafka_2.11-0.8.2.1.jar,/root/tanj/spark-streaming-kafka_2.11-1.6.0.jar

import kafka.serializer.StringDecoder
import org.apache.spark.streaming.kafka.KafkaUtils
import org.apache.spark.streaming.{Durations, StreamingContext}

// 5-second micro-batches on top of the shell's SparkContext (sc)
val ssc = new StreamingContext(sc, Durations.seconds(5))

// The 0.8 direct stream reads from the brokers directly, so it needs
// metadata.broker.list (broker addresses), not a ZooKeeper quorum.
val kafkaParams = Map(
  "metadata.broker.list" -> "host0:9092,host2:9092,host7:9092",
  "group.id" -> "StreamingWordCountSelfKafkaDirectStreamScala")

KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](
    ssc, kafkaParams, Set("direct_test"))
  .map(_._2)               // keep only the message value
  .flatMap(_.split(" "))   // word count over the stream
  .map((_, 1))
  .reduceByKey(_ + _)
  .print()

ssc.start()
ssc.awaitTermination()
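To test the job, type a few space-separated words into the console producer started earlier; the word counts print every 5 seconds.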

References:
http://blog.csdn.net/sun_qiangwei/article/details/52080147#t2
http://blog.csdn.net/qq_21234493/article/details/51340138

Reposted from my.oschina.net/u/2510243/blog/1796730