Flume and Kafka Integration in Practice

1. Edit the Flume configuration file exec-memory-kafka.conf to use the Kafka sink

# Name the components on this agent
# a1 is the agent name; r1, k1, and c1 name its source, sink, and channel.
# (Comments must sit on their own lines: a trailing "#" in a Java
# properties file becomes part of the value and breaks the config.)
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Describe/configure the source (see the Flume docs for other source types;
# an agent may have multiple sources)
a1.sources.r1.type = exec
a1.sources.r1.command = tail -F /home/hadoop/data/data.log
a1.sources.r1.shell = /bin/sh -c

# Describe the sink
a1.sinks.k1.type = org.apache.flume.sink.kafka.KafkaSink
a1.sinks.k1.brokerList = 10.96.183.35:9092
a1.sinks.k1.topic = hello_topic
a1.sinks.k1.batchSize = 5
a1.sinks.k1.requiredAcks = 1

# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Bind the source and sink to the channel
a1.sources.r1.channels = c1

a1.sinks.k1.channel = c1
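The memory channel above buffers up to 1000 events in RAM and hands them to the sink in transactions of at most 100. As a rough illustration of what capacity and transactionCapacity mean (a toy model in plain Python, not Flume's actual implementation):

```python
from collections import deque

class MemoryChannel:
    """Toy model of Flume's memory channel: a bounded in-memory queue.
    capacity bounds the queue; transaction_capacity bounds one take batch."""

    def __init__(self, capacity=1000, transaction_capacity=100):
        self.capacity = capacity
        self.transaction_capacity = transaction_capacity
        self.queue = deque()

    def put(self, event):
        # A put fails when the channel is full (backpressure on the source).
        if len(self.queue) >= self.capacity:
            raise RuntimeError("channel is full")
        self.queue.append(event)

    def take_batch(self):
        # The sink drains at most transaction_capacity events per transaction.
        batch = []
        while self.queue and len(batch) < self.transaction_capacity:
            batch.append(self.queue.popleft())
        return batch

channel = MemoryChannel(capacity=1000, transaction_capacity=100)
for i in range(250):
    channel.put(f"event {i}")
print(len(channel.take_batch()))  # 100 -- one transaction's worth
print(len(channel.take_batch()))  # 100
print(len(channel.take_batch()))  # 50  -- the remainder
```

If the source produces faster than the sink drains and the queue reaches capacity, puts fail and events can be lost on agent crash, which is the usual trade-off of the memory channel versus the file channel.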

2. Start Flume

flume-ng agent --name a1 --conf $FLUME_HOME/conf \
--conf-file $FLUME_HOME/conf/exec-memory-kafka.conf \
-Dflume.root.logger=INFO,console

3. Start a Kafka console consumer

kafka-console-consumer.sh --zookeeper localhost:2181 --topic hello_topic

4. Test the data flow

Append new lines to data.log (e.g. echo "hello kafka" >> /home/hadoop/data/data.log) and verify that the console consumer prints them.

Note: the a1.sinks.k1.batchSize parameter controls how many events the sink accumulates before sending them to Kafka as one batch.
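Because batchSize is 5 here, lines appended one at a time may not show up at the consumer until five events have accumulated. A minimal sketch of that buffering behavior (plain Python, not the actual KafkaSink code):

```python
class BatchingSink:
    """Toy model of a sink with batchSize semantics: buffer events and
    only send them downstream once batch_size have accumulated."""

    def __init__(self, batch_size=5):
        self.batch_size = batch_size
        self.buffer = []
        self.sent_batches = []  # stands in for messages delivered to Kafka

    def process(self, event):
        self.buffer.append(event)
        if len(self.buffer) >= self.batch_size:
            self.flush()

    def flush(self):
        # Ship whatever is buffered as one batch.
        if self.buffer:
            self.sent_batches.append(list(self.buffer))
            self.buffer.clear()

sink = BatchingSink(batch_size=5)
for i in range(7):
    sink.process(f"line {i}")

print(len(sink.sent_batches))  # 1 -- only the first 5 events were sent
print(len(sink.buffer))        # 2 -- the rest wait for the batch to fill
```

So if the consumer seems to lag during testing, append at least batchSize lines before concluding that something is broken.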


Reposted from blog.csdn.net/fengfengchen95/article/details/80398417