Copyright notice: https://blog.csdn.net/huoliangwu/article/details/84663719
1 ZooKeeper installation and configuration
1.1 ZooKeeper configuration (installation steps omitted)
1.1.1 After installing, change into the conf directory under the ZooKeeper install directory
[hadoop@mini01 ~]$ cd /home/install/zookeeper/conf
[hadoop@mini01 conf]$
1.1.2 Rename zoo_sample.cfg to zoo.cfg
[hadoop@mini01 conf]$ mv zoo_sample.cfg zoo.cfg
1.1.3 Edit the zoo.cfg configuration file
[hadoop@mini01 conf]$ vi zoo.cfg
# Set the ZooKeeper data directory
dataDir=/home/hadoop/install/zookeeper/data
..............
# ZooKeeper cluster members. On each line the first port (2888) is used by
# followers to connect to the leader; the second (3888) is used for leader election
server.1=192.168.13.128:2888:3888
server.2=192.168.13.129:2888:3888
server.3=192.168.13.131:2888:3888
1.1.4 Create the ZooKeeper data directory and write this server's id into it (the myid file must be created by hand)
[hadoop@mini01 conf]$ cd ../
[hadoop@mini01 zookeeper]$ mkdir data
[hadoop@mini01 zookeeper]$ echo 1 >> data/myid
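Each server needs a different id in its own myid file, matching the server.N lines in zoo.cfg. A dry-run sketch of that mapping (the hostnames and the remote write are assumptions; replace the echo with the commented ssh call to actually apply it):

```shell
# Dry run: print which myid each node should get (matches server.1..3 in
# zoo.cfg). To apply for real, assuming passwordless ssh, use instead:
#   ssh "$host" "echo $id > /home/hadoop/install/zookeeper/data/myid"
id=1
for host in mini01 mini02 mini03; do
  echo "$host -> myid=$id"
  id=$((id + 1))
done
```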
1.2 Sync the zookeeper directory to the whole cluster (all files are synced)
[hadoop@mini01 zookeeper]$ cd ../
[hadoop@mini01 install]$ xsync zookeeper/
xsync is a sync script; its contents are described at
https://blog.csdn.net/huoliangwu/article/details/84591893
1.3 Start ZooKeeper on each node of the cluster and check its status
[hadoop@mini01 zookeeper]$ ./bin/zkServer.sh start
[hadoop@mini01 zookeeper]$ ./bin/zkServer.sh status
Next time I will write a script that starts the whole ZooKeeper cluster.
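A helper for running zkServer.sh on every node at once could look like the sketch below. It assumes passwordless ssh and the same install path on all three hosts; the ssh call is left commented so the function only prints its plan:

```shell
# zk_cluster start|stop|status: run zkServer.sh on every node (sketch).
ZK_HOME=/home/install/zookeeper
ZK_HOSTS="mini01 mini02 mini03"

zk_cluster() {
  case "$1" in
    start|stop|status) ;;
    *) echo "usage: zk_cluster {start|stop|status}" >&2; return 1 ;;
  esac
  for host in $ZK_HOSTS; do
    echo "---- $host: zkServer.sh $1 ----"
    # Real invocation (assumes passwordless ssh):
    #   ssh "$host" "$ZK_HOME/bin/zkServer.sh $1"
  done
}

zk_cluster status
```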
2 Kafka installation and configuration
2.1 Kafka configuration (installation steps omitted)
2.1.1 Create a logs directory under the Kafka install directory
[hadoop@mini01 kafka]$ mkdir logs
2.1.2 Edit the configuration file
[hadoop@mini01 kafka]$ cd config/
[hadoop@mini01 config]$ vi server.properties
#Globally unique id of this broker; must not repeat across the cluster
broker.id=0
#Whether deleting topics is allowed
delete.topic.enable=true
#Number of threads handling network requests
num.network.threads=3
#Number of threads handling disk I/O
num.io.threads=8
#Send buffer size of the socket
socket.send.buffer.bytes=102400
#Receive buffer size of the socket
socket.receive.buffer.bytes=102400
#Maximum size of a socket request
socket.request.max.bytes=104857600
#Directory where Kafka stores its message data (log segments)
log.dirs=/home/hadoop/install/kafka/logs
#Default number of partitions per topic on this broker
num.partitions=1
#Number of threads used to recover and clean data under log.dirs
num.recovery.threads.per.data.dir=1
#Maximum time a segment file is kept before being deleted
log.retention.hours=168
#ZooKeeper cluster connection string
zookeeper.connect=mini01:2181,mini02:2181,mini03:2181
2.2 Distribute the Kafka install directory
[hadoop@mini01 config]$ cd ../../
[hadoop@mini01 install]$ xsync kafka/
2.3 On each of the other machines, edit server.properties and change broker.id; ids must not repeat
mini02 broker.id=1
mini03 broker.id=2
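Since xsync copies server.properties verbatim, the other nodes still carry broker.id=0 after the sync. One way to rewrite it is the sketch below; on the real cluster the function would be invoked through ssh on mini02 and mini03, and GNU sed's -i flag is assumed:

```shell
# Rewrite the broker.id line of a server.properties file in place.
set_broker_id() {
  # $1 = path to server.properties, $2 = new broker id (GNU sed -i assumed)
  sed -i "s/^broker\.id=.*/broker.id=$2/" "$1"
}

# e.g. on mini02: set_broker_id /home/hadoop/install/kafka/config/server.properties 1
# e.g. on mini03: set_broker_id /home/hadoop/install/kafka/config/server.properties 2
```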
2.4 Start the cluster
[hadoop@mini01 kafka]$ bin/kafka-server-start.sh config/server.properties &
[hadoop@mini02 kafka]$ bin/kafka-server-start.sh config/server.properties &
[hadoop@mini03 kafka]$ bin/kafka-server-start.sh config/server.properties &
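Starting each broker by hand with a trailing & ties it to the login shell. A one-shot cluster starter could instead use the -daemon flag of kafka-server-start.sh on every node; the sketch below assumes passwordless ssh and leaves the ssh call commented so the function only prints its plan:

```shell
# Sketch: start the Kafka broker on every node of the cluster.
KAFKA_HOME=/home/hadoop/install/kafka
KAFKA_HOSTS="mini01 mini02 mini03"

start_kafka_cluster() {
  for host in $KAFKA_HOSTS; do
    echo "starting broker on $host"
    # Real invocation (assumes passwordless ssh):
    #   ssh "$host" "$KAFKA_HOME/bin/kafka-server-start.sh -daemon $KAFKA_HOME/config/server.properties"
  done
}

start_kafka_cluster
```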
3 Flume installation and configuration
3.1 Flume configuration (installation steps omitted)
Edit the flume-env.sh configuration file; the main change is setting the JAVA_HOME variable
# during Flume startup.
# Enviroment variables can be set here.
export JAVA_HOME=/home/hadoop/install/jdk1.8.0_111.jdk/
3.2 Verify the installation
[hadoop@mini03 flume]$ bin/flume-ng version
Flume 1.7.0
Source code repository: https://git-wip-us.apache.org/repos/asf/flume.git
Revision: 511d868555dd4d16e6ce4fedc72c2d1454546707
Compiled by bessbd on Wed Oct 12 20:51:10 CEST 2016
From source with checksum 0d21b3ffdc55a07e1d08875872c00523
If this output appears, the installation succeeded.
4 Pulling file data into Kafka with Flume and consuming the messages
Create a new Flume config file, flume2kafka.conf
[hadoop@mini03 flume]$ vi conf/flume2kafka.conf
# Define the components of this agent (the agent is named a1)
a1.sources = r1
a1.sinks = k1
a1.channels = c1
# Describe/configure the source
a1.sources.r1.type = exec
a1.sources.r1.command = tail -F /home/hadoop/logs.tsv
a1.sources.r1.shell=/bin/sh -c
# Describe the sink (the Kafka sink is configured below)
# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100
# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
# Configure the Kafka sink
#a1.sinks.k1.type = logger
a1.sinks.k1.type = org.apache.flume.sink.kafka.KafkaSink
a1.sinks.k1.topic = test
a1.sinks.k1.brokerList = mini01:9092
a1.sinks.k1.requiredAcks = 1
a1.sinks.k1.batchSize = 100
Generate data on the fly
[hadoop@mini01 ~]$ test/getlog.sh
16991082028 余建堂 18401456522 杨占昊 46-08-04 15:07:46 1182
19641660102 刘洋 13059125383 郭振君 75-10-14 04:42:31 3926
14361606522 刘优 14692570569 陈猛 33-07-01 13:57:10 1700
17755364600 霍风浩 13059125383 郭振君 90-04-12 14:32:53 5587
15093813308 贾明灿 15060932038 闵强 90-02-11 04:42:22 1416
19641660102 刘洋 18506948961 冀缨菲 25-06-24 06:19:43 2622
15060932038 闵强 13305040991 高永斌 05-05-13 21:10:10 5015
13113007783 孙良明 14692570569 陈猛 94-08-12 03:35:48 3909
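The contents of test/getlog.sh are not shown in the post. A hypothetical stand-in that emits records in the same tab-separated layout (phone, name, phone, name, timestamp, call duration) might look like this, with placeholder names since the real name list is unknown:

```shell
# Hypothetical stand-in for test/getlog.sh: print one random call record.
gen_record() {
  awk 'BEGIN {
    srand()
    # caller, caller name, callee, callee name, yy-mm-dd hh:mm:ss, seconds
    printf "13%09d\tNAME_A\t13%09d\tNAME_B\t%02d-%02d-%02d %02d:%02d:%02d\t%d\n",
      int(rand() * 1e9), int(rand() * 1e9),
      int(rand() * 99), 1 + int(rand() * 12), 1 + int(rand() * 28),
      int(rand() * 24), int(rand() * 60), int(rand() * 60),
      100 + int(rand() * 6000)
  }'
}

# To feed the file that the exec source tails:
#   while true; do gen_record >> /home/hadoop/logs.tsv; sleep 1; done
gen_record
```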
Start Flume
[hadoop@mini01 flume]$ bin/flume-ng agent -c conf -f conf/flume2kafka.conf --name a1 -Dflume.root.logger=INFO,console
Start a Kafka console consumer
[hadoop@mini01 kafka]$ bin/kafka-console-consumer.sh --zookeeper mini01:2181 --topic test --from-beginning
Using the ConsoleConsumer with old consumer is deprecated and will be removed in a future major release. Consider using the new consumer by passing [bootstrap-server] instead of [zookeeper].
13658626467 刘海涛 13288940364 贾鑫瑜 61-12-23 17:35:42 2615
19920594188 段雪鹏 17755364600 霍风浩 74-11-18 17:42:22 6740
17533432302 张文举 14865818526 常天罡 05-05-12 06:43:29 2569
15142556083 赵晓露 18491428393 张苗 51-02-03 20:39:20 2719
13305040991 高永斌 14692570569 陈猛 75-08-08 18:50:56 5506
19641660102 刘洋 14385342683 陈凯 49-09-22 04:50:07 3719