01 Introduction
Running ELK in production with the default configuration requires middleware to collect logs asynchronously. We use Kafka: it provides a unified, high-throughput, low-latency platform for handling data feeds, and it removes the risk of data loss that comes with using Logstash's in-memory queue as a cache.
The official documentation recommends Redis as the middleware, but based on my own tests of collected-data integrity, Kafka is safer for the data than Redis.
This article is cross-posted from my personal WeChat official account, Tianmu Star; feel free to follow it.
First, the software used
Linux: CentOS 7.5.1804
Kafka: kafka_2.11-2.2.0
ZooKeeper: zookeeper-3.4.14
Second, install the software
Prerequisite: Java must be installed; Java 8 is recommended.
1. Install ZooKeeper
We use three hosts (node1, node2, node3) to build the ZooKeeper cluster.
Extract: tar xvf zookeeper-3.4.14.tar.gz
Install: cp -r zookeeper-3.4.14 /usr/local/zookeeper
Rename the sample configuration file zoo_sample.cfg to zoo.cfg.
Modify the configuration file (the same on all three hosts):
# vim /usr/local/zookeeper/conf/zoo.cfg
tickTime=2000 # heartbeat interval in ms, the basic time unit
initLimit=10 # follower init/connect timeout: 10*2000 = 20 seconds
syncLimit=5 # leader-follower response timeout: 5*2000 = 10 seconds
dataDir=/data/zookeeper # data directory; plan this path in advance
dataLogDir=/data/zookeeper/log # transaction log path
clientPort=2181 # port clients use to reach the ZooKeeper server
# In server.A=B:C:D, A is a number identifying which server this is; B is the server's IP/hostname;
# C is the port used to exchange information with the cluster leader;
# D is the port used to elect a new leader when the leader goes down.
server.1=node1:2888:3888
server.2=node2:2888:3888
server.3=node3:2888:3888
Create the ServerID (myid) file
In cluster mode ZooKeeper requires a myid file in the dataDir directory; its content is the value of A from the server.A=B:C:D lines in zoo.cfg.
# All three servers need this; set the value according to server.A in zoo.cfg
echo "1" > /data/zookeeper/myid
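The myid step can be scripted so each node writes its own value; below is a minimal sketch under the dataDir layout above. The `set_myid` helper is my own, not part of ZooKeeper, and `DATADIR` is overridable for testing:

```shell
# Hypothetical helper: write this server's myid under its dataDir.
# DATADIR defaults to the path used in zoo.cfg above.
DATADIR="${DATADIR:-/data/zookeeper}"
set_myid() {
    local id="$1"              # must match a server.<id> entry in zoo.cfg
    mkdir -p "$DATADIR"
    echo "$id" > "$DATADIR/myid"
}
# Run: set_myid 1 on node1, set_myid 2 on node2, set_myid 3 on node3.
```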
Configure environment variables:
Setting these makes starting the service later more convenient.
cat >>/etc/profile <<'EOF'
export ZOOKEEPER_HOME=/usr/local/zookeeper
export PATH=$PATH:$ZOOKEEPER_HOME/bin
EOF
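When appending to /etc/profile with a heredoc, quoting the delimiter (<<'EOF') keeps the shell from expanding $PATH at write time, so expansion happens later when the profile is sourced. A quick demonstration against a temporary file:

```shell
# Demonstration: a quoted heredoc delimiter writes $PATH and
# $ZOOKEEPER_HOME literally instead of expanding them now.
f=$(mktemp)
cat > "$f" <<'EOF'
export PATH=$PATH:$ZOOKEEPER_HOME/bin
EOF
grep -c '\$ZOOKEEPER_HOME' "$f"   # the variable reference is preserved
```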
2. Install Kafka
After installing ZooKeeper, continue by installing Kafka on each host.
Extract: tar xvf kafka_2.11-2.2.0.tgz
Install: cp -r kafka_2.11-2.2.0 /usr/local/kafka
Modify the configuration file
# vim /usr/local/kafka/config/server.properties
broker.id=1 # unique ID within the cluster; set 2 and 3 on the other two machines
listeners=PLAINTEXT://:9092 # listening port
num.network.threads=3 # max threads the broker uses to handle network requests
num.io.threads=8 # threads for disk I/O; ideally more than the number of disks
socket.send.buffer.bytes=102400 # socket send buffer size
socket.receive.buffer.bytes=102400 # socket receive buffer size
socket.request.max.bytes=104857600 # maximum size of a socket request
log.dirs=/tmp/kafka-logs # path where Kafka stores its log (data) segments
num.partitions=1 # default number of partitions per topic
num.recovery.threads.per.data.dir=1 # threads per data dir for log recovery at startup
offsets.topic.replication.factor=1
transaction.state.log.replication.factor=1
transaction.state.log.min.isr=1
log.retention.hours=168 # maximum retention time for data
log.segment.bytes=1073741824 # maximum size of a single log segment file
log.retention.check.interval.ms=300000 # interval for checking segment sizes
zookeeper.connect=node1:2181,node2:2181,node3:2181 # ZooKeeper cluster addresses, comma-separated
zookeeper.connection.timeout.ms=6000 # ZooKeeper connection timeout
group.initial.rebalance.delay.ms=0 # group coordination delay; a value of 3 is suggested for production
delete.topic.enable=true
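Since only broker.id differs between the three hosts, each host's server.properties can be generated from a single template. This is a sketch using a `make_broker_conf` helper of my own, not a Kafka tool:

```shell
# Hypothetical helper: copy a server.properties template,
# replacing the broker.id line with the given ID.
make_broker_conf() {
    local id="$1" template="$2" out="$3"
    sed "s/^broker\.id=.*/broker.id=${id}/" "$template" > "$out"
}
# e.g. make_broker_conf 2 server.properties server-node2.properties
```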
Third, start and test
1. Write the startup scripts
ZooKeeper cluster startup script:
#!/bin/bash
# Author: Gordon Luo
# Date: 2019-4-25 11:30
# Function: start/stop/query the zookeeper cluster
#### zookeeper_cluster.sh ####
APP_HOME="/usr/local/zookeeper"
ZKS="node1 node2 node3"
APP_NAME="zookeeper"
if [ $# -ne 1 ];then
    echo "Usage: zookeeper_cluster.sh {start|stop|status}"
    exit 1
fi
for i in ${ZKS};do
    echo "Run ${APP_NAME} $1 on ${i}"
    ssh ${i} "source /etc/profile; sh ${APP_HOME}/bin/zkServer.sh $1"
    if [ $? -eq 0 ];then
        echo "${APP_NAME} $1 on ${i} is ok"
    fi
done
echo "ALL ${ZKS} are $1"
exit 0
Kafka cluster startup script:
#!/bin/bash
# Author: Gordon Luo
# Date: 2019-4-25 15:32
# Function: start/stop the Kafka cluster
#### kafka_cluster.sh ####
APP_HOME="/usr/local/kafka"
KAFKAS="node1 node2 node3"
APP_NAME="kafka"
function kafka_start()
{
    for i in ${KAFKAS};do
        ssh ${i} "source /etc/profile;${APP_HOME}/bin/kafka-server-start.sh -daemon ${APP_HOME}/config/server.properties"
    done
}
function kafka_stop()
{
    for i in ${KAFKAS};do
        ssh ${i} "source /etc/profile;${APP_HOME}/bin/kafka-server-stop.sh"
    done
}
case $1 in
    start)
        kafka_start
        ;;
    stop)
        kafka_stop
        ;;
    *)
        echo "Usage: kafka_cluster.sh {start|stop}"
esac
2. Start the services
Start the ZooKeeper cluster first, then Kafka:
sh zookeeper_cluster.sh start
sh kafka_cluster.sh start
Stop the clusters (stop Kafka before ZooKeeper):
sh kafka_cluster.sh stop
sh zookeeper_cluster.sh stop
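After starting, each ZooKeeper node should report its role ("leader" or "follower") via zkServer.sh status. The `zk_mode` filter below is a small helper of my own for scripting that check, not part of ZooKeeper:

```shell
# Hypothetical filter: extract the "Mode:" value (leader/follower/standalone)
# from `zkServer.sh status` output read on stdin.
zk_mode() {
    sed -n 's/^Mode: //p'
}
# Usage: ssh node1 "source /etc/profile; zkServer.sh status" 2>&1 | zk_mode
```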
3. Test Kafka production and consumption
Create a topic:
kafka-topics.sh --create --zookeeper node1:2181,node2:2181,node3:2181 --replication-factor 3 --partitions 2 --topic hadoop
Parameter notes:
--zookeeper: the ZooKeeper servers to connect to
--replication-factor: number of replicas
--partitions: number of partitions
--topic: topic name
List the topics that have been created:
kafka-topics.sh --list --zookeeper node1:2181
View a topic's details:
kafka-topics.sh --describe --zookeeper node1:2181 --topic hadoop
Start a console producer for the topic:
kafka-console-producer.sh --broker-list node1:9092,node2:9092,node3:9092 --topic hadoop
# After the prompt below appears, type messages:
>Hello World
>welcome to kafka
Start a console consumer for the topic:
kafka-console-consumer.sh --bootstrap-server node1:9092 --topic hadoop --from-beginning
# --from-beginning: consume from the earliest offset; without this flag, only newly produced messages are consumed.
Hello World
welcome to kafka
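Because delete.topic.enable=true was set in server.properties, the test topic can be removed once you are done (this needs the live cluster from the steps above):

```shell
# Remove the test topic; allowed because delete.topic.enable=true was set.
kafka-topics.sh --delete --zookeeper node1:2181 --topic hadoop
```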
Complete