CentOS 7 install CDH 5.13 notes, part 2:
Installing Java, ZooKeeper, Kafka
Previous post: http://blog.csdn.net/ubuntu64fan/article/details/78791574
------------------------------------------------------------------------
                            | n1 | n2 | n3 | pkgtype
------------------------------------------------------------------------
jdk-8u152-linux-x64.tar.gz  | Y  | Y  | Y  |
zookeeper-server            | Y  | Y  | Y  | yum
kafka_2.11-1.0.0.tgz        | Y  | Y  | Y  |
hadoop-hdfs-namenode        | Y  | Y  | N  | yum
hadoop-hdfs-datanode        | Y  | Y  | Y  | yum
------------------------------------------------------------------------
2 Install Java on every machine
# wget ftp://pub:[email protected]/tarball/jdk-8u152-linux-x64.tar.gz
# wget ftp://pub:[email protected]/tarball/kafka_2.11-1.0.0.tgz
Extract them according to the directory layout assumed by the configuration below, then edit /etc/profile:
export ZK_HOME=/usr/lib/zookeeper
export KAFKA_HOME=/usr/lib/kafka
export JAVA_HOME=/usr/local/java/jdk1.8.0_152
export JRE_HOME=$JAVA_HOME/jre
export PATH=$JAVA_HOME/bin:$JRE_HOME/bin:$ZK_HOME/bin:$KAFKA_HOME/bin:$PATH
export CLASSPATH=.:$JAVA_HOME/lib/tools.jar:$JAVA_HOME/lib/dt.jar
Apply the changes:
# source /etc/profile
# ln -s /usr/local/java/jdk1.8.0_152/bin/java /usr/bin/java   # essential -- without it zookeeper-server cannot start
3 Configure the CDH 5 repo on every node
vi /etc/yum.repos.d/cloudera-cdh5.repo
[cloudera-cdh5]
# Packages for Cloudera's Distribution for Hadoop, Version 5, on RedHat or CentOS 6,7 x86_64
name = Cloudera's Distribution for Hadoop, Version 5
enabled = 1
gpgcheck = 0
baseurl = ftp://pub:[email protected]/cloudera-5.13/cdh/5/
gpgkey = ftp://pub:[email protected]/cloudera-5.13/cdh/RPM-GPG-KEY-cloudera
vi /etc/yum.repos.d/cloudera-gplextras5.repo
[cloudera-gplextras5]
# Packages for Cloudera's GPLExtras, Version 5, on RedHat or CentOS 6 x86_64
name = Cloudera's GPLExtras, Version 5
enabled = 1
gpgcheck = 0
baseurl = ftp://pub:[email protected]/cloudera-5.13/gplextras/5/
gpgkey = ftp://pub:[email protected]/cloudera-5.13/gplextras/RPM-GPG-KEY-cloudera
4 Install ZooKeeper on every machine
Run the following installation commands:
mkdir -p /data/zookeeper        # directory for ZooKeeper data
mkdir -p /data/log/zookeeper    # directory for ZooKeeper logs
yum install -y zookeeper-server # install
vi /etc/zookeeper/conf/zoo.cfg  # edit the config file
The contents of zoo.cfg:
########################################################################
# zoo.cfg - config file for zookeeper-server
########################################################################
maxClientCnxns=50
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/data/zookeeper
clientPort=2181
dataLogDir=/data/log/zookeeper
# cluster:
server.1=cent7-n1.pepstack.com:2888:3888
server.2=cent7-n2.pepstack.com:2888:3888
server.3=cent7-n3.pepstack.com:2888:3888
Then run the initialization command (replace the ? in --myid=? with a number 1, 2, 3, ...; on n1 it is --myid=1, and so on):
chown zookeeper:zookeeper /data/zookeeper
chown zookeeper:zookeeper /data/log/zookeeper
service zookeeper-server init --myid=?
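Getting the ? right on each of the three nodes is easy to fumble. Since the nodes here follow the cent7-n<ID>.pepstack.com naming, the myid can be derived from the hostname; a minimal sketch (the fqdn value is hard-coded for illustration -- in real use take it from `hostname -f`):

```shell
# derive the ZooKeeper myid from the node's FQDN
fqdn=cent7-n2.pepstack.com   # illustration; real use: fqdn=$(hostname -f)
myid=${fqdn#cent7-n}         # strip the prefix -> "2.pepstack.com"
myid=${myid%%.*}             # strip the domain -> "2"
echo "$myid"                 # -> 2
# then: service zookeeper-server init --myid=$myid
```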
Then start the service:

service zookeeper-server start
service zookeeper-server status

If it reports that it cannot start (zookeeper-server is not running.), make sure /usr/bin/java is callable. If it is not, add it as follows:
# ln -s /usr/local/java/jdk1.8.0_152/bin/java /usr/bin/java
# /usr/bin/java -version
java version "1.8.0_152"
Java(TM) SE Runtime Environment (build 1.8.0_152-b16)
Java HotSpot(TM) 64-Bit Server VM (build 25.152-b16, mixed mode)
5 Install daemontools on every machine
yum install gcc tcsh     # gcc and csh are required
wget ftp://pub:[email protected]/prepare/daemontools-0.76.tar.gz
tar -zxf daemontools-0.76.tar.gz
mv daemontools-0.76 /var/lib/    # important: /var/lib/daemontools-0.76 must not be deleted after installation
chown root:root -R /var/lib/daemontools-0.76
chmod 755 /var/lib/daemontools-0.76
cd /var/lib/daemontools-0.76/    # you must cd into this directory to build
./package/install
sed -i '$acsh -cf \x27\/command\/svscanboot &\x27' /etc/rc.d/rc.local
The last command appends one line to /etc/rc.d/rc.local:
csh -cf '/command/svscanboot &'
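The escaping in that sed one-liner (`\x27` is GNU sed's hex escape for a single quote) is easy to get wrong, so it can be dry-run against a scratch file first:

```shell
# try the append on a scratch file instead of the real rc.local
rc=$(mktemp)
printf '#!/bin/bash\ntouch /var/lock/subsys/local\n' > "$rc"
sed -i '$acsh -cf \x27\/command\/svscanboot &\x27' "$rc"
appended=$(tail -n 1 "$rc")
echo "$appended"
rm -f "$rc"
```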
Enable it:
sh /etc/rc.local    or    systemctl start rc-local
Check that the service is running:
ps -ef | grep svscanboot
root     13716     1  0 04:44 pts/1    00:00:00 /bin/sh /command/svscanboot
Make it take effect after a reboot:
chmod +x /etc/rc.d/rc.local    # required -- otherwise svscanboot will not start after a reboot
Fixing the infinite wait at shutdown caused by this service:
A stop job is running for /etc/rc.d/rc.local Compatibility (elapsed time / no limit)
http://blog.csdn.net/zhanzc1/article/details/78680439
Stop the rc-local service (the commands below will also kill the svscanboot and supervise processes):
# systemctl stop rc-local
# systemctl daemon-reload
If any failure prevents the service from stopping, the system waits forever instead of shutting down -- the infinite-wait-at-shutdown problem. Fix it as follows:
Edit:
# vi /usr/lib/systemd/system/rc-local.service
Add the following to the [Service] section:
TimeoutStopSec=5
The default for this unit is 0 (no limit). With TimeoutStopSec set, when stopping the service systemd sends signal 9 to whatever processes remain once TimeoutStopSec has elapsed, rather than waiting forever. That solves the problem.
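Instead of editing the packaged unit file (which a package update may overwrite), the same setting can go into a drop-in override; a sketch, assuming `systemctl edit rc-local` is available to create the file:

```ini
# /etc/systemd/system/rc-local.service.d/override.conf
[Service]
TimeoutStopSec=5
```

After either change, run `systemctl daemon-reload` so systemd picks it up.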
Keeping zookeeper-server running forever
Create the file /var/lib/zookeeper/zookeeper-server/run:
mkdir /var/lib/zookeeper/zookeeper-server
vi /var/lib/zookeeper/zookeeper-server/run

The run file is just two lines:
#!/bin/sh
exec service zookeeper-server start

Then run:
chmod a+x /var/lib/zookeeper/zookeeper-server/run
ln -s /var/lib/zookeeper/zookeeper-server /service/zookeeper-server
With that, zookeeper-server will run forever (even across machine reboots).
Test:
# jps
4954 QuorumPeerMain              # ZooKeeper process id and name
# service zookeeper-server stop  # stopping zookeeper-server has no effect -- that is expected
# jps
1932 QuorumPeerMain              # supervise has already restarted it under a new pid
From a client machine (not one of the zookeeper-server nodes), test connectivity:
telnet 192.168.122.223 2181
telnet 192.168.122.224 2181
telnet 192.168.122.225 2181
If they do not connect, stop the firewall on n1, n2, n3:
systemctl stop firewalld.service
systemctl disable firewalld.service
Also disable SELinux:
setenforce 0
vi /etc/selinux/config    # set:
SELINUX=disabled
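The hand edit of /etc/selinux/config can also be scripted with sed; a sketch, dry-run here on a scratch copy instead of the real file (the anchored pattern avoids touching SELINUXTYPE):

```shell
# flip SELINUX= to disabled on a scratch copy of the config
cfg=$(mktemp)
printf 'SELINUX=enforcing\nSELINUXTYPE=targeted\n' > "$cfg"
sed -i 's/^SELINUX=.*/SELINUX=disabled/' "$cfg"
result=$(grep '^SELINUX=' "$cfg")
echo "$result"    # -> SELINUX=disabled
rm -f "$cfg"
```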
6 Install Kafka on every machine
Download and extract to /usr/local/kafka_2.11-1.0.0:
wget ftp://pub:[email protected]/tarball/kafka_2.11-1.0.0.tgz
tar -zxf kafka_2.11-1.0.0.tgz -C /usr/local/
ln -s /usr/local/kafka_2.11-1.0.0 /usr/lib/kafka
Edit /usr/lib/kafka/config/server.properties as follows (on cent7-n1.pepstack.com):
########################################################################
# server.properties for kafka server
# see kafka.server.KafkaConfig for additional details and defaults
########################################################################

############################# Server Basics #############################
broker.id=1

############################# Socket Server Settings #############################
host.name=cent7-n1.pepstack.com
advertised.host.name=cent7-n1.pepstack.com
advertised.port=9092
listeners=PLAINTEXT://cent7-n1.pepstack.com:9092
advertised.listeners=PLAINTEXT://cent7-n1.pepstack.com:9092
#listener.security.protocol.map=PLAINTEXT:PLAINTEXT,SSL:SSL,SASL_PLAINTEXT:SASL_PLAINTEXT,SASL_SSL:SASL_SSL
num.network.threads=3
num.io.threads=8
socket.send.buffer.bytes=102400
socket.receive.buffer.bytes=102400
socket.request.max.bytes=104857600

############################# Log Basics #############################
log.dirs=/data/log/kafka
num.partitions=1
num.recovery.threads.per.data.dir=1

############################# Internal Topic Settings #############################
offsets.topic.replication.factor=1
transaction.state.log.replication.factor=1
transaction.state.log.min.isr=1

############################# Log Flush Policy #############################
#log.flush.interval.messages=10000
#log.flush.interval.ms=1000

############################# Log Retention Policy #############################
log.retention.hours=168
#log.retention.bytes=1073741824
log.segment.bytes=1073741824
log.retention.check.interval.ms=300000

############################# Zookeeper #############################
zookeeper.connect=cent7-n1.pepstack.com:2181,cent7-n2.pepstack.com:2181,cent7-n3.pepstack.com:2181
zookeeper.connection.timeout.ms=6000

############################# Group Coordinator Settings #############################
group.initial.rebalance.delay.ms=0
For n2 (?=2) and n3 (?=3), only the following lines change -- replace ? with 2 or 3:
broker.id=?
host.name=cent7-n?.pepstack.com
advertised.host.name=cent7-n?.pepstack.com
advertised.port=9092
listeners=PLAINTEXT://cent7-n?.pepstack.com:9092
advertised.listeners=PLAINTEXT://cent7-n?.pepstack.com:9092
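Editing six lines by hand on each node invites typos; the node-specific lines can be generated by substituting the ? placeholder with sed. A sketch (the template variable simply repeats the lines above):

```shell
# generate the node-specific settings for n3 by replacing every '?' with 3
n=3
template='broker.id=?
host.name=cent7-n?.pepstack.com
advertised.host.name=cent7-n?.pepstack.com
advertised.port=9092
listeners=PLAINTEXT://cent7-n?.pepstack.com:9092
advertised.listeners=PLAINTEXT://cent7-n?.pepstack.com:9092'
out=$(printf '%s\n' "$template" | sed "s/?/$n/g")
printf '%s\n' "$out"
```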
Kafka stores its data under log.dirs=/data/log/kafka and writes its server logs under /var/log/kafka, so create both directories on every machine:
mkdir -p /data/log/kafka
mkdir -p /var/log/kafka
Place a startup script ($KAFKA_HOME/kafka-server.sh) on every machine. Its full contents:
#!/bin/sh
#
# chkconfig: 345 99 01
# description: Kafka
#
# File : Kafka
#
# Description: kafka-server status|restart|start|stop|hardstop
########################################################################
_file=$(readlink -f $0)
_cdir=$(dirname $_file)
_name=$(basename $_file)

source /etc/rc.d/init.d/functions

KAFKA_HOME=$KAFKA_HOME    # exported by /etc/profile
KAFKA_USER=root

# See how we were called.
case "$1" in
  start)
    echo -n "Starting Kafka ..."
    /sbin/runuser $KAFKA_USER -c "nohup $KAFKA_HOME/bin/kafka-server-start.sh $KAFKA_HOME/config/server.properties >/var/log/kafka/server.out 2>/var/log/kafka/server.err &"
    echo " done."
    exit 0
    ;;
  stop)
    echo -n "Stopping Kafka ..."
    /sbin/runuser $KAFKA_USER -c "ps -ef | grep kafka.Kafka | grep -v grep | awk '{print \$2}' | xargs kill"
    echo " done."
    exit 0
    ;;
  hardstop)
    echo -n "Stopping (hard) Kafka ..."
    /sbin/runuser $KAFKA_USER -c "ps -ef | grep kafka.Kafka | grep -v grep | awk '{print \$2}' | xargs kill -9"
    echo " done."
    exit 0
    ;;
  status)
    c_pid=`ps -ef | grep kafka.Kafka | grep -v grep | awk '{print $2}'`
    if [ "$c_pid" = "" ] ; then
      echo "Stopped"
      exit 3
    else
      echo "Running $c_pid"
      exit 0
    fi
    ;;
  restart)
    $0 stop
    $0 start
    ;;
  *)
    echo "Usage: $_name {start|stop|hardstop|status|restart}"
    exit 1
    ;;
esac
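The stop/status branches of the script all rely on the same ps | grep | awk pipeline to find the broker's pid: column 2 of `ps -ef`, with the grep process itself filtered out. Its behavior can be checked on a canned ps line:

```shell
# demonstrate the pid-extraction pipeline on a sample 'ps -ef' line
sample='root      4321     1  1 04:44 ?        00:01:02 /usr/local/java/jdk1.8.0_152/bin/java -Xmx1G kafka.Kafka /usr/lib/kafka/config/server.properties'
pid=$(printf '%s\n' "$sample" | grep kafka.Kafka | grep -v grep | awk '{print $2}')
echo "$pid"    # -> 4321
```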
7 Finally, start Kafka on every machine
kafka-server.sh start
kafka-server.sh status
If status reports Stopped, check:
1) /etc/hosts is configured correctly on every node
2) the Kafka directories exist on every node (log.dirs=/data/log/kafka and /var/log/kafka)
3) zookeeper-server is running
4) ZooKeeper's port 2181 is reachable (telnet ip 2181)
5) firewalld.service has been stopped
Also check the log file: /var/log/kafka/server.out
Once Kafka starts correctly, create a topic with the following commands:
# kafka-topics.sh --list --zookeeper cent7-n2.pepstack.com:2181
# kafka-topics.sh --create --topic test --replication-factor 3 --partitions 1 --zookeeper cent7-n2.pepstack.com:2181
Created topic "test".
# kafka-topics.sh --list --zookeeper cent7-n2.pepstack.com:2181
test
To run Kafka as a highly available service, put it under daemontools supervision:
mkdir -p /var/lib/kafka/kafka-server
vi /var/lib/kafka/kafka-server/run    # run contains the two lines below:

#!/bin/bash
/usr/lib/kafka/bin/kafka-server-start.sh /usr/lib/kafka/config/server.properties

chmod +x /var/lib/kafka/kafka-server/run
ln -s /var/lib/kafka/kafka-server /service/kafka-server
Kafka will now never stay down, and it starts automatically after a reboot.
8 Test Kafka
On n1, open a terminal and send messages to Kafka (n1 acts as the producer):
$ kafka-console-producer.sh --broker-list cent7-n1.pepstack.com:9092 --sync --topic test
Type in the producer terminal: Hello Kafka
On n3, open a terminal and consume the messages (n3 acts as the consumer):
$ kafka-console-consumer.sh --bootstrap-server cent7-n2.pepstack.com:9092 --topic test --from-beginning
The consumer terminal prints: Hello Kafka
Success!
9 Install kafka-python
Set up the Python environment for Kafka (six-1.9.0 was already installed along with salt-minion, so in that case it can be skipped):
wget ftp://pub:[email protected]/prepare/six-1.9.0.tar.gz
wget ftp://pub:[email protected]/tarball/kafka-python-1.3.5.tgz
tar -zxf six-1.9.0.tar.gz
tar -zxf kafka-python-1.3.5.tgz
cd six-1.9.0
python setup.py build
sudo python setup.py install
cd ..
cd kafka-python-1.3.5
python setup.py build
sudo python setup.py install
cd ..
python -c "import kafka"