1. Download
Download links:
http://kafka.apache.org/downloads.html
http://mirrors.hust.edu.cn/apache/
2. Installation prerequisites (ZooKeeper installation)
Refer to http://www.cnblogs.com/qingyunzong/p/8634335.html#_label4_0
3. Installation
The version used here is kafka_2.11-0.8.2.0.tgz
3.1 Upload and decompress
[hadoop@hadoop1 ~]$ tar -zxvf kafka_2.11-0.8.2.0.tgz -C apps
[hadoop@hadoop1 ~]$ cd apps/
[hadoop@hadoop1 apps]$ ln -s kafka_2.11-0.8.2.0/ kafka
3.2 Modify the configuration files
Enter the kafka installation configuration directory
[hadoop@hadoop1 ~]$ cd apps/kafka/config/
The main file we care about is server.properties, which we can find in this directory.
Listing the directory shows many files, including zookeeper.properties: Kafka can start the ZooKeeper it ships with from that file, but an independent ZooKeeper cluster is recommended.
server.properties (broker.id and host.name must be different on each node):
# The broker's unique id in the cluster, analogous to ZooKeeper's myid
broker.id=0
# Port on which Kafka serves clients; 9092 by default
port=9092
# Commented out by default; 0.8.1 had a bug with DNS resolution and request
# failures, so set the hostname explicitly
host.name=hadoop1
# Number of threads the broker uses for network processing
num.network.threads=3
# Number of threads the broker uses for disk I/O
num.io.threads=8
# Socket send buffer size: data is not sent one message at a time but buffered
# and sent once a certain size is reached, which improves performance
socket.send.buffer.bytes=102400
# Socket receive buffer size: received data is buffered until a certain size
# is reached, then flushed to disk
socket.receive.buffer.bytes=102400
# Maximum size of a request sent to or received from Kafka; this value must
# not exceed the JVM heap size
socket.request.max.bytes=104857600
# Directories where messages are stored, as a comma-separated list.
# num.io.threads above should not be smaller than the number of directories.
# If multiple directories are configured, a newly created partition is placed
# in whichever directory currently holds the fewest partitions.
log.dirs=/home/hadoop/log/kafka-logs
# Default number of partitions; a topic gets 1 partition unless specified
num.partitions=1
# Number of threads per data directory used for log recovery at startup
num.recovery.threads.per.data.dir=1
# Default message retention time: 168 hours = 7 days
log.retention.hours=168
# Kafka appends messages to segment files; when a segment exceeds this size
# (1 GB), a new segment file is created
log.segment.bytes=1073741824
# How often (in ms) to check whether segments have exceeded the retention time
log.retention.check.interval.ms=300000
# Whether to enable log compaction; generally not needed, but enabling it can
# improve performance for some workloads
log.cleaner.enable=false
# ZooKeeper connection string
zookeeper.connect=192.168.123.102:2181,192.168.123.103:2181,192.168.123.104:2181
# ZooKeeper connection timeout in ms
zookeeper.connection.timeout.ms=6000
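Only broker.id and host.name differ between nodes; everything else in server.properties is identical. A minimal sketch of the per-node edit, demonstrated on a stand-in copy of the file (on a real node, edit $KAFKA_HOME/config/server.properties with that node's own values; ID=2/HOST=hadoop3 below are just example values for the third broker):

```shell
# Stand-in file; on a real node this is $KAFKA_HOME/config/server.properties
printf 'broker.id=0\nhost.name=hadoop1\n' > /tmp/server.properties

# Example values for the node being configured (here: the third broker)
ID=2
HOST=hadoop3

# Rewrite the two node-specific keys in place
sed -i "s/^broker.id=.*/broker.id=${ID}/" /tmp/server.properties
sed -i "s/^host.name=.*/host.name=${HOST}/" /tmp/server.properties

cat /tmp/server.properties
```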
producer.properties
metadata.broker.list=192.168.123.102:9092,192.168.123.103:9092,192.168.123.104:9092
consumer.properties
zookeeper.connect=192.168.123.102:2181,192.168.123.103:2181,192.168.123.104:2181
3.3 Distribute the Kafka installation package to the other nodes
[hadoop@hadoop1 apps]$ scp -r kafka_2.11-0.8.2.0/ hadoop2:$PWD
[hadoop@hadoop1 apps]$ scp -r kafka_2.11-0.8.2.0/ hadoop3:$PWD
[hadoop@hadoop1 apps]$ scp -r kafka_2.11-0.8.2.0/ hadoop4:$PWD
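The three scp commands can also be collapsed into one loop. A sketch; the echo makes it a dry run that only prints each command (remove it to actually copy):

```shell
# Dry run: print the scp command for each target node; drop 'echo' to copy for real
for node in hadoop2 hadoop3 hadoop4; do
  echo scp -r kafka_2.11-0.8.2.0/ "$node:$PWD"
done
```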
3.4 Create the soft link (on hadoop2, hadoop3, and hadoop4)
[hadoop@hadoop1 apps]$ ln -s kafka_2.11-0.8.2.0/ kafka
3.5 Modify environment variables
[hadoop@hadoop1 ~]$ vi .bashrc
#Kafka
export KAFKA_HOME=/home/hadoop/apps/kafka
export PATH=$PATH:$KAFKA_HOME/bin
Save the file, then make the changes take effect immediately:
[hadoop@hadoop1 ~]$ source ~/.bashrc
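A quick way to confirm the two lines do what we expect, demonstrated against a stand-in rc file (on the real machines this is ~/.bashrc as shown above):

```shell
RC=/tmp/bashrc-demo          # stand-in for ~/.bashrc
cat > "$RC" <<'EOF'
#Kafka
export KAFKA_HOME=/home/hadoop/apps/kafka
export PATH=$PATH:$KAFKA_HOME/bin
EOF
. "$RC"                      # same effect as 'source ~/.bashrc'
echo "$KAFKA_HOME"
```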
4. Start
4.1 Start the ZooKeeper cluster first
Execute on every ZooKeeper node:
[hadoop@hadoop1 ~]$ zkServer.sh start
4.2 Start the Kafka cluster service
[hadoop@hadoop1 kafka]$ bin/kafka-server-start.sh config/server.properties
Run the same command on hadoop2, hadoop3, and hadoop4 as well.
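kafka-server-start.sh runs in the foreground and ties up the terminal; starting it under nohup on each node avoids that. A dry-run sketch, assuming passwordless ssh between the nodes and the ~/apps/kafka soft link from section 3 (the echo only prints each command; remove it to actually start the brokers):

```shell
# Dry run: print the remote background-start command for every broker node
for node in hadoop1 hadoop2 hadoop3 hadoop4; do
  echo ssh "$node" 'cd ~/apps/kafka && nohup bin/kafka-server-start.sh config/server.properties > kafka.log 2>&1 &'
done
```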
4.3 Create a topic
[hadoop@hadoop1 kafka]$ bin/kafka-topics.sh --create --zookeeper hadoop1:2181 --replication-factor 3 --partitions 3 --topic topic2
4.4 View topic replica information
[hadoop@hadoop1 kafka]$ bin/kafka-topics.sh --describe --zookeeper hadoop1:2181 --topic topic2
4.5 List the created topics
[hadoop@hadoop1 kafka]$ bin/kafka-topics.sh --list --zookeeper hadoop1:2181
4.6 The producer sends messages
[hadoop@hadoop1 kafka]$ bin/kafka-console-producer.sh --broker-list hadoop1:9092 --topic topic2
Each line typed at the producer prompt on hadoop1 is sent as a message to topic2.
4.7 Consumers consume messages
Consume messages on hadoop2
[hadoop@hadoop2 kafka]$ bin/kafka-console-consumer.sh --zookeeper hadoop1:2181 --from-beginning --topic topic2