Kafka installation based on Hadoop and Java API testing

1. Experimental purpose

1. Understand the functions of each component of Kafka

2. Master the installation and deployment of Kafka

3. Understand the use of Kafka Java API

4. Understand the Producer API and Consumer API in the Kafka Java API

2. Experimental requirements

1. Kafka installation and deployment. Kafka's installation depends on Scala and ZooKeeper, so you need to install Scala and ZooKeeper first. Then, on the environment where Scala and ZooKeeper are already installed, install and deploy Kafka.

2. Use the Kafka Java API. Use a simple Java API to simulate Kafka's producer and consumer: the producer generates content in a while loop and passes it to Kafka, and the consumer reads the content from Kafka and outputs it to the Console interface.

3. (Optional) Install and deploy the Kafka environment on your computer.

3. Experimental process and results

(1) Kafka installation and deployment

1. First, create a new /data/kafka1 directory locally on Linux to store the files required for the experiment.

1. mkdir -p /data/kafka1

Switch to the /data/kafka1 directory and use the wget command to download the required installation packages scala-2.10.4.tgz, kafka_2.10-0.8.2.2.tgz and zookeeper-3.4.5-cdh5.4.5.tar.gz.

1. cd /data/kafka1

2. wget http://172.16.103.12:60000/allfiles/kafka1/scala-2.10.4.tgz

3. wget http://172.16.103.12:60000/allfiles/kafka1/kafka_2.10-0.8.2.2.tgz

4. wget http://172.16.103.12:60000/allfiles/kafka1/zookeeper-3.4.5-cdh5.4.5.tar.gz

2. Install Scala.

Switch to the /data/kafka1 directory, extract the Scala installation package scala-2.10.4.tgz to the /apps directory, and rename the extracted directory to scala.

1. cd /data/kafka1

2. tar -xzvf /data/kafka1/scala-2.10.4.tgz -C /apps/

3. cd /apps

4. mv /apps/scala-2.10.4/ /apps/scala

Use vim to open the user environment variables.

1. sudo vim ~/.bashrc

Append the following Scala path information to the user environment variables.

1. #scala 

2. export SCALA_HOME=/apps/scala

3. export PATH=$SCALA_HOME/bin:$PATH

Execute the source command to make the environment variables take effect.

1. source ~/.bashrc
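
To confirm that Scala is now on the PATH, you can optionally check the version (the exact output depends on the build):

1. scala -version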

3. Switch to the /data/kafka1 directory, extract the Kafka package kafka_2.10-0.8.2.2.tgz to the /apps directory, and rename the extracted directory to kafka.

1. cd /data/kafka1

2. tar -xzvf /data/kafka1/kafka_2.10-0.8.2.2.tgz -C /apps/

3. cd /apps

4. mv /apps/kafka_2.10-0.8.2.2/ /apps/kafka

Use vim to open the user environment variables.

1. sudo vim ~/.bashrc

Append the following Kafka path information to the user environment variable.

1. #kafka 

2. export KAFKA_HOME=/apps/kafka

3. export PATH=$KAFKA_HOME/bin:$PATH

Execute the source command to make the environment variables take effect.

1. source ~/.bashrc

4. Since Kafka stores some of its data in ZooKeeper, you need to either install a separate ZooKeeper, or use the ZooKeeper program that comes with the Kafka installation package.

First, let's demonstrate the use of an external ZooKeeper.

Extract zookeeper-3.4.5-cdh5.4.5.tar.gz in the /data/kafka1 directory to the /apps directory, and rename the extracted directory to zookeeper.

1. cd /data/kafka1

2. tar -xzvf /data/kafka1/zookeeper-3.4.5-cdh5.4.5.tar.gz -C /apps/

3. cd /apps

4. mv /apps/zookeeper-3.4.5-cdh5.4.5/ /apps/zookeeper

Use vim to open the user environment variables.

1. sudo vim ~/.bashrc

Append the following ZooKeeper path information to the user environment variables.

1. #zookeeper 

2. export ZOOKEEPER_HOME=/apps/zookeeper

3. export PATH=$ZOOKEEPER_HOME/bin:$PATH

Execute the source command to make the environment variables take effect.

1. source ~/.bashrc

Modify ZooKeeper's configuration file to run ZooKeeper in stand-alone mode.

Switch to /apps/zookeeper/conf, the directory where the ZooKeeper configuration files are located, and rename zoo_sample.cfg to zoo.cfg.

1. cd /apps/zookeeper/conf/

2. mv /apps/zookeeper/conf/zoo_sample.cfg /apps/zookeeper/conf/zoo.cfg

Use vim to open the zoo.cfg file and modify the dataDir entry.

1. vim zoo.cfg

From:

1. dataDir=/tmp/zookeeper

Change to:

1. dataDir=/data/tmp/zookeeper-outkafka/data
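
For reference, after this change a minimal stand-alone zoo.cfg looks roughly like the following sketch; the values other than dataDir are the defaults shipped in zoo_sample.cfg, so adjust them to your environment:

1. tickTime=2000
2. initLimit=10
3. syncLimit=5
4. dataDir=/data/tmp/zookeeper-outkafka/data
5. clientPort=2181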

The /data/tmp/zookeeper-outkafka/data directory here needs to be created in advance.

1. mkdir -p /data/tmp/zookeeper-outkafka/data

Start ZooKeeper and check its running status.

1. cd /apps/zookeeper/bin

2. ./zkServer.sh start

3. ./zkServer.sh status
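
If ZooKeeper started correctly, the status command should report stand-alone mode; the output typically looks something like this (the config path varies by installation):

1. Using config: /apps/zookeeper/bin/../conf/zoo.cfg
2. Mode: standalone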

Shut down ZooKeeper.

1. cd /apps/zookeeper/bin

2. ./zkServer.sh stop

5. Next, to use Kafka's built-in ZooKeeper, switch to the /apps/kafka/config directory.

1. cd /apps/kafka/config

This directory contains zookeeper.properties, a configuration file that serves a similar purpose to ZooKeeper's zoo.cfg. Use vim to open the zookeeper.properties configuration file.

1. vim zookeeper.properties

Change the dataDir directory to the /data/tmp/zookeeper-inkafka/data directory.

1. dataDir=/data/tmp/zookeeper-inkafka/data

The /data/tmp/zookeeper-inkafka/data directory here must be created in advance.

1. mkdir -p /data/tmp/zookeeper-inkafka/data

Next, start the ZooKeeper service. Switch to the /apps/kafka directory; the ZooKeeper startup script is located in Kafka's bin directory. (Press Ctrl+C when you want to exit it.)

1. cd /apps/kafka

2. bin/zookeeper-server-start.sh config/zookeeper.properties &

The trailing ampersand runs zookeeper-server-start.sh in the background. Enter jps to view the ZooKeeper process QuorumPeerMain:

1. jps

The output looks like:

1. zhangyu@8461bfd6a537:/apps/kafka$ jps

2. 375 Jps

3. 293 QuorumPeerMain

4. zhangyu@8461bfd6a537:/apps/kafka$

Next, stop the ZooKeeper process:

1. cd /apps/kafka

2. bin/zookeeper-server-stop.sh stop

6. You can choose either of the above two ways of using ZooKeeper according to your needs. In the follow-up lessons, we will use the external ZooKeeper by default to manage Kafka's data.

At this point Kafka has been installed.

Next, test Kafka to see if it can run normally.

7. Switch to the /apps/zookeeper directory and start the ZooKeeper service.

1. cd /apps/zookeeper

2. bin/zkServer.sh start

8. Switch to the /apps/kafka/config directory, where the Kafka-related configuration files are placed.

Use vim to open server.properties, the configuration file of the Kafka service.

1. cd /apps/kafka/config

2. vim server.properties

The configuration items in the server.properties file include: basic server configuration, socket server settings, log configuration, log flush policy, log retention policy, and ZooKeeper configuration.

The basic server configuration mainly includes the ID (number) of the current node.

The ZooKeeper configuration includes the IP and port number of the ZooKeeper service.

We modify the value of the zookeeper.connect item to:

1. zookeeper.connect=localhost:2181

The IP and port here are those the ZooKeeper service uses to send and receive messages. The IP must be the IP of the ZooKeeper service (we set it to localhost), and the port must be consistent with the clientPort in zoo.cfg under /apps/zookeeper/conf.
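
For reference, the server.properties entries that matter most for this stand-alone setup look roughly like the sketch below; the first three values are the stock 0.8.2 defaults, so treat this as an illustration rather than a complete file:

1. broker.id=0
2. port=9092
3. log.dirs=/tmp/kafka-logs
4. zookeeper.connect=localhost:2181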

9. Switch to the /apps/kafka directory and start the Kafka service. When the Kafka service starts, it reads the server.properties file in the Kafka configuration directory.

1. cd /apps/kafka

2. bin/kafka-server-start.sh config/server.properties &

This starts the Kafka server and runs it in the background.
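
As with ZooKeeper, you can optionally confirm that the broker is up with jps; the broker process appears under the name Kafka:

1. jps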

10. Open another window and call the kafka-topics.sh script in the /apps/kafka/bin directory to create a topic.

1. cd /apps/kafka

2. bin/kafka-topics.sh \

3. --create \

4. --zookeeper localhost:2181 \

5. --replication-factor 1 \

6. --topic sayaword \

7. --partitions 1

The kafka-topics.sh command must be followed by several parameters, such as the ZooKeeper configuration and the topic name.

Let’s check what topics are available in Kafka

1. bin/kafka-topics.sh --list --zookeeper localhost:2181
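
If you want more detail than the plain list, the same script can also describe a topic's partitions and replicas:

1. bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic sayaword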

11. Call kafka-console-producer.sh in the /apps/kafka/bin directory to produce some messages; this script plays the role of the producer.

1. bin/kafka-console-producer.sh --broker-list localhost:9092 --topic sayaword

The localhost here is the IP of the Kafka broker node, and 9092 is the broker's port. Whatever the user types on the console interface is handed to the producer, which sends it on for the consumer to read.

12. Open another window and call kafka-console-consumer.sh in the bin directory to start a consumer, which is used to consume the data.

1. cd /apps/kafka

2. bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic sayaword --from-beginning

kafka-console-consumer.sh also takes several parameters, such as ZooKeeper's IP and port, the topic name, and the position from which to start reading data.

13. In the window running the kafka-console-producer.sh command, enter a few lines of text and press Enter. You will see the same content output on the consumer side.

Producer side:

Consumer side:
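
For example, typing the following two lines in the producer window (illustrative input, not captured from the original run):

1. hello kafka
2. hello world

should cause the consumer window to print the same two lines:

1. hello kafka
2. hello world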

14. Exit the test.

In the command-line windows of kafka-console-consumer.sh, kafka-console-producer.sh and kafka-server-start.sh, press Ctrl+C to exit the consumer, the producer and the server, respectively.

Change the directory to the /apps/zookeeper/bin directory and stop ZooKeeper.

1. cd /apps/zookeeper/bin

2. ./zkServer.sh stop

At this point, the installation and testing of Kafka are complete!

(2) Kafka Java API

1. Create the /kafka3 folder in the /data directory.

1. mkdir /data/kafka3

2. Switch to the /data/kafka3 directory in Linux and use the wget command to download the archive kafkalib.tar.gz from http://172.16.103.12:60000/allfiles/kafka3/kafkalib.tar.gz.

1. cd /data/kafka3

2. wget http://172.16.103.12:60000/allfiles/kafka3/kafkalib.tar.gz

3. After the download is complete, extract the archive into the current directory.

1. tar zxvf kafkalib.tar.gz

4. Open Eclipse and create a new Java project named kafka3.

5. Right-click the project name and create a new package named my.kafka.

6. Add the jar packages the project depends on. Right-click the project and create a new folder named kafkalib to store the jars the project requires. Copy all the jar packages from the kafkalib folder under /data/kafka3 into the kafkalib folder of the kafka3 project in Eclipse. Select all the jars in the kafkalib folder and add them to the Build Path.

7. Start ZooKeeper. Switch to the /apps/zookeeper/bin directory and run the ZooKeeper startup script.

1. cd /apps/zookeeper/bin

2. ./zkServer.sh start

Check the running status of ZooKeeper:

1. ./zkServer.sh status

8. Switch to the /apps/kafka directory and start the Kafka server.

1. cd /apps/kafka

2. bin/kafka-server-start.sh config/server.properties &

9. Open another window, switch to /apps/kafka, and create a topic in Kafka named testkafkaapi.

1. cd /apps/kafka

2. bin/kafka-topics.sh \

3. --create \

4. --zookeeper localhost:2181 \

5. --replication-factor 1 \

6. --topic testkafkaapi \

7. --partitions 1

View the topics:

1. bin/kafka-topics.sh --list --zookeeper localhost:2181

10. Create a Kafka producer for producing data. Under the my.kafka package of the kafka3 project, create a class named MyProducer and write the code for the MyProducer class.

package my.kafka;

import java.util.Properties;

import kafka.javaapi.producer.Producer;
import kafka.producer.KeyedMessage;
import kafka.producer.ProducerConfig;

public class MyProducer {
    private final Producer<String, String> producer;
    public final static String TOPIC = "testkafkaapi";

    public MyProducer() {
        Properties props = new Properties();
        // The address and port of the Kafka broker
        props.put("metadata.broker.list", "localhost:9092");
        // Serializer class for the message value
        props.put("serializer.class", "kafka.serializer.StringEncoder");
        // Serializer class for the message key
        props.put("key.serializer.class", "kafka.serializer.StringEncoder");
        // request.required.acks:
        // 0: the producer never waits for an acknowledgement from the broker
        //    (the same behavior as 0.7). This option provides the lowest latency
        //    but the weakest durability guarantees (some data will be lost when
        //    a server fails).
        // 1: the producer gets an acknowledgement after the leader replica has
        //    received the data. This option provides better durability, as the
        //    client waits until the server acknowledges the request as successful
        //    (only messages that were written to the now-dead leader but not yet
        //    replicated will be lost).
        // -1: the producer gets an acknowledgement after all in-sync replicas
        //    have received the data. This option provides the best durability;
        //    we guarantee that no messages will be lost as long as at least one
        //    in-sync replica remains.
        props.put("request.required.acks", "-1");
        producer = new Producer<String, String>(new ProducerConfig(props));
    }

    void produce() {
        int messageNo = 1000;
        final int COUNT = 10000;
        while (messageNo < COUNT) {
            String key = String.valueOf(messageNo);
            String data = "hello,kafka message:" + key;
            producer.send(new KeyedMessage<String, String>(TOPIC, key, data));
            System.out.println(data);
            messageNo++;
        }
    }

    public static void main(String[] args) {
        new MyProducer().produce();
    }
}

Code on the producer side: first define a topic name, then create a Properties instance used to set the producer's parameters. Next, create a Producer instance, passing the configured props in as a constructor parameter. In the produce() method, define a key and data value, create a KeyedMessage instance with the key, data and topic as parameters, and hand the KeyedMessage to the producer to send. The main function simply calls MyProducer's produce() method to send the messages.
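
If everything is configured correctly, running MyProducer prints one line per message sent; the sample below is reconstructed from the loop in the code (messageNo runs from 1000 to 9999) rather than captured from a live run:

1. hello,kafka message:1000
2. hello,kafka message:1001
3. ...

Note that this sketch never releases its connections; longer-lived code would normally call producer.close() once sending is finished.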

11. Create a Kafka consumer for consuming data. In the kafka3 project, under the my.kafka package, create a class named MyConsumer and write the code for the MyConsumer class.

package my.kafka;

import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;

import kafka.consumer.ConsumerConfig;
import kafka.consumer.ConsumerIterator;
import kafka.consumer.KafkaStream;
import kafka.javaapi.consumer.ConsumerConnector;
import kafka.serializer.StringDecoder;
import kafka.utils.VerifiableProperties;

public class MyConsumer {
    private final ConsumerConnector consumer;

    private MyConsumer() {
        Properties props = new Properties();
        // ZooKeeper configuration
        props.put("zookeeper.connect", "localhost:2181");
        // group.id identifies a consumer group
        props.put("group.id", "mygroup");
        // ZooKeeper connection timeouts
        props.put("zookeeper.session.timeout.ms", "4000");
        props.put("zookeeper.sync.time.ms", "200");
        props.put("auto.commit.interval.ms", "1000");
        props.put("auto.offset.reset", "smallest");
        // Serialization class
        props.put("serializer.class", "kafka.serializer.StringEncoder");
        ConsumerConfig config = new ConsumerConfig(props);
        consumer = kafka.consumer.Consumer.createJavaConsumerConnector(config);
    }

    void consume() {
        Map<String, Integer> topicCountMap = new HashMap<String, Integer>();
        topicCountMap.put(MyProducer.TOPIC, new Integer(1));

        StringDecoder keyDecoder = new StringDecoder(new VerifiableProperties());
        StringDecoder valueDecoder = new StringDecoder(new VerifiableProperties());

        Map<String, List<KafkaStream<String, String>>> consumerMap =
                consumer.createMessageStreams(topicCountMap, keyDecoder, valueDecoder);
        KafkaStream<String, String> stream = consumerMap.get(MyProducer.TOPIC).get(0);
        ConsumerIterator<String, String> it = stream.iterator();
        while (it.hasNext())
            System.out.println(it.next().message());
    }

    public static void main(String[] args) {
        new MyConsumer().consume();
    }
}

The MyConsumer side is divided into two parts: the MyConsumer() constructor and the consume() method. In the MyConsumer() constructor, create a Properties instance to configure the consumer, then create the consumer connector that receives messages, passing the Properties instance in as a parameter. In the consume() method, call the consumer's createMessageStreams() method to receive messages from Kafka, then iterate over the stream and print each message to the console.
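
When MyProducer and MyConsumer run against the same broker, the consumer's console should echo the produced lines (hello,kafka message:1000 and so on). Note also that the while (it.hasNext()) loop blocks waiting for new messages and never returns on its own; a real application would arrange to call consumer.shutdown() when finished so that the blocking iterator can exit.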

12. Execute the program.

In Eclipse, right-click the MyProducer class and choose Run As ==> Java Application.

In the window that appears, click OK.

13. Then right-click the MyConsumer class and choose Run As ==> Java Application.

You can then see the output in the console interface.


4. Analysis of experimental results

We set the environment variables and configured Kafka, then used a simple Java API to simulate Kafka's producer and consumer: the producer generates content in a while loop and passes it to Kafka, and the consumer reads the content from Kafka and outputs it to the Console interface.
