3.1 Download and Unzip
3.2 Configuration Files
3.3 Starting the Service
V. Kafka Java Programming
5.1 Producer Programming
5.2 Consumer Programming
Body
I. Introduction
Kafka is an open-source stream-processing platform developed by the Apache Software Foundation and written in Scala and Java. It is a high-throughput, distributed publish-subscribe messaging system that can handle all the activity-stream data of a consumer-scale website. Such activity (page views, searches, and other user actions) is a key ingredient of many social features on the modern web. Because of the throughput required, this data is usually handled by log aggregation. For systems like Hadoop, which handle log data and offline analysis but cannot meet real-time processing constraints, Kafka is a viable solution. Kafka's purpose is to unify online and offline message processing via Hadoop's parallel loading mechanism, and to deliver real-time messages across a cluster.
II. Kafka Roles
Broker: each Kafka server instance in the cluster is a broker (every broker must have a globally unique id)
Producer: the message producer; responsible for writing (pushing) data to brokers
Consumer: the message consumer; responsible for reading (pulling) data from Kafka. Older consumer versions depend on ZooKeeper; newer versions do not
Topic: a topic is essentially a category of data; different topics store different data
Consumer group: a topic can be consumed by multiple consumers at the same time; if several consumers belong to the same consumer group, they will not consume the same data repeatedly (each message is delivered to only one consumer in the group)
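The consumer-group rule above can be illustrated with a small stand-alone simulation (no Kafka API involved): the partitions of a topic are split among the consumers of one group, so no two consumers in the same group read the same partition, and hence no message is consumed twice within the group. The class and consumer names below are hypothetical, and the logic is only a sketch of Kafka's range-style assignment, not its actual implementation.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class GroupAssignmentSketch {
    // Sketch of range-style assignment: partitions are split into
    // contiguous chunks, one chunk per consumer in the group.
    static Map<String, List<Integer>> assign(int numPartitions, List<String> consumers) {
        Map<String, List<Integer>> result = new LinkedHashMap<>();
        int perConsumer = numPartitions / consumers.size();
        int extra = numPartitions % consumers.size(); // first consumers get one more
        int next = 0;
        for (int i = 0; i < consumers.size(); i++) {
            int count = perConsumer + (i < extra ? 1 : 0);
            List<Integer> parts = new ArrayList<>();
            for (int j = 0; j < count; j++) parts.add(next++);
            result.put(consumers.get(i), parts);
        }
        return result;
    }

    public static void main(String[] args) {
        // 3 partitions, 2 consumers in one group: no partition is shared.
        System.out.println(assign(3, List.of("consumer-A", "consumer-B")));
        // prints {consumer-A=[0, 1], consumer-B=[2]}
    }
}
```

A consumer in a second group would get its own full copy of the partitions, which is why different groups can each read the whole topic independently.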
III. Kafka Installation
3.1 Download and Unzip
Since Spark 2.3.3 is used here, Kafka version 0.10.2.0 is needed: download it.
Extract it to an appropriate folder, as shown below:
3.2 Configuration Files
Three settings must be configured in config/server.properties:
broker.id=1  ===> globally unique; the three nodes here use 1, 2, 3
listeners=PLAINTEXT://hd1:9092  ===> the other two nodes use hd2 and hd3
# create this directory yourself; Kafka stores its data here
log.dirs=/usr/local/hadoop/kafka/data
zookeeper.connect=hd1:2181,hd2:2181,hd3:2181  ===> ZooKeeper addresses
As shown below:
3.3 Starting the Service
./bin/kafka-server-start.sh -daemon /usr/local/hadoop/kafka/kafka_2.10-0.10.2.0/config/server.properties
IV. Common Kafka Commands
# Start the server
./bin/kafka-server-start.sh -daemon /usr/local/hadoop/kafka/kafka_2.10-0.10.2.0/config/server.properties
# List topics
./bin/kafka-topics.sh --list --zookeeper hd1:2181,hd2:2181,hd3:2181
# Create a topic
./bin/kafka-topics.sh --create --zookeeper hd1:2181,hd2:2181,hd3:2181 --replication-factor 3 --partitions 3 --topic test
# Produce data
./bin/kafka-console-producer.sh --broker-list hd1:9092,hd2:9092,hd3:9092 --topic test
# Consume data
./bin/kafka-console-consumer.sh --zookeeper hd1:2181,hd2:2181,hd3:2181 --topic test --from-beginning
V. Kafka Java Programming
5.1 Producer Programming
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

import java.util.Properties;

public class ProduceDemo {
    public static void main(String[] args) {
        Properties props = new Properties(); // configuration items
        props.put("bootstrap.servers", "hd1:9092,hd2:9092,hd3:9092"); // new API: specify the Kafka cluster location
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());

        KafkaProducer<String, String> producer = new KafkaProducer<String, String>(props);
        String messageStr = null;
        for (int i = 1; i < 1000; i++) {
            messageStr = "hello, this is " + i + "th message";
            // topic "test", key "Message", value messageStr
            producer.send(new ProducerRecord<String, String>("test", "Message", messageStr));
        }
        producer.close();
    }
}
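Note that the producer above sends every record with the same key ("Message"), so with the default partitioner all 999 records land in the same partition of the test topic. The sketch below illustrates the principle of key-based partitioning without needing a broker; it is a simplified stand-in (Kafka's real default partitioner hashes the serialized key bytes with murmur2, not String.hashCode), and the class name is hypothetical.

```java
public class PartitionSketch {
    // Simplified stand-in for key-based partitioning: hash the key and take
    // it modulo the partition count. Kafka actually applies murmur2 to the
    // serialized key bytes, but the principle is the same.
    static int partitionFor(String key, int numPartitions) {
        return (key.hashCode() & 0x7fffffff) % numPartitions; // mask keeps the hash non-negative
    }

    public static void main(String[] args) {
        // The same key always maps to the same partition, which is how Kafka
        // preserves per-key ordering.
        System.out.println(partitionFor("Message", 3) == partitionFor("Message", 3)); // prints true
    }
}
```

To spread records across the topic's three partitions instead, you could vary the key per record (e.g. use the loop index) or pass a null key, in which case the 0.10-era producer distributes records round-robin.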
5.2 Consumer Programming
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

import java.util.Arrays;
import java.util.Properties;

public class ConsumerDemo implements Runnable {
    private final KafkaConsumer<String, String> consumer;
    private ConsumerRecords<String, String> msgList;
    private final String topic;
    private static final String GROUPID = "groupA";

    public ConsumerDemo(String topicName) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "hd1:9092,hd2:9092,hd3:9092");
        props.put("group.id", GROUPID);             // consumer group id
        props.put("enable.auto.commit", "true");    // commit offsets automatically
        props.put("auto.commit.interval.ms", "1000");
        props.put("session.timeout.ms", "30000");
        props.put("auto.offset.reset", "earliest"); // start from the beginning if no committed offset exists
        props.put("key.deserializer", StringDeserializer.class.getName());
        props.put("value.deserializer", StringDeserializer.class.getName());
        this.consumer = new KafkaConsumer<String, String>(props);
        this.topic = topicName;
        this.consumer.subscribe(Arrays.asList(topic));
    }

    public void run() {
        int messageNum = 1;
        try {
            for (;;) {
                msgList = consumer.poll(500); // wait up to 500 ms for records
                if (msgList != null && msgList.count() > 0) {
                    for (ConsumerRecord<String, String> record : msgList) {
                        // print every 50th message
                        if (messageNum % 50 == 0) {
                            System.out.println(messageNum + "=receive: key = " + record.key()
                                    + ", value = " + record.value() + " offset===" + record.offset());
                        }
                        if (messageNum % 1000 == 0) break;
                        messageNum++;
                    }
                } else {
                    Thread.sleep(1000);
                }
            }
        } catch (InterruptedException e) {
            e.printStackTrace();
        } finally {
            consumer.close();
        }
    }

    public static void main(String[] args) {
        ConsumerDemo demo = new ConsumerDemo("test");
        Thread thread = new Thread(demo);
        thread.start();
    }
}