Kafka basic commands
# Start the server
./bin/kafka-server-start.sh config/server.properties

# Create topic "test"
./bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic test

# Delete a topic
./bin/kafka-topics.sh --zookeeper localhost:2181 --delete --topic test
# Note: if delete.topic.enable=true is not set in the server.properties file loaded when Kafka
# starts, the topic is not really deleted; it is only marked as "marked for deletion".
# To actually delete it in that case, log in with the zookeeper client and delete the corresponding node.

# List topics
./bin/kafka-topics.sh --list --zookeeper localhost:2181

# Show details of topic "test"
./bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic test

# Consumer reads messages
./bin/kafka-console-consumer.sh --zookeeper master:2181 --topic badou --from-beginning

# Producer sends messages
./bin/kafka-console-producer.sh --broker-list master:9092 --topic badou
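The console tools above also have programmatic equivalents. Below is a minimal produce/consume round trip in Python; it is a sketch that assumes the kafka-python package is installed and that a broker is reachable at master:9092 (swap in your own broker address).

from kafka import KafkaProducer, KafkaConsumer

# Send one message to the "test" topic (broker address is an assumption)
producer = KafkaProducer(bootstrap_servers='master:9092')
producer.send('test', b'hello kafka')
producer.flush()

# Read it back from the beginning of the topic; give up after 5s of silence
consumer = KafkaConsumer('test',
                         bootstrap_servers='master:9092',
                         auto_offset_reset='earliest',
                         consumer_timeout_ms=5000)
for msg in consumer:
    print(msg.value)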
Build a logging system with Kafka and Flume
1. Start ZooKeeper on the master and slave nodes
./bin/zkServer.sh start
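Before starting Kafka it is worth confirming that ZooKeeper is actually answering. A quick check from Python, sketched under the assumption that the kazoo client library is installed:

from kazoo.client import KazooClient

# Connect to the local ZooKeeper; start() raises if no server answers
zk = KazooClient(hosts='localhost:2181')
zk.start()
# List the root znodes; once Kafka has registered you should also see
# entries such as 'brokers' alongside the default 'zookeeper' node
print(zk.get_children('/'))
zk.stop()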
2. Start Kafka
# Start the server
./bin/kafka-server-start.sh config/server.properties

# Create topic "badou"
./bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic badou

# Consumer reads messages
./bin/kafka-console-consumer.sh --zookeeper master:2181 --topic badou --from-beginning
3. Start Flume
./bin/flume-ng agent -c conf -f conf/flume_kafka.conf -n a1 -Dflume.root.logger=INFO,console
Flume configuration file flume_kafka.conf (-c gives Flume's configuration directory, -f the agent configuration file, -n the agent name):
# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Describe/configure the source
a1.sources.r1.type = exec
a1.sources.r1.command = tail -f /home/badou/flume_test/flume_exec_test.txt

# a1.sinks.k1.type = logger
# Set the Kafka sink
a1.sinks.k1.type = org.apache.flume.sink.kafka.KafkaSink
# Set the Kafka broker address and port
a1.sinks.k1.brokerList = master:9092
# Set the Kafka topic
a1.sinks.k1.topic = badou
# Set the serialization method
a1.sinks.k1.serializer.class = kafka.serializer.StringEncoder

# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 100000
a1.channels.c1.transactionCapacity = 1000

# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
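With the agent running, any line appended to the tailed file should show up in the console consumer on topic badou within a few seconds. A quick way to test this from Python (the file path comes from the config above; the JSON payload is just an illustrative test record):

# Append one test event to the file the exec source is tailing
with open('/home/badou/flume_test/flume_exec_test.txt', 'a') as f:
    f.write('{"order_id": 0, "user_id": 0}\n')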
4. Execute the Python script
Simulate the backend writing logs to the log file:
python flume_data_write.py
Python code:
# -*- coding: utf-8 -*-
import random
import time
import pandas as pd
import json

writeFileName = "./flume_exec_test.txt"
cols = ["order_id", "user_id", "eval_set", "order_number", "order_dow", "hour", "day"]

# Load the orders data, rename the columns, and replace missing values with 0
df1 = pd.read_csv('/mnt/hgfs/share_folder/00-data/orders.csv')
df1.columns = cols
df = df1.fillna(0)

# Append each row to the log file as one JSON object per line
with open(writeFileName, 'a+') as wf:
    for idx, row in df.iterrows():
        d = {}
        for col in cols:
            v = row[col]
            # Convert numpy scalars to native Python types so json can serialize them
            d[col] = v.item() if hasattr(v, 'item') else v
        js = json.dumps(d)
        wf.write(js + '\n')
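To confirm the records made it through Flume into Kafka, they can be read back and parsed from the badou topic. A sketch, again assuming kafka-python is installed and a broker is listening on master:9092:

from kafka import KafkaConsumer
import json

# Read from the beginning of the topic; stop after 10s without new messages
consumer = KafkaConsumer('badou',
                         bootstrap_servers='master:9092',
                         auto_offset_reset='earliest',
                         consumer_timeout_ms=10000)
for msg in consumer:
    record = json.loads(msg.value)
    print(record['order_id'], record['user_id'])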