Kafka stress test and machine quantity calculation based on project experience

Kafka stress test based on project experience

1) Kafka stress test
Use Kafka's official scripts to stress test Kafka. During the stress test you can see where the bottleneck appears (CPU, memory, or network I/O). In most cases, network I/O becomes the bottleneck first.
kafka-consumer-perf-test.sh
kafka-producer-perf-test.sh
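To see which of these resources hits its limit first, it helps to watch the broker machines while a test runs. A minimal sketch, assuming the sysstat package is installed on the broker hosts (the monitoring commands are not part of the original article):
sar -n DEV 1    # per-second network throughput for each interface
sar -u 1        # per-second CPU utilization
free -h         # current memory usage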
2) Kafka Producer stress test
(1) These two scripts are located in the /opt/module/kafka/bin directory. Run the producer test:
[atguigu@hadoop102 kafka]$ bin/kafka-producer-perf-test.sh --topic test --record-size 100 --num-records 100000 --throughput 1000 --producer-props bootstrap.servers=hadoop102:9092,hadoop103:9092,hadoop104:9092
Description: record-size is the size of each message in bytes; num-records is the total number of messages to send; throughput is the target number of messages sent per second.
(2) Kafka will print the following information
5000 records sent, 999.4 records/sec (0.10 MB/sec), 1.9 ms avg latency, 254.0 max latency.
5002 records sent, 1000.4 records/sec (0.10 MB/sec), 0.7 ms avg latency, 12.0 max latency.
5001 records sent, 1000.0 records/sec (0.10 MB/sec), 0.8 ms avg latency, 4.0 max latency.
5000 records sent, 1000.0 records/sec (0.10 MB/sec), 0.7 ms avg latency, 3.0 max latency.
5000 records sent, 1000.0 records/sec (0.10 MB/sec), 0.8 ms avg latency, 5.0 max latency.
Parameter analysis: In this example, a total of 100,000 messages are written, 0.10 MB of data is written to Kafka per second (about 1,000 messages per second), the average write latency is around 0.8 ms, and the maximum latency is 254 ms.
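The command above throttles the producer to about 1,000 messages per second. To measure the peak write speed (which the machine-count estimate below relies on), the throttle can be disabled by setting --throughput to -1; this variant is a sketch based on the same command rather than part of the original test:
[atguigu@hadoop102 kafka]$ bin/kafka-producer-perf-test.sh --topic test --record-size 100 --num-records 100000 --throughput -1 --producer-props bootstrap.servers=hadoop102:9092,hadoop103:9092,hadoop104:9092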
3) Kafka Consumer stress test
Consumer test: if the four resources (I/O, CPU, memory, network) cannot be upgraded, consider increasing the number of partitions to improve performance.
[atguigu@hadoop102 kafka]$
bin/kafka-consumer-perf-test.sh --zookeeper hadoop102:2181 --topic test --fetch-size 10000 --messages 10000000 --threads 1
Parameter description:
--zookeeper specifies the ZooKeeper connection information
--topic specifies the topic name
--fetch-size specifies the amount of data fetched per request
--messages specifies the total number of messages to consume
Test result description:
start.time, end.time, data.consumed.in.MB, MB.sec, data.consumed.in.nMsg, nMsg.sec
2019-02-19 20:29:07:566, 2019-02-19 20:29:12:170, 9.5368, 2.0714, 100010, 21722.4153
start.time and end.time are the test start and end times; data.consumed.in.MB is the total amount of data consumed, 9.5368 MB; MB.sec is the average consumption throughput, 2.0714 MB/s; data.consumed.in.nMsg is the total number of messages consumed, 100,010; nMsg.sec is the average number of messages consumed per second, 21,722.4153.
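Note that the --zookeeper option belongs to older Kafka releases. In newer versions (roughly Kafka 2.x), kafka-consumer-perf-test.sh connects to the brokers directly; a rough equivalent of the command above, assuming the same brokers and topic, would be:
[atguigu@hadoop102 kafka]$ bin/kafka-consumer-perf-test.sh --broker-list hadoop102:9092,hadoop103:9092,hadoop104:9092 --topic test --fetch-size 10000 --messages 10000000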

Calculation of the number of Kafka machines based on project experience

Number of Kafka machines (empirical formula) = 2 * (peak production speed * number of replicas / 100) + 1, where the peak production speed is measured in MB/s.
First, estimate how much data will be generated per day, then use Kafka's own producer stress test (testing only Kafka's write speed, and making sure data does not pile up) to measure the peak production speed. Combined with the configured number of replicas, the number of Kafka machines to deploy can be estimated.
For example, the stress test shows a write speed of 10 MB/s, the peak business data rate is 50 MB/s, and the number of replicas is 2.
Number of Kafka machines = 2 * (50 * 2 / 100) + 1 = 3
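As a quick check, the same empirical formula can be scripted. This is a small sketch, not from the original article, that rounds the bracketed term up to a whole number (the article's example happens to divide evenly), with the peak speed given in MB/s:
peak=50       # peak production speed in MB/s, taken from the producer stress test
replicas=2    # configured number of replicas
awk -v p="$peak" -v r="$replicas" 'BEGIN {
  x = p * r / 100
  c = (x == int(x)) ? x : int(x) + 1   # round up; rounding is an assumption, not stated in the article
  m = 2 * c + 1
  print "Kafka machines needed: " m    # prints 3 for this example
}'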

Source: blog.csdn.net/qq_42706464/article/details/108749456