How to determine the number of Kafka nodes, the number of topics, and the number of topic partitions

① Number of Kafka nodes
Number of Kafka machines (empirical formula) = 2 * (peak production speed * number of replicas / 100) + 1

Peak production speed: for example, the peak rate (in MB/s) at which Flume reads log files and writes data to Kafka. Ask the upstream business team of the company for this figure.
Number of replicas: the number of replicas of the topic, generally 2 (or 3).

First obtain the peak production speed; then, based on the configured number of replicas, estimate the number of Kafka machines that need to be deployed.
For example, our peak production speed is 50 MB/s and the number of replicas is 2:
Number of Kafka machines = 2 * (50 * 2 / 100) + 1 = 3 machines
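
The empirical formula for the number of machines can be sketched in Python as below. This is only an illustration of the arithmetic: the function name `kafka_machine_count` is hypothetical, the speed is assumed to be in MB/s, and rounding fractional results up to a whole machine is an assumption the original formula does not state.

```python
import math


def kafka_machine_count(peak_mb_per_s: float, replicas: int) -> int:
    """Estimate the Kafka machine count with the empirical formula
    2 * (peak production speed * number of replicas / 100) + 1.

    Rounding up is an assumption: the formula itself does not say
    how to handle fractional results.
    """
    if peak_mb_per_s <= 0 or replicas <= 0:
        raise ValueError("peak speed and replica count must be positive")
    estimate = 2 * (peak_mb_per_s * replicas / 100) + 1
    return math.ceil(estimate)


# Example from the text: peak speed 50 MB/s, 2 replicas.
print(kafka_machine_count(50, 2))  # 3
```

Treat the result as a starting point for capacity planning, since it is an empirical estimate and not a guarantee.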

② Number of topics
Each topic holds one type of data, so create as many topics as there are types of data.

③ Number of topic partitions
1) Create a topic with only 1 partition.
2) Test the producer throughput and consumer throughput of this topic (use Kafka's stress-test tools, e.g. kafka-producer-perf-test.sh and kafka-consumer-perf-test.sh).
3) Call the measured values Tp and Tc, with MB/s as the unit.
4) If the total target throughput is Tt, then number of partitions = Tt / min(Tp, Tc).
For example: producer throughput = 20 MB/s, consumer throughput = 50 MB/s, expected throughput = 100 MB/s;
number of partitions = 100 / 20 = 5 partitions
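
The partition calculation above can be sketched in Python as below. The function name `partition_count` is hypothetical, all throughputs are assumed to share the same unit (MB/s here), and rounding up to a whole partition is an assumption for cases where the division is not exact.

```python
import math


def partition_count(target: float, producer: float, consumer: float) -> int:
    """Number of partitions = Tt / min(Tp, Tc).

    target   -- Tt, total target throughput
    producer -- Tp, measured single-partition producer throughput
    consumer -- Tc, measured single-partition consumer throughput
    Rounding up is an assumption for non-integer results.
    """
    if min(target, producer, consumer) <= 0:
        raise ValueError("all throughputs must be positive")
    # The slower side (producer or consumer) limits each partition.
    bottleneck = min(producer, consumer)
    return math.ceil(target / bottleneck)


# Example from the text: Tp = 20 MB/s, Tc = 50 MB/s, Tt = 100 MB/s.
print(partition_count(100, 20, 50))  # 5
```

Taking the minimum of Tp and Tc reflects that the slower side determines how much a single partition can sustain.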
The number of partitions is generally set to 3-10.
https://blog.csdn.net/weixin_42641909/article/details/89294698

Origin blog.csdn.net/xie670705986/article/details/112668310