Kafka periodic data deletion experiments

Because the hard disk was too small for the amount of data, I had to shorten the retention time from the default seven days to one day.

I set the parameter log.retention.hours=24, but three days later the data still existed.

Consulting the official documentation, I found that the active segment is never cleaned up, so I ran a few experiments to see whether there is a way to delete data already published to a topic.

 

Experiment 1

Create a topic testdelet and send one record to it every 10 s, then watch whether the data older than one hour gets cleared. One difference in this setup: the Kafka connection is closed after each record is sent.
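This sending loop can be sketched with the console producer that ships with Kafka (an illustrative sketch, not necessarily the tool actually used; the broker address localhost:9092 is an assumption):

```shell
# Send one record to topic "testdelet" every 10 seconds.
# Invoking the producer once per record means each send opens and then
# closes its own Kafka connection, matching this experiment's setup.
while true; do
  echo "probe $(date +%H:%M:%S)" | ./kafka-console-producer.sh \
    --broker-list localhost:9092 --topic testdelet
  sleep 10
done
```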


Observed until 14:59. Conclusion: nothing was deleted at all, no matter how long I waited.

 

 

Experiment 2

Modify the settings in the server.properties file:

log.retention.hours=1

log.retention.ms=3600000

log.cleanup.policy=delete

log.segment.delete.delay.ms=60000

log.cleanup.interval.mins=10

log.retention.check.interval.ms=600000

Then watch whether anything changes.

Conclusion: no difference from before; this experiment also failed.

 

Experiment 3

Check whether reading the data with a Kafka viewer tool has any effect on whether it gets deleted.

Conclusion: reading had no effect; this experiment was also a failure.

 

Experiment 4

The data should be deleted at about 15:10; send 200 records as a control and check how long they remain readable.


 

 

Experiment 5

Stop the producer that sends to the topic and watch for changes; also modify the topic's configuration:


./kafka-configs.sh --zookeeper localhost:2181 --entity-type topics --entity-name simulator_250 --alter --add-config segment.bytes=10000000,retention.ms=3600000,retention.bytes=20000000
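To confirm the overrides actually took effect, the same tool can read the topic configuration back (a sketch; it assumes the same ZooKeeper address and topic name, and requires a running cluster):

```shell
# Read back the per-topic overrides set above.
./kafka-configs.sh --zookeeper localhost:2181 \
  --entity-type topics --entity-name simulator_250 \
  --describe
```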

1. Turn off the producer and see whether the historical data is deleted after 10 minutes.

Conclusion: the data was not deleted.

2. Remove the topic's consumers and see whether that makes deletion happen.

Conclusion: still not deleted.

3. Wait out the deletion time.


 

Conclusion: one hour after the last record was written to 00000000.log, the data was deleted.


The other log segments were likewise deleted once they were more than one hour old.

 

Conclusion: whether Kafka deletes data has nothing to do with whether the data has been read or consumed; it depends on the time elapsed since the last record in the last segment.
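This timing rule can be sketched with a bit of shell arithmetic (the numbers are illustrative, not measured; the 10-minute check interval mirrors the log.retention.check.interval.ms=600000 set earlier):

```shell
# Sketch: estimate when a rolled segment is actually deleted.
# Illustrative numbers: last record at 14:13, retention.ms=3600000 (60 min),
# and a retention check every 10 min (log.retention.check.interval.ms=600000).
last_write=$((14 * 60 + 13))   # 14:13 expressed as minutes since midnight
retention=60                   # retention period in minutes
check=10                       # retention-check interval in minutes
eligible=$((last_write + retention))                   # earliest eligible moment
deleted=$(( (eligible + check - 1) / check * check ))  # next check at/after that
printf 'eligible %02d:%02d, deleted around %02d:%02d\n' \
  $((eligible / 60)) $((eligible % 60)) $((deleted / 60)) $((deleted % 60))
# → eligible 15:13, deleted around 15:20
```

Because deletion only happens when the periodic check runs, the observed deletion time can trail the nominal retention deadline by up to one check interval.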

 

Experiment 6

Modify segment.bytes=10000000, retention.ms=3600000


Watch for one hour to see whether the old segment files can be deleted.

Result:


Segments can indeed be deleted: the deleted segment's start time was 11:20 and its last record was at 14:10; the actual deletion happened at 14:50.

 

 

Conclusion: this topic's segments could be deleted; other topics without these overrides were not deleted.

 

Conclusion from all the experiments:

For Kafka to delete data, you need to set log.roll.hours (the roll time) or segment.bytes (the segment size), which control how large a log file may grow, or how long before it is rolled into a new segment. Once the time since the last record in a log file exceeds the roll time, the previous log segment can be deleted; equally, if a topic stops receiving data, its data is deleted only after the roll time has passed.

Alternatively, with the defaults, a segment rolls automatically once it reaches 1 GB or 7 days, and the old data in the rolled-out segment is then deleted.

 

The deletion check runs every 10 minutes by default, so the actual deletion times may differ slightly from the nominal ones.

Adding log.roll.hours=12 to every broker's configuration solves both problems.
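Putting the conclusion together as a server.properties fragment (a sketch: log.roll.hours=12 is the fix adopted above; the other values are this post's earlier settings and Kafka defaults):

```properties
# Roll a new segment at least every 12 hours, even if the topic goes quiet,
# so the previous segment can age out and be deleted.
log.roll.hours=12
# Keep data for one day (the goal of this whole exercise).
log.retention.hours=24
# Expired segments are deleted, not compacted.
log.cleanup.policy=delete
# Default size-based roll: a segment also rolls once it reaches 1 GB.
log.segment.bytes=1073741824
```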

 


Origin www.cnblogs.com/skycandy/p/11402214.html