Because the disk was too small to hold so much data, log retention was reduced from the default of seven days to one day.
After setting log.retention.hours=24, however, data from three days earlier was found to still exist.
Consulting the official documentation showed that the active segment is never compacted or deleted, so I ran a few experiments to see whether there is any way to delete data already written to a topic.
Experiment 1
Send one record every 10 s to the topic testdelet and watch whether data older than one hour gets cleared. The twist: the Kafka connection is closed after each record is sent.
Observed until 14:59. Conclusion: nothing was deleted in either case; closing the connection makes no difference.
Experiment 2
Modify the settings in the server.properties file:
log.retention.hours=1
log.retention.ms=3600000
log.cleanup.policy=delete
log.segment.delete.delay.ms=60000
log.cleanup.interval.mins=10
log.retention.check.interval.ms=600000
Watch for any change in behavior.
Conclusion: no difference from before; the experiment had no effect.
Experiment 3
Check whether reading the data with a Kafka consumer tool has any effect on deletion.
Conclusion: no effect; the experiment had no result.
Experiment 4
At around 15:10, after the data should already have been deleted, send 200 records as a control and check how long it takes to read them back.
Experiment 5
Stop the producer group that sends to the topic to see whether anything changes, and modify the topic configuration:
./kafka-configs.sh --entity-name simulator_250 --zookeeper localhost:2181 --entity-type topics --alter --add-config segment.bytes=10000000,retention.ms=3600000,retention.bytes=20000000
1. Turn off the producer and see whether the historical data is deleted after 10 minutes.
Conclusion: the data is not deleted.
2. Remove the consumers from the topic and see whether deletion then happens.
Conclusion: still not deleted.
3. Wait out the retention time.
Conclusion: one hour after the last record in 00000000.log, that segment's data was deleted.
The other log segments were also deleted once they were more than one hour old.
Conclusion: whether Kafka deletes data has nothing to do with whether the data has been read by consumers; it depends only on how much time has passed since the last record in the segment.
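The rule observed in this experiment can be sketched as pure logic (a hypothetical model for illustration, not Kafka source code): a segment is eligible for deletion only when it is not the active segment and its newest record is older than the retention window; consumer reads play no role. The segment names and timestamps below are made up.

```python
RETENTION_MS = 3_600_000  # retention.ms = 1 hour, as set in the experiment

def deletable_segments(segments, now_ms, retention_ms=RETENTION_MS):
    """segments: list of (name, last_record_ts_ms), oldest first.
    The final entry is the active segment and is never deleted."""
    closed = segments[:-1]  # everything except the active segment
    return [name for name, last_ts in closed
            if now_ms - last_ts > retention_ms]

now = 15 * 3_600_000  # e.g. 15:00, expressed in ms since midnight
segs = [
    ("00000000.log", 13 * 3_600_000),              # last record 13:00 -> older than 1 h
    ("00001000.log", 14 * 3_600_000 + 1_800_000),  # last record 14:30 -> within 1 h
    ("00002000.log", now),                         # active segment, never deleted
]
print(deletable_segments(segs, now))  # → ['00000000.log']
```

Note that the active segment is excluded outright, which matches why simply lowering log.retention.hours did not delete anything: data sitting in a segment that never rolls never becomes eligible.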
Experiment 6
Set segment.bytes=10000000 and retention.ms=3600000.
Watch whether the old segment files can be deleted after one hour.
Result:
The segments could be deleted: segments starting at 11:20 were actually deleted at 14:10, and the last one at 14:50.
Conclusion: once the topic is sliced into segments, its old segments can be deleted; the other topics' data was still not deleted.
Overall conclusion:
For Kafka to delete data, you need to set log.roll.hours (roll interval) or segment.bytes (segment size) to control how long a log segment stays open or how large it may grow. Once a segment has rolled and its last record is older than the retention time, that segment is deleted; a topic that receives no new data likewise has its data deleted after the roll time has forced the segment to close.
With the defaults, a segment rolls automatically only at 1 GB or after 7 days, and only then can the old data in it be deleted.
The deletion check runs every 10 minutes, so actual deletion times can be off by a little.
Adding log.roll.hours=12 to every broker's configuration solves both problems.
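Put together, the broker-side settings from the fix would look roughly like this (a sketch based on the values used above; adjust retention to your disk budget):

```
# server.properties (sketch)
log.roll.hours=12          # force a segment roll at most every 12 hours
log.retention.hours=24     # rolled segments become deletable after 24 hours
log.cleanup.policy=delete  # delete old segments rather than compact them
```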