Kafka source code analysis (3) log management - LogManager

1 Entry point

/* start log manager */
        // Start the log management module
        logManager = LogManager(config, zkUtils, brokerState, kafkaScheduler, time, brokerTopicStats)
        logManager.startup()

2 The startup code

/**
   *  Start the background threads to flush logs and do log cleanup
    *  (flushing and cleanup run on background scheduler threads)
   */
  def startup() {
    /* Schedule the cleanup task to delete old logs */
    if(scheduler != null) {
      info("Starting log cleanup with a period of %d ms.".format(retentionCheckMs))
      scheduler.schedule("kafka-log-retention",
                         cleanupLogs _,
                         delay = InitialTaskDelayMs,
                         period = retentionCheckMs,
                         TimeUnit.MILLISECONDS)
      info("Starting log flusher with a default period of %d ms.".format(flushCheckMs))
      scheduler.schedule("kafka-log-flusher",
                         flushDirtyLogs _,
                         delay = InitialTaskDelayMs,
                         period = flushCheckMs,
                         TimeUnit.MILLISECONDS)
      scheduler.schedule("kafka-recovery-point-checkpoint",
                         checkpointRecoveryPointOffsets _,
                         delay = InitialTaskDelayMs,
                         period = flushRecoveryOffsetCheckpointMs,
                         TimeUnit.MILLISECONDS)
      scheduler.schedule("kafka-log-start-offset-checkpoint",
                         checkpointLogStartOffsets _,
                         delay = InitialTaskDelayMs,
                         period = flushStartOffsetCheckpointMs,
                         TimeUnit.MILLISECONDS)
      scheduler.schedule("kafka-delete-logs",
                         deleteLogs _,
                         delay = InitialTaskDelayMs,
                         period = defaultConfig.fileDeleteDelayMs,
                         TimeUnit.MILLISECONDS)
    }
    if(cleanerConfig.enableCleaner)
      cleaner.startup()
  }
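
All five tasks above use the same scheduling pattern: run once after an initial 30-second delay (InitialTaskDelayMs) and then repeat at a fixed period. The sketch below shows that pattern with a plain JDK ScheduledThreadPoolExecutor; it is only an illustration, not Kafka's KafkaScheduler, and the task name and period used are stand-ins.

import java.util.concurrent.{Executors, TimeUnit}

object SchedulerSketch extends App {
  // one background thread is enough for this illustration
  private val executor = Executors.newScheduledThreadPool(1)

  // run `task` after `delayMs`, then every `periodMs` milliseconds
  def schedule(name: String, task: () => Unit, delayMs: Long, periodMs: Long): Unit = {
    executor.scheduleAtFixedRate(new Runnable { def run(): Unit = task() },
                                 delayMs, periodMs, TimeUnit.MILLISECONDS)
  }

  // stand-in for the "kafka-log-retention" task: 30 s initial delay, 5 min period
  schedule("kafka-log-retention", () => println("cleanupLogs()"), 30 * 1000L, 5 * 60 * 1000L)
}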

3 Core code


3.1 Related configuration information

  • The configuration item log.cleaner.threads, default 1, sets the number of threads used by the log cleaner (log compaction).

  • The configuration item log.cleaner.dedupe.buffer.size, default 128 MB, sets the memory buffer the cleaner uses to deduplicate records by key when the compact cleanup policy is used. Used for log compaction.

  • The configuration item log.cleaner.io.buffer.load.factor, default 0.9, sets the load factor of the dedupe buffer's hash table: a smaller factor means fewer hash collisions but higher memory usage. Used for log compaction.

  • The configuration item log.cleaner.io.buffer.size, default 512 KB, sets the size of the I/O buffers the cleaner uses when reading and writing segments. Used for log compaction.

  • The configuration item message.max.bytes, default 1000012 bytes, sets the maximum size of a single message.

  • The configuration item log.cleaner.io.max.bytes.per.second caps the cleaner's I/O rate; by default the rate is unlimited. Used for log compaction.

  • The configuration item log.cleaner.backoff.ms sets how often the cleaner checks whether any log needs cleaning (mainly relevant for compaction); the default is 15 seconds.

  • The configuration item log.cleaner.enable controls whether the log cleaner runs; it is enabled by default.

  • The configuration item num.recovery.threads.per.data.dir sets the number of threads per data directory used for log recovery at startup; the default is 1.

  • The configuration item log.flush.scheduler.interval.ms sets how often the flusher checks whether any log needs to be flushed to disk; by default the check is effectively disabled (the interval defaults to Long.MaxValue).

  • The configuration item log.flush.offset.checkpoint.interval.ms sets the interval at which each partition's recovery point (offset) is checkpointed; the default is 60000 ms.

  • The configuration item log.retention.check.interval.ms sets the interval at which the retention checks (deletion of old segments) run; the default is 5 minutes. A short snippet after this list shows how a few of these items look when set as broker properties.
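
These are ordinary broker configuration entries. The worksheet-style Scala snippet below only illustrates a few of them with the default values quoted above; the property names are real, everything else is illustrative.

import java.util.Properties

// Illustrative only: a few of the settings above, with their default values.
val props = new Properties()
props.put("log.cleaner.threads", "1")
props.put("log.cleaner.dedupe.buffer.size", (128L * 1024 * 1024).toString)   // 128 MB
props.put("log.cleaner.io.buffer.size", (512 * 1024).toString)               // 512 KB
props.put("log.flush.offset.checkpoint.interval.ms", "60000")
props.put("log.retention.check.interval.ms", (5 * 60 * 1000L).toString)      // 5 minutes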

3.2 Building the LogManager (topic configs are read from ZooKeeper)

// Topic configs are read from ZooKeeper first; that part is not covered in detail here
   val cleanerConfig = CleanerConfig(numThreads = config.logCleanerThreads,
     dedupeBufferSize = config.logCleanerDedupeBufferSize,
     dedupeBufferLoadFactor = config.logCleanerDedupeBufferLoadFactor,
     ioBufferSize = config.logCleanerIoBufferSize,
     maxMessageSize = config.messageMaxBytes,
     maxIoBytesPerSecond = config.logCleanerIoMaxBytesPerSecond,
     backOffMs = config.logCleanerBackoffMs,
     enableCleaner = config.logCleanerEnable)

   new LogManager(logDirs = config.logDirs.map(new File(_)).toArray,
     topicConfigs = topicConfigs,
     defaultConfig = defaultLogConfig,
     cleanerConfig = cleanerConfig,
     ioThreads = config.numRecoveryThreadsPerDataDir,
     flushCheckMs = config.logFlushSchedulerIntervalMs,
     flushRecoveryOffsetCheckpointMs = config.logFlushOffsetCheckpointIntervalMs,
     flushStartOffsetCheckpointMs = config.logFlushStartOffsetCheckpointIntervalMs,
     retentionCheckMs = config.logCleanupIntervalMs,
     maxPidExpirationMs = config.transactionIdExpirationMs,
     scheduler = kafkaScheduler,
     brokerState = brokerState,
     time = time,
     brokerTopicStats = brokerTopicStats)
 }

3.3 LogManager construction and initialization

@threadsafe
class LogManager(val logDirs: Array[File],
                 val topicConfigs: Map[String, LogConfig], // note that this doesn't get updated after creation
                 val defaultConfig: LogConfig,
                 val cleanerConfig: CleanerConfig,
                 ioThreads: Int,
                 val flushCheckMs: Long,
                 val flushRecoveryOffsetCheckpointMs: Long,
                 val flushStartOffsetCheckpointMs: Long,
                 val retentionCheckMs: Long,
                 val maxPidExpirationMs: Int,
                 scheduler: Scheduler,
                 val brokerState: BrokerState,
                 brokerTopicStats: BrokerTopicStats,
                 time: Time) extends Logging {
  val RecoveryPointCheckpointFile = "recovery-point-offset-checkpoint"
  val LogStartOffsetCheckpointFile = "log-start-offset-checkpoint"
  val LockFile = ".lock"
  val InitialTaskDelayMs = 30*1000

  private val logCreationOrDeletionLock = new Object
  private val logs = new Pool[TopicPartition, Log]()
  private val logsToBeDeleted = new LinkedBlockingQueue[Log]()
//  Check that each log directory exists (create it if it does not) and that it is readable and writable.
  createAndValidateLogDirs(logDirs)
//  Create a .lock file in each directory and lock the directory through this file.
  private val dirLocks = lockLogDirs(logDirs)
//  Build the set of checkpoints from each directory's recovery-point-offset-checkpoint file; these are used to periodically record each partition's offset.
  private val recoveryPointCheckpoints = logDirs.map(dir => (dir, new OffsetCheckpointFile(new File(dir, RecoveryPointCheckpointFile)))).toMap
  private val logStartOffsetCheckpoints = logDirs.map(dir => (dir, new OffsetCheckpointFile(new File(dir, LogStartOffsetCheckpointFile)))).toMap
//  For each directory, create a thread pool whose size is num.recovery.threads.per.data.dir.
//  Read every topic-partition directory inside it, build a Log instance from the topic's config in ZooKeeper (or the default config) and the offset recorded for that partition in the offset checkpoint, and run the Log loading (i.e. log recovery) on the thread pool.

  loadLogs()

  // public, so we can access this from kafka.admin.DeleteTopicTest
  val cleaner: LogCleaner =
    if(cleanerConfig.enableCleaner)
      new LogCleaner(cleanerConfig, logDirs, logs, time = time)
    else
      null
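
The .lock file mentioned in the comments above is an OS-level file lock that prevents two broker processes from using the same log directory. The sketch below shows just the underlying java.nio mechanism; Kafka wraps it in its own small helper, and lockDir is a hypothetical name.

import java.io.{File, RandomAccessFile}
import java.nio.channels.FileLock

// Acquire an exclusive lock on <dir>/.lock, failing fast if another process holds it.
def lockDir(dir: File): FileLock = {
  val lockFile = new File(dir, ".lock")
  val channel  = new RandomAccessFile(lockFile, "rw").getChannel
  val lock     = channel.tryLock()
  if (lock == null)
    throw new IllegalStateException(s"Log directory ${dir.getAbsolutePath} is locked by another process")
  lock
}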

3.4 Clean up expired logs

/**
  * Runs through the log removing segments older than a certain age
  */
 private def cleanupExpiredSegments(log: Log): Int = {
   if (log.config.retentionMs < 0)
     return 0
   val startMs = time.milliseconds
   log.deleteOldSegments(startMs - _.lastModified > log.config.retentionMs)
 }

This involves another configuration item, retention.ms, which sets how long a log is retained. If it is less than 0 the log never expires, so no age-based deletion happens.

Otherwise, if the difference between the current time and a segment file's last-modified time is greater than the configured retention time, the segment must be deleted. The actual deletion is done by:

  /**
   * Delete any log segments matching the given predicate function,
   * starting with the oldest segment and moving forward until a segment doesn't match.
   * @param predicate A function that takes in a single log segment and returns true iff it is deletable
   * @return The number of segments deleted
   */
  def deleteOldSegments(predicate: LogSegment => Boolean): Int = {
    lock synchronized {
      //find any segments that match the user-supplied predicate UNLESS it is the final segment
      //and it is empty (since we would just end up re-creating it)
      val lastEntry = segments.lastEntry
      val deletable =
        if (lastEntry == null) Seq.empty
        else logSegments.takeWhile(s => predicate(s) && (s.baseOffset != lastEntry.getValue.baseOffset || s.size > 0))
      val numToDelete = deletable.size
      if (numToDelete > 0) {
        // we must always have at least one segment, so if we are going to delete all the segments, create a new one first
        if (segments.size == numToDelete)
          roll()
        // remove the segments for lookups
        deletable.foreach(deleteSegment(_))
      }
      numToDelete
    }
  }

The logic here is: using the supplied predicate, collect the segments that qualify for deletion into deletable, then delete each of them via deleteSegment (shown after the short takeWhile illustration below).
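
Because takeWhile is used, the scan stops at the first segment that does not match, even if a later segment would. A contrived worksheet-style example (Seg and the timestamps are made up) makes that behaviour concrete:

// Deletion candidates are collected oldest-first and the scan stops at the first keeper.
case class Seg(baseOffset: Long, lastModified: Long)

val retentionMs = 7L * 24 * 60 * 60 * 1000          // e.g. retention.ms = 7 days
val now         = System.currentTimeMillis()
val segments = Seq(
  Seg(0,    now - 10L * 24 * 60 * 60 * 1000),       // 10 days old -> deletable
  Seg(1000, now -  8L * 24 * 60 * 60 * 1000),       //  8 days old -> deletable
  Seg(2000, now -  1L * 24 * 60 * 60 * 1000),       //  1 day old  -> stops the scan
  Seg(3000, now - 20L * 24 * 60 * 60 * 1000)        // would match, but is never reached
)
val deletable = segments.takeWhile(s => now - s.lastModified > retentionMs)
// deletable contains only the segments with baseOffset 0 and 1000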

private def deleteSegment(segment: LogSegment) {
  info("Scheduling log segment %d for log %s for deletion.".format(segment.baseOffset, name))
  lock synchronized {
    segments.remove(segment.baseOffset)
    asyncDeleteSegment(segment)
  }
}

private def asyncDeleteSegment(segment: LogSegment) {
  segment.changeFileSuffixes("", Log.DeletedFileSuffix)
  def deleteSeg() {
    info("Deleting segment %d from log %s.".format(segment.baseOffset, name))
    segment.delete()
  }
  scheduler.schedule("delete-file", deleteSeg, delay = config.fileDeleteDelayMs)
}

This is the asynchronous deletion path, and it involves one more configuration item: file.delete.delay.ms, the delay before a segment file that has been marked for deletion is physically removed. Deletion works by first renaming the segment's files with a .deleted suffix and then deleting them after the configured delay.
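
A stripped-down sketch of that two-step deletion (rename with a .deleted suffix, then delete after file.delete.delay.ms) is shown below; the executor and asyncDelete are illustrative, not Kafka's code.

import java.io.File
import java.util.concurrent.{Executors, TimeUnit}

val deleteExecutor = Executors.newSingleThreadScheduledExecutor()

def asyncDelete(segmentFile: File, fileDeleteDelayMs: Long): Unit = {
  // step 1: mark the file by renaming it, so it is no longer treated as a live segment
  val marked = new File(segmentFile.getPath + ".deleted")
  segmentFile.renameTo(marked)
  // step 2: physically delete it only after the configured delay
  deleteExecutor.schedule(new Runnable {
    def run(): Unit = marked.delete()
  }, fileDeleteDelayMs, TimeUnit.MILLISECONDS)
}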

3.5 Clean up oversized logs

/**
  *  Runs through the log removing segments until the size of the log
  *  is at least logRetentionSize bytes in size
  */
 private def cleanupSegmentsToMaintainSize(log: Log): Int = {
   if(log.config.retentionSize < 0 || log.size < log.config.retentionSize)
     return 0
   var diff = log.size - log.config.retentionSize
   def shouldDelete(segment: LogSegment) = {
     if(diff - segment.size >= 0) {
       diff -= segment.size
       true
     } else {
       false
     }
   }
   log.deleteOldSegments(shouldDelete)
 }

This code is fairly clear: while the log size exceeds retention.bytes, the oldest segments are marked for deletion until enough space would be freed. The method called afterwards is again deleteOldSegments, so it is not repeated here.
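
A small worked example of the shouldDelete closure (all numbers are made up): with retention.bytes = 1 GB and a 1.3 GB log, diff starts at 0.3 GB, and a segment is deleted only while removing it still fits within what remains to be freed.

// Three 200 MB segments, oldest first; 300 MB needs to be freed.
var diff = 300L * 1024 * 1024
val segmentSizes = Seq(200L, 200L, 200L).map(_ * 1024 * 1024)

val decisions = segmentSizes.map { size =>
  if (diff - size >= 0) { diff -= size; true } else false
}
// decisions == Seq(true, false, false): only the oldest segment is deleted, because
// removing the second one would free more than the remaining 100 MB.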

3.6 Periodically flushing logs to disk:

The background scheduler periodically invokes the flushDirtyLogs function in LogManager.

The function iterates over the logs of all partitions and flushes any log whose time since the last flush exceeds its configured flush interval. The flush takes the segments between the last recovery point and the current end offset and flushes each segment's log and index files: the log file through its file channel's force, the index files through their memory-mapped buffers' force.

private def flushDirtyLogs() = {
  debug("Checking for dirty logs to flush...")

  for ((topicAndPartition, log) <- logs) {
    try {
      val timeSinceLastFlush = time.milliseconds - log.lastFlushTime
      debug("Checking if flush is needed on " + topicAndPartition.topic +
            " flush interval " + log.config.flushMs +
            " last flushed " + log.lastFlushTime +
            " time since last flush: " + timeSinceLastFlush)
      if(timeSinceLastFlush >= log.config.flushMs)
        log.flush
    } catch {
      case e: Throwable =>
        error("Error flushing topic " + topicAndPartition.topic, e)
    }
  }
}
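
At the lowest level the flush amounts to forcing the segment's log file channel and its memory-mapped index buffers to disk. The sketch below only shows those two JDK calls; flushSegment and its parameters are illustrative, not Kafka's API.

import java.io.RandomAccessFile
import java.nio.MappedByteBuffer

def flushSegment(logFile: RandomAccessFile, indexMap: MappedByteBuffer): Unit = {
  logFile.getChannel.force(true)   // flush the log file's data (and metadata) to disk
  indexMap.force()                 // flush the memory-mapped index file to disk
}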

3.7 Periodically checkpointing partition offsets:

The background scheduler periodically calls checkpointRecoveryPointOffsets in LogManager:

def checkpointRecoveryPointOffsets() {
  this.logDirs.foreach(checkpointLogsInDir)
}

Here a checkpoint is taken of the recovery point (last flushed offset) of every partition stored in each log directory.

checkpointLogsInDir iterates over the partitions in a directory and writes their recovery points into the checkpoint file of that directory. The file format is:

The first line contains 0, the version of the checkpoint file format.

The second line contains the number of entries, i.e. the number of partitions that currently have data in this directory.

Each of the following lines is one entry of the form "topic partition offset".

private def checkpointLogsInDir(dir: File): Unit = {
  val recoveryPoints = this.logsByDir.get(dir.toString)
  if (recoveryPoints.isDefined) {
    this.recoveryPointCheckpoints(dir).write(recoveryPoints.get.mapValues(_.recoveryPoint))
  }
}
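
To make the file format above concrete, here is a minimal sketch that writes a checkpoint in that layout; writeCheckpoint is a hypothetical helper, and Kafka's real checkpoint class additionally writes to a temporary file and renames it atomically.

import java.io.{File, PrintWriter}

def writeCheckpoint(file: File, offsets: Map[(String, Int), Long]): Unit = {
  val writer = new PrintWriter(file)
  try {
    writer.println(0)                 // line 1: checkpoint file format version
    writer.println(offsets.size)      // line 2: number of entries in this directory
    offsets.foreach { case ((topic, partition), offset) =>
      writer.println(s"$topic $partition $offset")   // one "topic partition offset" per line
    }
  } finally writer.close()
}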


Finally, the LogCleaner instance performs log compaction periodically. The actual work is done by CleanerThread threads and is governed by the following configuration items (a simplified sketch of the dedupe idea follows the list):

  1. log.cleaner.io.max.bytes.per.second caps the I/O rate of the cleaner threads; by default the rate is unlimited.
  2. log.cleaner.dedupe.buffer.size, default 128 MB, is the memory buffer used to deduplicate records by key during compaction.
  3. log.cleaner.threads, default 1, is the number of cleaner threads.
  4. log.cleaner.backoff.ms is how often the cleaner checks whether any log needs cleaning; the default is 15 seconds.
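
The dedupe step behind compaction can be pictured as building a map from record key to the newest offset seen, and then keeping only the records that are still the newest for their key. The sketch below shows only that idea; it is not Kafka's actual offset-map implementation, and Record/compact are made-up names.

// Keep only the most recent record for each key (the essence of log compaction's dedupe pass).
case class Record(key: String, offset: Long, value: String)

def compact(records: Seq[Record]): Seq[Record] = {
  // pass 1: remember the latest offset per key (in Kafka this lives in the dedupe buffer)
  val latestOffset: Map[String, Long] =
    records.groupBy(_.key).map { case (k, rs) => k -> rs.map(_.offset).max }
  // pass 2: keep a record only if it is the newest one for its key
  records.filter(r => latestOffset(r.key) == r.offset)
}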

