AV1 Code Study: the av1_first_pass Function

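By default, AV1 encoding is split into two passes. The first pass mainly collects statistics that guide and speed up the second pass. The entry point of the first pass is the av1_first_pass function, and the process is roughly as follows: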

The first pass encodes only the luma component.

In the first pass, each frame is processed in raster-scan order, in blocks of 16x16.
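The raster-scan traversal can be sketched on its own as follows. This is a simplified illustration rather than libaom code: FP_BLOCK, fp_units, fp_block_offset and fp_raster_scan are hypothetical names, and the block counts are assumed to be the frame dimensions rounded up to a multiple of 16.

```c
#include <stddef.h>

#define FP_BLOCK 16 /* hypothetical constant standing in for BLOCK_16X16 */

/* Number of 16x16 units needed to cover `pixels` samples, rounding up. */
static int fp_units(int pixels) { return (pixels + FP_BLOCK - 1) / FP_BLOCK; }

/* Sample offset of the top-left corner of block (mb_row, mb_col). */
static size_t fp_block_offset(int mb_row, int mb_col, int stride) {
  return (size_t)mb_row * FP_BLOCK * (size_t)stride +
         (size_t)mb_col * FP_BLOCK;
}

/* Visits every block in raster order (left to right, then top to bottom)
 * and returns how many blocks were visited. `visit` may be NULL. */
static int fp_raster_scan(int width, int height, int stride,
                          void (*visit)(size_t offset, void *ctx), void *ctx) {
  const int mb_rows = fp_units(height);
  const int mb_cols = fp_units(width);
  int visited = 0;
  for (int mb_row = 0; mb_row < mb_rows; ++mb_row) {
    for (int mb_col = 0; mb_col < mb_cols; ++mb_col) {
      if (visit != NULL) visit(fp_block_offset(mb_row, mb_col, stride), ctx);
      ++visited;
    }
  }
  return visited;
}
```

Under these assumptions a 1920x1080 frame is covered by 120 x 68 blocks, so the callback would run 8160 times.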

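Intra prediction for the first pass is done in the firstpass_intra_prediction function. It uses the DC intra prediction mode and returns the intra prediction error (the sum of squared intra prediction residuals).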
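To make the idea of "DC prediction error as a sum of squared residuals" concrete, here is a stand-alone sketch for 8-bit luma. It is not the libaom implementation: fp_dc_value and fp_dc_intra_sse are hypothetical helpers, and for simplicity the neighbours are read from the same buffer as the block, whereas a real encoder predicts from reconstructed pixels.

```c
#include <stdint.h>

#define FP_BLK 16 /* block edge length used by the first pass */

/* DC predictor: rounded mean of the available above-row and left-column
 * neighbours; 128 (mid-grey for 8-bit video) when neither is available.
 * Assumption: neighbours live in the same buffer as the block itself. */
static int fp_dc_value(const uint8_t *blk, int stride, int have_above,
                       int have_left) {
  int sum = 0, n = 0;
  if (have_above) {
    for (int c = 0; c < FP_BLK; ++c) sum += blk[c - stride];
    n += FP_BLK;
  }
  if (have_left) {
    for (int r = 0; r < FP_BLK; ++r) sum += blk[r * stride - 1];
    n += FP_BLK;
  }
  return n ? (sum + n / 2) / n : 128;
}

/* Sum of squared differences between the block and its DC prediction. */
static int64_t fp_dc_intra_sse(const uint8_t *blk, int stride, int have_above,
                               int have_left) {
  const int dc = fp_dc_value(blk, stride, have_above, have_left);
  int64_t sse = 0;
  for (int r = 0; r < FP_BLK; ++r) {
    for (int c = 0; c < FP_BLK; ++c) {
      const int d = blk[r * stride + c] - dc;
      sse += (int64_t)d * d;
    }
  }
  return sse;
}
```

A perfectly flat block whose neighbours have the same value yields an error of zero, which is the kind of block the intra_skip_pct statistic is meant to count.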

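Inter prediction for the first pass is done in the firstpass_inter_prediction function. It uses LAST_FRAME and GOLDEN_FRAME as reference frames (the listing below additionally passes an alt-ref frame peeked from the lookahead buffer) and returns the inter prediction error.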
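The way per-block intra and inter errors feed the frame totals can be sketched as below. FpTotals, fp_add_block and fp_pcnt_inter are hypothetical names, and the strict "inter < intra" comparison follows the wording of the pcnt_inter comment in the statistics structure; the actual libaom bookkeeping tracks more fields than this.

```c
#include <stdint.h>

/* Hypothetical per-frame accumulator for the first pass. */
typedef struct {
  int64_t intra_error; /* sum of intra errors over all blocks */
  int64_t coded_error; /* sum of min(intra, inter) over all blocks */
  int inter_count;     /* blocks where inter beat intra */
  int block_count;     /* blocks processed so far */
} FpTotals;

/* Adds one block: the coded error is the better of the two predictions. */
static void fp_add_block(FpTotals *t, int intra_err, int inter_err) {
  t->intra_error += intra_err;
  ++t->block_count;
  if (inter_err < intra_err) {
    t->coded_error += inter_err;
    ++t->inter_count;
  } else {
    t->coded_error += intra_err;
  }
}

/* Fraction (0..1) of blocks for which inter prediction was better. */
static double fp_pcnt_inter(const FpTotals *t) {
  return t->block_count ? (double)t->inter_count / t->block_count : 0.0;
}
```

For an intra-only frame there is no inter error, which matches the else branch in the listing where the intra error is added to all of the coded-error fields.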

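Once the whole current frame has been predicted, update_firstpass_stats is called to update the first-pass statistics. It updates twopass->total_stats (the accumulated statistics) and twopass->stats_buf_ctx->stats_in_end (a pointer to the current frame's statistics).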
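A rough sketch of this "store the frame's statistics and fold them into a running total" pattern is shown below. MiniStats, MiniStatsCtx and mini_push are hypothetical, heavily reduced stand-ins; the real structure has many more fields and the real buffer handling in libaom may differ.

```c
/* Hypothetical, reduced version of the per-frame statistics. */
typedef struct {
  double intra_error;
  double coded_error;
  double duration;
  double count; /* 1.0 for a single frame, N for a sum of N frames */
} MiniStats;

/* Hypothetical buffer context: a write pointer plus a running total. */
typedef struct {
  MiniStats *stats_in_end; /* slot where the next frame's stats are written */
  MiniStats *buf_end;      /* one past the last slot of the buffer */
  MiniStats total;         /* accumulated statistics over all frames */
} MiniStatsCtx;

/* Stores one frame's stats and adds them to the total.
 * Returns 1 on success, 0 if the buffer is full. */
static int mini_push(MiniStatsCtx *ctx, const MiniStats *frame) {
  if (ctx->stats_in_end >= ctx->buf_end) return 0;
  *ctx->stats_in_end++ = *frame;
  ctx->total.intra_error += frame->intra_error;
  ctx->total.coded_error += frame->coded_error;
  ctx->total.duration += frame->duration;
  ctx->total.count += frame->count;
  return 1;
}
```

After N successful pushes the total's count field equals N, which is why the count field of an accumulated record can be used to turn sums back into per-frame averages.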

The first-pass statistics are defined as follows:

typedef struct {
  // Frame number in display order, if stats are for a single frame.
  // No real meaning for a collection of frames.
  double frame;
  // Weight assigned to this frame (or total weight for the collection of
  // frames) currently based on intra factor and brightness factor. This is used
  // to distribute bits between easier and harder frames.
  double weight;
  // Intra prediction error
  // (the sum of squared intra prediction residuals).
  double intra_error;
  // Average wavelet energy computed using Discrete Wavelet Transform (DWT).
  double frame_avg_wavelet_energy;
  // Best of intra pred error and inter pred error using last frame as ref.
  double coded_error;
  // Best of intra pred error and inter pred error using golden frame as ref.
  double sr_coded_error;
  // Best of intra pred error and inter pred error using altref frame as ref.
  double tr_coded_error;
  // Percentage of blocks with inter pred error < intra pred error.
  double pcnt_inter;
  // Percentage of blocks using (inter prediction and) non-zero motion vectors.
  double pcnt_motion;
  // Percentage of blocks where golden frame was better than last or intra:
  // inter pred error using golden frame < inter pred error using last frame and
  // inter pred error using golden frame < intra pred error
  double pcnt_second_ref;
  // Percentage of blocks where altref frame was better than intra, last, golden
  double pcnt_third_ref;
  // Percentage of blocks where intra and inter prediction errors were very
  // close. Note that this is a 'weighted count', that is, such blocks may be
  // weighted by how close the two errors were.
  double pcnt_neutral;
  // Percentage of blocks that have almost no intra error residual
  // (i.e. are in effect completely flat and untextured in the intra
  // domain). In natural videos this is uncommon, but it is much more
  // common in animations, graphics and screen content, so may be used
  // as a signal to detect these types of content.
  double intra_skip_pct;
  // Image mask rows top and bottom.
  double inactive_zone_rows;
  // Image mask columns at left and right edges.
  double inactive_zone_cols;
  // Average of row motion vectors.
  double MVr;
  // Mean of absolute value of row motion vectors.
  double mvr_abs;
  // Mean of column motion vectors.
  double MVc;
  // Mean of absolute value of column motion vectors.
  double mvc_abs;
  // Variance of row motion vectors.
  double MVrv;
  // Variance of column motion vectors.
  double MVcv;
  // Value in range [-1,1] indicating fraction of row and column motion vectors
  // that point inwards (negative MV value) or outwards (positive MV value).
  // For example, value of 1 indicates, all row/column MVs are inwards.
  double mv_in_out_count;
  // Count of unique non-zero motion vectors.
  double new_mv_count;
  // Duration of the frame / collection of frames.
  double duration;
  // 1.0 if stats are for a single frame, OR
  // Number of frames in this collection for which the stats are accumulated.
  double count;
  // Standard deviation for (0, 0) motion prediction error.
  double raw_error_stdev;
} FIRSTPASS_STATS;
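The motion vector fields (MVr, mvr_abs, MVrv and their column counterparts) are a mean, a mean of absolute values and a variance. One way to derive all three from running sums in a single loop is sketched below; MvComponentStats and mv_component_stats are hypothetical names, and the exact formulas used by libaom are not shown in this article.

```c
#include <stdint.h>

/* Hypothetical summary of one motion vector component (row or column). */
typedef struct {
  double mean;     /* corresponds to MVr / MVc */
  double abs_mean; /* corresponds to mvr_abs / mvc_abs */
  double variance; /* corresponds to MVrv / MVcv (population variance) */
} MvComponentStats;

/* Computes mean, mean absolute value and variance from running sums. */
static MvComponentStats mv_component_stats(const int *mv, int n) {
  MvComponentStats s = { 0.0, 0.0, 0.0 };
  if (n <= 0) return s; /* no motion vectors: leave everything at zero */
  int64_t sum = 0, sum_abs = 0, sum_sq = 0;
  for (int i = 0; i < n; ++i) {
    sum += mv[i];
    sum_abs += mv[i] < 0 ? -(int64_t)mv[i] : mv[i];
    sum_sq += (int64_t)mv[i] * mv[i];
  }
  s.mean = (double)sum / n;
  s.abs_mean = (double)sum_abs / n;
  /* Var(x) = E[x^2] - E[x]^2, written with sums to divide only once. */
  s.variance = ((double)sum_sq - (double)sum * (double)sum / n) / n;
  return s;
}
```

For the row components {1, -1, 3, -3} the mean is 0, the mean absolute value is 2 and the variance is 5: motion that cancels out on average still shows up in the other two fields.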

The code, with comments, is as follows:

#define FIRST_PASS_ALT_REF_DISTANCE 16
void av1_first_pass(AV1_COMP *cpi, const int64_t ts_duration) {
  MACROBLOCK *const x = &cpi->td.mb;
  AV1_COMMON *const cm = &cpi->common;
  const CommonModeInfoParams *const mi_params = &cm->mi_params;
  CurrentFrame *const current_frame = &cm->current_frame;
  const SequenceHeader *const seq_params = &cm->seq_params;
  const int num_planes = av1_num_planes(cm);
  MACROBLOCKD *const xd = &x->e_mbd;
  const PICK_MODE_CONTEXT *ctx = &cpi->td.pc_root->none;
  MV last_mv = kZeroMv;
  const int qindex = find_fp_qindex(seq_params->bit_depth);
  // Detect if the key frame is screen content type.
  if (frame_is_intra_only(cm)) {
    FeatureFlags *const features = &cm->features;
    av1_set_screen_content_options(cpi, features);
    cpi->is_screen_content_type = features->allow_screen_content_tools;
  }
  // First pass coding proceeds in raster scan order with unit size of 16x16.
  const BLOCK_SIZE fp_block_size = BLOCK_16X16;
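  // Note: the width is read from block_size_high and the height from
  // block_size_wide. This is harmless only because the block is square.
  const int fp_block_size_width = block_size_high[fp_block_size];
  const int fp_block_size_height = block_size_wide[fp_block_size];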
  int *raw_motion_err_list;
  int raw_motion_err_counts = 0;
  CHECK_MEM_ERROR(cm, raw_motion_err_list,
                  aom_calloc(mi_params->mb_rows * mi_params->mb_cols,
                             sizeof(*raw_motion_err_list)));
  // Tiling is ignored in the first pass.
  TileInfo tile;
  av1_tile_init(&tile, cm, 0, 0);
  FRAME_STATS stats = { 0 };
  stats.image_data_start_row = INVALID_ROW;

  const YV12_BUFFER_CONFIG *const last_frame =
      get_ref_frame_yv12_buf(cm, LAST_FRAME);  // most recent reference frame
  const YV12_BUFFER_CONFIG *golden_frame =
      get_ref_frame_yv12_buf(cm, GOLDEN_FRAME);  // golden frame
  const YV12_BUFFER_CONFIG *alt_ref_frame = NULL;
  const int alt_ref_offset =
      FIRST_PASS_ALT_REF_DISTANCE -
      (current_frame->frame_number % FIRST_PASS_ALT_REF_DISTANCE);
  if (alt_ref_offset < FIRST_PASS_ALT_REF_DISTANCE) {
    const struct lookahead_entry *const alt_ref_frame_buffer =
        av1_lookahead_peek(cpi->lookahead, alt_ref_offset,
                           cpi->compressor_stage);
    if (alt_ref_frame_buffer != NULL) {
      alt_ref_frame = &alt_ref_frame_buffer->img;
    }
  }
  YV12_BUFFER_CONFIG *const this_frame = &cm->cur_frame->buf;  // current frame
  // First pass code requires valid last and new frame buffers.
  assert(this_frame != NULL);
  assert(frame_is_intra_only(cm) || (last_frame != NULL));

  av1_setup_frame_size(cpi);
  aom_clear_system_state();

  set_mi_offsets(mi_params, xd, 0, 0);
  xd->mi[0]->sb_type = fp_block_size;

  // Do not use periodic key frames.
  cpi->rc.frames_to_key = INT_MAX;

  av1_set_quantizer(cm, cpi->oxcf.qm_minlevel, cpi->oxcf.qm_maxlevel, qindex);

  av1_setup_block_planes(xd, seq_params->subsampling_x,
                         seq_params->subsampling_y, num_planes);

  av1_setup_src_planes(x, cpi->source, 0, 0, num_planes, fp_block_size);
  av1_setup_dst_planes(xd->plane, seq_params->sb_size, this_frame, 0, 0, 0,
                       num_planes);

  if (!frame_is_intra_only(cm)) {
    av1_setup_pre_planes(xd, 0, last_frame, 0, 0, NULL, num_planes);
  }

  set_mi_offsets(mi_params, xd, 0, 0);

  // Don't store luma on the first pass since chroma is not computed
  xd->cfl.store_y = 0;
  av1_frame_init_quantizer(cpi);

  for (int i = 0; i < num_planes; ++i) {
    x->plane[i].coeff = ctx->coeff[i];
    x->plane[i].qcoeff = ctx->qcoeff[i];
    x->plane[i].eobs = ctx->eobs[i];
    x->plane[i].txb_entropy_ctx = ctx->txb_entropy_ctx[i];
    xd->plane[i].dqcoeff = ctx->dqcoeff[i];
  }

  av1_init_mv_probs(cm);
  av1_initialize_rd_consts(cpi);

  const int src_y_stride = cpi->source->y_stride;
  const int recon_y_stride = this_frame->y_stride;
  const int recon_uv_stride = this_frame->uv_stride;
  const int uv_mb_height =
      fp_block_size_height >> (this_frame->y_height > this_frame->uv_height);

  for (int mb_row = 0; mb_row < mi_params->mb_rows; ++mb_row) {
    MV best_ref_mv = kZeroMv;

    // Reset above block coeffs.
    xd->up_available = (mb_row != 0);
    int recon_yoffset = (mb_row * recon_y_stride * fp_block_size_height);
    int src_yoffset = (mb_row * src_y_stride * fp_block_size_height);
    int recon_uvoffset = (mb_row * recon_uv_stride * uv_mb_height);
    int alt_ref_frame_yoffset =
        (alt_ref_frame != NULL)
            ? mb_row * alt_ref_frame->y_stride * fp_block_size_height
            : -1;

    // Set up limit values for motion vectors to prevent them extending
    // outside the UMV borders.
    av1_set_mv_row_limits(mi_params, &x->mv_limits, (mb_row << 2),
                          (fp_block_size_height >> MI_SIZE_LOG2),
                          cpi->oxcf.border_in_pixels);

    for (int mb_col = 0; mb_col < mi_params->mb_cols; ++mb_col) {
      int this_intra_error = firstpass_intra_prediction(
          cpi, this_frame, &tile, mb_row, mb_col, recon_yoffset, recon_uvoffset,
          fp_block_size, qindex, &stats);  // returns the intra prediction error

      if (!frame_is_intra_only(cm)) {
        const int this_inter_error = firstpass_inter_prediction(
            cpi, last_frame, golden_frame, alt_ref_frame, mb_row, mb_col,
            recon_yoffset, recon_uvoffset, src_yoffset, alt_ref_frame_yoffset,
            fp_block_size, this_intra_error, raw_motion_err_counts,
            raw_motion_err_list, &best_ref_mv, &last_mv, &stats);
        stats.coded_error += this_inter_error;
        ++raw_motion_err_counts;
      } else {
        stats.sr_coded_error += this_intra_error;
        stats.tr_coded_error += this_intra_error;
        stats.coded_error += this_intra_error;
      }

      // Adjust to the next column of MBs.
      x->plane[0].src.buf += fp_block_size_width;
      x->plane[1].src.buf += uv_mb_height;
      x->plane[2].src.buf += uv_mb_height;

      recon_yoffset += fp_block_size_width;
      src_yoffset += fp_block_size_width;
      recon_uvoffset += uv_mb_height;
      alt_ref_frame_yoffset += fp_block_size_width;
    }
    // Adjust to the next row of MBs.
    x->plane[0].src.buf += fp_block_size_height * x->plane[0].src.stride -
                           fp_block_size_width * mi_params->mb_cols;
    x->plane[1].src.buf += uv_mb_height * x->plane[1].src.stride -
                           uv_mb_height * mi_params->mb_cols;
    x->plane[2].src.buf += uv_mb_height * x->plane[1].src.stride -
                           uv_mb_height * mi_params->mb_cols;
  }
  const double raw_err_stdev =
      raw_motion_error_stdev(raw_motion_err_list, raw_motion_err_counts);
  aom_free(raw_motion_err_list);

  // Clamp the image start to rows/2. This number of rows is discarded top
  // and bottom as dead data so rows / 2 means the frame is blank.
  if ((stats.image_data_start_row > mi_params->mb_rows / 2) ||
      (stats.image_data_start_row == INVALID_ROW)) {
    stats.image_data_start_row = mi_params->mb_rows / 2;
  }
  // Exclude any image dead zone
  if (stats.image_data_start_row > 0) {
    stats.intra_skip_count =
        AOMMAX(0, stats.intra_skip_count -
                      (stats.image_data_start_row * mi_params->mb_cols * 2));
  }

  TWO_PASS *twopass = &cpi->twopass;
  const int num_mbs = (cpi->oxcf.resize_mode != RESIZE_NONE) ? cpi->initial_mbs
                                                             : mi_params->MBs;
  stats.intra_factor = stats.intra_factor / (double)num_mbs;
  stats.brightness_factor = stats.brightness_factor / (double)num_mbs;
  FIRSTPASS_STATS *this_frame_stats = twopass->stats_buf_ctx->stats_in_end;
  update_firstpass_stats(cpi, &stats, raw_err_stdev,
                         current_frame->frame_number, ts_duration);

  // Copy the previous Last Frame back into gf buffer if the prediction is good
  // enough... but also don't allow it to lag too far.
  if ((twopass->sr_update_lag > 3) ||
      ((current_frame->frame_number > 0) &&
       (this_frame_stats->pcnt_inter > 0.20) &&
       ((this_frame_stats->intra_error /
         DOUBLE_DIVIDE_CHECK(this_frame_stats->coded_error)) > 2.0))) {
    if (golden_frame != NULL) {
      assign_frame_buffer_p(
          &cm->ref_frame_map[get_ref_frame_map_idx(cm, GOLDEN_FRAME)],
          cm->ref_frame_map[get_ref_frame_map_idx(cm, LAST_FRAME)]);
    }
    twopass->sr_update_lag = 1;
  } else {
    ++twopass->sr_update_lag;
  }

  aom_extend_frame_borders(this_frame, num_planes);

  // The frame we just compressed now becomes the last frame.
  assign_frame_buffer_p(
      &cm->ref_frame_map[get_ref_frame_map_idx(cm, LAST_FRAME)], cm->cur_frame);

  // Special case for the first frame. Copy into the GF buffer as a second
  // reference.
  if (current_frame->frame_number == 0 &&
      get_ref_frame_map_idx(cm, GOLDEN_FRAME) != INVALID_IDX) {
    assign_frame_buffer_p(
        &cm->ref_frame_map[get_ref_frame_map_idx(cm, GOLDEN_FRAME)],
        cm->ref_frame_map[get_ref_frame_map_idx(cm, LAST_FRAME)]);
  }

  print_reconstruction_frame(last_frame, current_frame->frame_number,
                             /*do_print=*/0);

  ++current_frame->frame_number;
}
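
Two small pieces of arithmetic in the function above are easy to misread, so they are isolated here as stand-alone helpers: the lookahead offset of the alt-ref candidate, and the condition under which the golden buffer is refreshed from the last frame. The function names are hypothetical, and fp_safe_div is only an assumed stand-in for DOUBLE_DIVIDE_CHECK, whose definition is not shown in this article.

```c
#define ALT_REF_DISTANCE 16 /* mirrors FIRST_PASS_ALT_REF_DISTANCE */

/* Lookahead offset of the alt-ref candidate for a given frame number,
 * or -1 when the offset would equal the distance and no candidate is used
 * (this happens when the frame number is a multiple of the distance). */
static int fp_alt_ref_offset(unsigned frame_number) {
  const int offset = ALT_REF_DISTANCE - (int)(frame_number % ALT_REF_DISTANCE);
  return offset < ALT_REF_DISTANCE ? offset : -1;
}

/* Assumed guard against division by zero: nudges the denominator away
 * from zero by a tiny epsilon. Hypothetical, not the libaom macro. */
static double fp_safe_div(double num, double den) {
  return num / (den < 0 ? den - 1e-6 : den + 1e-6);
}

/* Returns 1 when the golden buffer should be refreshed from the last frame:
 * either the refresh has lagged more than 3 frames, or this is not the
 * first frame, more than 20% of blocks favoured inter prediction, and the
 * intra error is more than twice the coded error. */
static int fp_should_refresh_golden(int sr_update_lag, unsigned frame_number,
                                    double pcnt_inter, double intra_error,
                                    double coded_error) {
  return sr_update_lag > 3 ||
         (frame_number > 0 && pcnt_inter > 0.20 &&
          fp_safe_div(intra_error, coded_error) > 2.0);
}
```

With a distance of 16, frame 1 looks 15 frames ahead and frame 15 looks 1 frame ahead, so the alt-ref candidate always sits on the next multiple of 16; frames 0, 16, 32 and so on get no candidate.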
