AV1 code study: the av1_first_pass function

By default, AV1 encoding is split into two passes. The first pass mainly collects statistics that are used to speed up the second pass. The entry point of the first pass is the av1_first_pass function. The first pass works roughly as follows:

Only the luma component is encoded in the first pass.

In the first pass, each frame is processed in raster scan order, in blocks of 16x16.
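As a rough illustration of that traversal, here is a minimal sketch of a raster scan over 16x16 units. The names fp_units, fp_block_fn, fp_remember_offset and fp_raster_scan are made up for this example and are not encoder functions, and rounding the block count up at the frame edges is an assumption.

```c
#include <assert.h>
#include <stddef.h>

enum { FP_BLOCK = 16 }; /* first-pass unit size in luma pixels */

/* Number of 16x16 units needed to cover `pixels`, rounding up (assumed). */
static int fp_units(int pixels) {
  assert(pixels > 0);
  return (pixels + FP_BLOCK - 1) / FP_BLOCK;
}

/* Hypothetical per-block callback: receives the block position and the
 * offset of its top-left luma sample inside the frame buffer. */
typedef void (*fp_block_fn)(int mb_row, int mb_col, size_t y_offset,
                            void *user);

/* Example callback: remembers the offset of the most recent block. */
static void fp_remember_offset(int mb_row, int mb_col, size_t y_offset,
                               void *user) {
  (void)mb_row;
  (void)mb_col;
  *(size_t *)user = y_offset;
}

/* Visits every block in raster order and returns how many were visited. */
static int fp_raster_scan(int width, int height, int y_stride, fp_block_fn fn,
                          void *user) {
  const int mb_rows = fp_units(height);
  const int mb_cols = fp_units(width);
  int visited = 0;
  for (int mb_row = 0; mb_row < mb_rows; ++mb_row) {
    /* Each new row of blocks starts 16 luma lines further down. */
    size_t y_offset = (size_t)mb_row * FP_BLOCK * (size_t)y_stride;
    for (int mb_col = 0; mb_col < mb_cols; ++mb_col) {
      if (fn != NULL) fn(mb_row, mb_col, y_offset, user);
      y_offset += FP_BLOCK; /* step to the next column */
      ++visited;
    }
  }
  return visited;
}
```

The row offset is recomputed from the stride at the start of each row and then advanced by 16 per column, which mirrors how recon_yoffset and src_yoffset are handled in the real function.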

The firstpass_intra_prediction function performs the intra prediction of the first pass. The only intra prediction mode used is the DC mode, and the function returns the intra prediction error (the sum of squares of the intra prediction residuals).
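To make the idea of a DC prediction error concrete, here is a minimal sketch for an 8-bit 16x16 block. It is not the encoder's implementation: dc_value and dc_intra_error are hypothetical helpers, the fallback value of 128 when no neighbours exist is an assumption for 8-bit content, and the error is measured directly on the prediction residual.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { BLK = 16 }; /* block width and height in pixels */

/* DC predictor: rounded mean of the available neighbouring pixels.
 * `above` and `left` each point to BLK pixels, or are NULL at frame edges. */
static int dc_value(const uint8_t *above, const uint8_t *left) {
  int sum = 0;
  int count = 0;
  if (above != NULL) {
    for (int i = 0; i < BLK; ++i) sum += above[i];
    count += BLK;
  }
  if (left != NULL) {
    for (int i = 0; i < BLK; ++i) sum += left[i];
    count += BLK;
  }
  if (count == 0) return 128; /* assumed mid-grey fallback for 8-bit video */
  return (sum + count / 2) / count;
}

/* Sum of squared differences between the source block and its DC predictor. */
static int64_t dc_intra_error(const uint8_t *src, int stride,
                              const uint8_t *above, const uint8_t *left) {
  assert(src != NULL && stride >= BLK);
  const int dc = dc_value(above, left);
  int64_t sse = 0;
  for (int r = 0; r < BLK; ++r) {
    for (int c = 0; c < BLK; ++c) {
      const int diff = src[r * stride + c] - dc;
      sse += (int64_t)diff * diff;
    }
  }
  return sse;
}
```

A flat block whose pixels are all 2 above the predictor gives 256 * 4 = 1024, which shows why flat content yields very small intra errors.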

The firstpass_inter_prediction function performs the inter prediction of the first pass and returns the inter prediction error. It mainly uses two reference frames, LAST_FRAME and GOLDEN_FRAME; in the code below it also receives an alt-ref frame taken from the lookahead buffer when one is available.
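The following sketch shows the simplest form of such an inter error: the zero-motion sum of squared differences against each available reference, keeping the smallest. This is a deliberate simplification, since the real function also runs a motion search. RefBlock, block_sse and best_inter_error are hypothetical names introduced only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { INTER_BLK = 16 }; /* block width and height in pixels */

/* Hypothetical view of the co-located block in one reference frame. */
typedef struct {
  const uint8_t *buf; /* NULL when the reference is not available */
  int stride;
} RefBlock;

/* Sum of squared differences between two 16x16 blocks. */
static int64_t block_sse(const uint8_t *a, int a_stride, const uint8_t *b,
                         int b_stride) {
  int64_t sse = 0;
  for (int r = 0; r < INTER_BLK; ++r) {
    for (int c = 0; c < INTER_BLK; ++c) {
      const int diff = a[r * a_stride + c] - b[r * b_stride + c];
      sse += (int64_t)diff * diff;
    }
  }
  return sse;
}

/* Returns the smallest zero-motion error over all available references and
 * stores the index of the winning reference in *best_ref (-1 if none). */
static int64_t best_inter_error(const uint8_t *src, int src_stride,
                                const RefBlock *refs, int num_refs,
                                int *best_ref) {
  assert(src != NULL && refs != NULL && best_ref != NULL);
  int64_t best = INT64_MAX;
  *best_ref = -1;
  for (int i = 0; i < num_refs; ++i) {
    if (refs[i].buf == NULL) continue; /* skip missing references */
    const int64_t err = block_sse(src, src_stride, refs[i].buf, refs[i].stride);
    if (err < best) {
      best = err;
      *best_ref = i;
    }
  }
  return best;
}
```

With index 0 standing for the last frame and index 1 for the golden frame, a caller can compare the returned error with the intra error of the same block to decide which one counts as the coded error.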

After the whole current frame has been predicted, the update_firstpass_stats function is called to update the statistics of the first pass. This function updates twopass->total_stats (the cumulative statistics) and twopass->stats_buf_ctx->stats_in_end (a pointer to the statistics of the current frame).
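The bookkeeping can be pictured with a reduced example. MiniStats is a hypothetical three-field subset of the real statistics structure, and push_frame_stats assumes that the end pointer simply advances by one slot per frame; the encoder's actual buffer handling may differ.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical subset of the per-frame statistics. */
typedef struct {
  double intra_error;
  double coded_error;
  double count; /* 1.0 for a single frame, N for a sum over N frames */
} MiniStats;

/* Adds one frame's statistics to the running total. */
static void accumulate_stats(MiniStats *total, const MiniStats *frame) {
  total->intra_error += frame->intra_error;
  total->coded_error += frame->coded_error;
  total->count += frame->count;
}

/* Stores `frame` in the slot at `stats_in_end`, folds it into `total`,
 * and returns the new end pointer (one slot further, by assumption). */
static MiniStats *push_frame_stats(MiniStats *stats_in_end, MiniStats *total,
                                   const MiniStats *frame) {
  assert(stats_in_end != NULL && total != NULL && frame != NULL);
  *stats_in_end = *frame;
  accumulate_stats(total, frame);
  return stats_in_end + 1;
}
```

Keeping both the per-frame records and the running total means the second pass can look at individual frames as well as at the whole sequence.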

The first-pass statistics are defined as follows:

typedef struct {
  // Frame number in display order, if stats are for a single frame.
  // No real meaning for a collection of frames.
  double frame;
  // Weight assigned to this frame (or total weight for the collection of
  // frames) currently based on intra factor and brightness factor. This is used
  // to distribute bits between easier and harder frames.
  double weight;
  // Intra prediction error (sum of squares of the intra prediction residuals).
  double intra_error;
  // Average wavelet energy computed using Discrete Wavelet Transform (DWT).
  double frame_avg_wavelet_energy;
  // Best of intra pred error and inter pred error using last frame as ref.
  double coded_error;
  // Best of intra pred error and inter pred error using golden frame as ref.
  double sr_coded_error;
  // Best of intra pred error and inter pred error using altref frame as ref.
  double tr_coded_error;
  // Percentage of blocks with inter pred error < intra pred error.
  double pcnt_inter;
  // Percentage of blocks using (inter prediction and) non-zero motion vectors.
  double pcnt_motion;
  // Percentage of blocks where golden frame was better than last or intra:
  // inter pred error using golden frame < inter pred error using last frame and
  // inter pred error using golden frame < intra pred error
  double pcnt_second_ref;
  // Percentage of blocks where altref frame was better than intra, last, golden
  double pcnt_third_ref;
  // Percentage of blocks where intra and inter prediction errors were very
  // close. Note that this is a 'weighted count', that is, such blocks may be
  // weighted by how close the two errors were.
  double pcnt_neutral;
  // Percentage of blocks that have almost no intra error residual
  // (i.e. are in effect completely flat and untextured in the intra
  // domain). In natural videos this is uncommon, but it is much more
  // common in animations, graphics and screen content, so may be used
  // as a signal to detect these types of content.
  double intra_skip_pct;
  // Image mask rows top and bottom.
  double inactive_zone_rows;
  // Image mask columns at left and right edges.
  double inactive_zone_cols;
  // Average of row motion vectors.
  double MVr;
  // Mean of absolute value of row motion vectors.
  double mvr_abs;
  // Mean of column motion vectors.
  double MVc;
  // Mean of absolute value of column motion vectors.
  double mvc_abs;
  // Variance of row motion vectors.
  double MVrv;
  // Variance of column motion vectors.
  double MVcv;
  // Value in range [-1,1] indicating fraction of row and column motion vectors
  // that point inwards (negative MV value) or outwards (positive MV value).
  // For example, a value of 1 indicates that all row/column MVs are inwards.
  double mv_in_out_count;
  // Count of unique non-zero motion vectors.
  double new_mv_count;
  // Duration of the frame / collection of frames.
  double duration;
  // 1.0 if stats are for a single frame, OR
  // Number of frames in this collection for which the stats are accumulated.
  double count;
  // Standard deviation for (0, 0) motion prediction error.
  double raw_error_stdev;
} FIRSTPASS_STATS;
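Although the comments speak of percentages, the av1_first_pass code compares pcnt_inter against 0.20, which suggests that the pcnt_* fields hold fractions in the range [0, 1]. Under that assumption, a minimal sketch of how raw block counts could be normalized looks like this. BlockCounts, BlockFractions and counts_to_fractions are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical raw counters gathered while scanning the blocks of a frame. */
typedef struct {
  int inter_count;      /* blocks where inter error < intra error */
  int mv_count;         /* blocks predicted with a non-zero motion vector */
  int second_ref_count; /* blocks where the golden frame was the best choice */
} BlockCounts;

/* The corresponding normalized values, each in [0, 1]. */
typedef struct {
  double pcnt_inter;
  double pcnt_motion;
  double pcnt_second_ref;
} BlockFractions;

/* Divides each counter by the number of 16x16 blocks in the frame. */
static BlockFractions counts_to_fractions(const BlockCounts *c, int num_mbs) {
  assert(c != NULL && num_mbs > 0);
  BlockFractions f;
  f.pcnt_inter = (double)c->inter_count / num_mbs;
  f.pcnt_motion = (double)c->mv_count / num_mbs;
  f.pcnt_second_ref = (double)c->second_ref_count / num_mbs;
  return f;
}
```

For a frame of 200 blocks in which 50 blocks prefer inter prediction, pcnt_inter would be 0.25, which is above the 0.20 threshold used in the function.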

The code, with comments, is as follows:

#define FIRST_PASS_ALT_REF_DISTANCE 16
void av1_first_pass(AV1_COMP *cpi, const int64_t ts_duration) {
  MACROBLOCK *const x = &cpi->td.mb;
  AV1_COMMON *const cm = &cpi->common;
  const CommonModeInfoParams *const mi_params = &cm->mi_params;
  CurrentFrame *const current_frame = &cm->current_frame;
  const SequenceHeader *const seq_params = &cm->seq_params;
  const int num_planes = av1_num_planes(cm);
  MACROBLOCKD *const xd = &x->e_mbd;
  const PICK_MODE_CONTEXT *ctx = &cpi->td.pc_root->none;
  MV last_mv = kZeroMv;
  const int qindex = find_fp_qindex(seq_params->bit_depth);
  // Detect if the key frame is screen content type.
  if (frame_is_intra_only(cm)) {
    FeatureFlags *const features = &cm->features;
    av1_set_screen_content_options(cpi, features);
    cpi->is_screen_content_type = features->allow_screen_content_tools;
  }
  // First pass coding proceeds in raster scan order with unit size of 16x16.
  const BLOCK_SIZE fp_block_size = BLOCK_16X16;
  const int fp_block_size_width = block_size_high[fp_block_size];
  const int fp_block_size_height = block_size_wide[fp_block_size];
  int *raw_motion_err_list;
  int raw_motion_err_counts = 0;
  CHECK_MEM_ERROR(cm, raw_motion_err_list,
                  aom_calloc(mi_params->mb_rows * mi_params->mb_cols,
                             sizeof(*raw_motion_err_list)));
  // Tiling is ignored in the first pass.
  TileInfo tile;
  av1_tile_init(&tile, cm, 0, 0);
  FRAME_STATS stats = { 0 };
  stats.image_data_start_row = INVALID_ROW;

  const YV12_BUFFER_CONFIG *const last_frame =
      get_ref_frame_yv12_buf(cm, LAST_FRAME);  // most recent reference frame
  const YV12_BUFFER_CONFIG *golden_frame =
      get_ref_frame_yv12_buf(cm, GOLDEN_FRAME);  // golden frame
  const YV12_BUFFER_CONFIG *alt_ref_frame = NULL;
  const int alt_ref_offset =
      FIRST_PASS_ALT_REF_DISTANCE -
      (current_frame->frame_number % FIRST_PASS_ALT_REF_DISTANCE);
  if (alt_ref_offset < FIRST_PASS_ALT_REF_DISTANCE) {
    const struct lookahead_entry *const alt_ref_frame_buffer =
        av1_lookahead_peek(cpi->lookahead, alt_ref_offset,
                           cpi->compressor_stage);
    if (alt_ref_frame_buffer != NULL) {
      alt_ref_frame = &alt_ref_frame_buffer->img;
    }
  }
  YV12_BUFFER_CONFIG *const this_frame = &cm->cur_frame->buf;  // current frame
  // First pass code requires valid last and new frame buffers.
  assert(this_frame != NULL);
  assert(frame_is_intra_only(cm) || (last_frame != NULL));

  av1_setup_frame_size(cpi);
  aom_clear_system_state();

  set_mi_offsets(mi_params, xd, 0, 0);
  xd->mi[0]->sb_type = fp_block_size;

  // Do not use periodic key frames.
  cpi->rc.frames_to_key = INT_MAX;

  av1_set_quantizer(cm, cpi->oxcf.qm_minlevel, cpi->oxcf.qm_maxlevel, qindex);

  av1_setup_block_planes(xd, seq_params->subsampling_x,
                         seq_params->subsampling_y, num_planes);

  av1_setup_src_planes(x, cpi->source, 0, 0, num_planes, fp_block_size);
  av1_setup_dst_planes(xd->plane, seq_params->sb_size, this_frame, 0, 0, 0,
                       num_planes);

  if (!frame_is_intra_only(cm)) {
    av1_setup_pre_planes(xd, 0, last_frame, 0, 0, NULL, num_planes);
  }

  set_mi_offsets(mi_params, xd, 0, 0);

  // Don't store luma on the first pass since chroma is not computed
  xd->cfl.store_y = 0;
  av1_frame_init_quantizer(cpi);

  for (int i = 0; i < num_planes; ++i) {
    x->plane[i].coeff = ctx->coeff[i];
    x->plane[i].qcoeff = ctx->qcoeff[i];
    x->plane[i].eobs = ctx->eobs[i];
    x->plane[i].txb_entropy_ctx = ctx->txb_entropy_ctx[i];
    xd->plane[i].dqcoeff = ctx->dqcoeff[i];
  }

  av1_init_mv_probs(cm);
  av1_initialize_rd_consts(cpi);

  const int src_y_stride = cpi->source->y_stride;
  const int recon_y_stride = this_frame->y_stride;
  const int recon_uv_stride = this_frame->uv_stride;
  const int uv_mb_height =
      fp_block_size_height >> (this_frame->y_height > this_frame->uv_height);

  for (int mb_row = 0; mb_row < mi_params->mb_rows; ++mb_row) {
    MV best_ref_mv = kZeroMv;

    // Reset above block coeffs.
    xd->up_available = (mb_row != 0);
    int recon_yoffset = (mb_row * recon_y_stride * fp_block_size_height);
    int src_yoffset = (mb_row * src_y_stride * fp_block_size_height);
    int recon_uvoffset = (mb_row * recon_uv_stride * uv_mb_height);
    int alt_ref_frame_yoffset =
        (alt_ref_frame != NULL)
            ? mb_row * alt_ref_frame->y_stride * fp_block_size_height
            : -1;

    // Set up limit values for motion vectors to prevent them extending
    // outside the UMV borders.
    av1_set_mv_row_limits(mi_params, &x->mv_limits, (mb_row << 2),
                          (fp_block_size_height >> MI_SIZE_LOG2),
                          cpi->oxcf.border_in_pixels);

    for (int mb_col = 0; mb_col < mi_params->mb_cols; ++mb_col) {
      int this_intra_error = firstpass_intra_prediction(
          cpi, this_frame, &tile, mb_row, mb_col, recon_yoffset, recon_uvoffset,
          fp_block_size, qindex, &stats);  // returns the intra prediction error

      if (!frame_is_intra_only(cm)) {
        const int this_inter_error = firstpass_inter_prediction(
            cpi, last_frame, golden_frame, alt_ref_frame, mb_row, mb_col,
            recon_yoffset, recon_uvoffset, src_yoffset, alt_ref_frame_yoffset,
            fp_block_size, this_intra_error, raw_motion_err_counts,
            raw_motion_err_list, &best_ref_mv, &last_mv, &stats);
        stats.coded_error += this_inter_error;
        ++raw_motion_err_counts;
      } else {
        stats.sr_coded_error += this_intra_error;
        stats.tr_coded_error += this_intra_error;
        stats.coded_error += this_intra_error;
      }

      // Adjust to the next column of MBs.
      x->plane[0].src.buf += fp_block_size_width;
      x->plane[1].src.buf += uv_mb_height;
      x->plane[2].src.buf += uv_mb_height;

      recon_yoffset += fp_block_size_width;
      src_yoffset += fp_block_size_width;
      recon_uvoffset += uv_mb_height;
      alt_ref_frame_yoffset += fp_block_size_width;
    }
    // Adjust to the next row of MBs.
    x->plane[0].src.buf += fp_block_size_height * x->plane[0].src.stride -
                           fp_block_size_width * mi_params->mb_cols;
    x->plane[1].src.buf += uv_mb_height * x->plane[1].src.stride -
                           uv_mb_height * mi_params->mb_cols;
    x->plane[2].src.buf += uv_mb_height * x->plane[1].src.stride -
                           uv_mb_height * mi_params->mb_cols;
  }
  const double raw_err_stdev =
      raw_motion_error_stdev(raw_motion_err_list, raw_motion_err_counts);
  aom_free(raw_motion_err_list);

  // Clamp the image start to rows/2. This number of rows is discarded top
  // and bottom as dead data so rows / 2 means the frame is blank.
  if ((stats.image_data_start_row > mi_params->mb_rows / 2) ||
      (stats.image_data_start_row == INVALID_ROW)) {
    stats.image_data_start_row = mi_params->mb_rows / 2;
  }
  // Exclude any image dead zone
  if (stats.image_data_start_row > 0) {
    stats.intra_skip_count =
        AOMMAX(0, stats.intra_skip_count -
                      (stats.image_data_start_row * mi_params->mb_cols * 2));
  }

  TWO_PASS *twopass = &cpi->twopass;
  const int num_mbs = (cpi->oxcf.resize_mode != RESIZE_NONE) ? cpi->initial_mbs
                                                             : mi_params->MBs;
  stats.intra_factor = stats.intra_factor / (double)num_mbs;
  stats.brightness_factor = stats.brightness_factor / (double)num_mbs;
  FIRSTPASS_STATS *this_frame_stats = twopass->stats_buf_ctx->stats_in_end;
  update_firstpass_stats(cpi, &stats, raw_err_stdev,
                         current_frame->frame_number, ts_duration);

  // Copy the previous Last Frame back into gf buffer if the prediction is good
  // enough... but also don't allow it to lag too far.
  if ((twopass->sr_update_lag > 3) ||
      ((current_frame->frame_number > 0) &&
       (this_frame_stats->pcnt_inter > 0.20) &&
       ((this_frame_stats->intra_error /
         DOUBLE_DIVIDE_CHECK(this_frame_stats->coded_error)) > 2.0))) {
    if (golden_frame != NULL) {
      assign_frame_buffer_p(
          &cm->ref_frame_map[get_ref_frame_map_idx(cm, GOLDEN_FRAME)],
          cm->ref_frame_map[get_ref_frame_map_idx(cm, LAST_FRAME)]);
    }
    twopass->sr_update_lag = 1;
  } else {
    ++twopass->sr_update_lag;
  }

  aom_extend_frame_borders(this_frame, num_planes);

  // The frame we just compressed now becomes the last frame.
  assign_frame_buffer_p(
      &cm->ref_frame_map[get_ref_frame_map_idx(cm, LAST_FRAME)], cm->cur_frame);

  // Special case for the first frame. Copy into the GF buffer as a second
  // reference.
  if (current_frame->frame_number == 0 &&
      get_ref_frame_map_idx(cm, GOLDEN_FRAME) != INVALID_IDX) {
    assign_frame_buffer_p(
        &cm->ref_frame_map[get_ref_frame_map_idx(cm, GOLDEN_FRAME)],
        cm->ref_frame_map[get_ref_frame_map_idx(cm, LAST_FRAME)]);
  }

  print_reconstruction_frame(last_frame, current_frame->frame_number,
                             /*do_print=*/0);

  ++current_frame->frame_number;
}
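Two small decisions in the function above are easy to misread, so here they are isolated as a sketch: where the alt-ref candidate sits in the lookahead, and when the golden buffer is refreshed from the last frame. alt_ref_lookahead_offset and should_refresh_golden are hypothetical helpers written for this example, and the epsilon guard is only assumed to behave like DOUBLE_DIVIDE_CHECK.

```c
#include <assert.h>
#include <stdbool.h>

#define FIRST_PASS_ALT_REF_DISTANCE 16

/* Lookahead offset of the alt-ref candidate, or -1 when none is used.
 * Frames 1..15 get offsets 15..1; multiples of 16 get no alt-ref frame. */
static int alt_ref_lookahead_offset(unsigned int frame_number) {
  const int offset =
      FIRST_PASS_ALT_REF_DISTANCE -
      (int)(frame_number % FIRST_PASS_ALT_REF_DISTANCE);
  return (offset < FIRST_PASS_ALT_REF_DISTANCE) ? offset : -1;
}

/* Mirrors the condition that copies the last frame into the golden buffer. */
static bool should_refresh_golden(int sr_update_lag, unsigned int frame_number,
                                  double pcnt_inter, double intra_error,
                                  double coded_error) {
  if (sr_update_lag > 3) return true; /* do not let the golden frame lag */
  if (frame_number == 0) return false;
  /* Small epsilon to avoid dividing by zero (assumed guard behaviour). */
  const double denom =
      (coded_error < 0.0) ? coded_error - 0.000001 : coded_error + 0.000001;
  return pcnt_inter > 0.20 && (intra_error / denom) > 2.0;
}
```

In other words, the golden frame is refreshed either because it has not been updated for more than three frames, or because inter prediction is clearly paying off: more than 20% of the blocks prefer it and the intra error is more than twice the coded error.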

 


Origin blog.csdn.net/BigDream123/article/details/109500527