AV1 code study: the av1_first_pass function

By default, the AV1 encoder splits encoding into two passes. The first pass mainly gathers statistics that are used to speed up and guide the second pass. The entry point of the first pass is the av1_first_pass function, and the pass proceeds roughly as follows:

Only the luma component is coded in the first pass.

In the first pass, each frame is coded in raster scan order, and every block is 16x16.
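
As a rough illustration of that traversal, the sketch below walks a frame in raster order in 16x16 units. The names fp_units and fp_raster_scan are hypothetical, and rounding the frame size up to a whole number of units is an assumption; the real encoder reads the row and column counts from mi_params->mb_rows and mi_params->mb_cols.

```c
#include <assert.h>
#include <stddef.h>

#define FP_BLOCK_SIZE 16 /* first-pass unit size stated in the text */

/* Hypothetical helper: number of 16x16 units needed to cover `pixels`
 * luma samples, rounding up (an assumption of this sketch). */
static int fp_units(int pixels) {
  assert(pixels > 0);
  return (pixels + FP_BLOCK_SIZE - 1) / FP_BLOCK_SIZE;
}

/* Callback invoked once per unit with its row and column index. */
typedef void (*fp_visit_fn)(int mb_row, int mb_col, void *user);

/* Hypothetical helper: visits every unit row by row, left to right
 * (raster order), and returns how many units were visited. */
static int fp_raster_scan(int width, int height, fp_visit_fn visit,
                          void *user) {
  const int mb_rows = fp_units(height);
  const int mb_cols = fp_units(width);
  int visited = 0;
  for (int mb_row = 0; mb_row < mb_rows; ++mb_row) {
    for (int mb_col = 0; mb_col < mb_cols; ++mb_col) {
      if (visit != NULL) visit(mb_row, mb_col, user);
      ++visited;
    }
  }
  return visited;
}
```

Under these assumptions a 1920x1080 frame is covered by 120 columns and 68 rows of units, because 1080 is not a multiple of 16 and the last row is partial.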

The firstpass_intra_prediction function performs the intra prediction of the first pass. The only intra mode used is DC, and the function returns the intra prediction error (the sum of squared intra prediction residuals).
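
To make the error measure concrete, here is a minimal sketch of DC prediction on one 16x16 luma block followed by a sum of squared residuals. It is not the body of firstpass_intra_prediction: the function names, the flat above/left neighbour arrays, and the rounding of the DC value are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define FP_BLOCK 16

/* DC value: rounded mean of the 16 pixels above and the 16 pixels to the
 * left of the block (both neighbours assumed available in this sketch). */
static int dc_value(const uint8_t *above, const uint8_t *left) {
  int sum = 0;
  for (int i = 0; i < FP_BLOCK; ++i) sum += above[i] + left[i];
  return (sum + FP_BLOCK) / (2 * FP_BLOCK);
}

/* Sum of squared differences between the source block and a flat block
 * filled with the DC value. */
static int64_t dc_intra_sse(const uint8_t *src, int stride,
                            const uint8_t *above, const uint8_t *left) {
  assert(src != 0 && above != 0 && left != 0 && stride >= FP_BLOCK);
  const int dc = dc_value(above, left);
  int64_t sse = 0;
  for (int r = 0; r < FP_BLOCK; ++r) {
    for (int c = 0; c < FP_BLOCK; ++c) {
      const int diff = src[r * stride + c] - dc;
      sse += diff * diff;
    }
  }
  return sse;
}
```

A flat block that differs from its neighbours by 2 at every pixel gives an error of 256 * 4 = 1024; textured blocks give much larger values, which is why this error works as a rough complexity measure.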

The firstpass_inter_prediction function performs the inter prediction of the first pass. It mainly uses two reference frames, LAST_FRAME and GOLDEN_FRAME (the listing further down also passes in an optional alt-ref frame taken from the lookahead buffer), and it returns the inter prediction error.
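
The sketch below shows the simplest form of such an error: the sum of squared differences against the co-located block of a reference frame, i.e. a (0, 0) motion vector, followed by the "best of intra and inter" choice described for the coded_error field further down. The real function also runs a motion search, which is omitted here; block_sse and fp_coded_error are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

#define FP_BLOCK 16

/* Sum of squared differences between a 16x16 source block and the
 * co-located 16x16 block of a reference frame (zero motion). */
static int64_t block_sse(const uint8_t *src, int src_stride,
                         const uint8_t *ref, int ref_stride) {
  assert(src != 0 && ref != 0);
  assert(src_stride >= FP_BLOCK && ref_stride >= FP_BLOCK);
  int64_t sse = 0;
  for (int r = 0; r < FP_BLOCK; ++r) {
    for (int c = 0; c < FP_BLOCK; ++c) {
      const int diff = src[r * src_stride + c] - ref[r * ref_stride + c];
      sse += diff * diff;
    }
  }
  return sse;
}

/* Per-block coded error: the smaller of the intra and inter errors. */
static int64_t fp_coded_error(int64_t intra_error, int64_t inter_error) {
  return (inter_error < intra_error) ? inter_error : intra_error;
}
```

Blocks for which the inter error wins are the ones that count towards pcnt_inter in the statistics.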

After the whole current frame has been predicted, update_firstpass_stats is called to update the first-pass statistics. This function updates twopass->total_stats (the cumulative statistics) and twopass->stats_buf_ctx->stats_in_end (the pointer to the statistics of the current frame).
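
The idea of keeping per-frame statistics alongside a running total can be sketched with a reduced structure. MiniStats and mini_accumulate are hypothetical; the field names are borrowed from FIRSTPASS_STATS, and plain field-wise addition is an assumption about how the total is formed.

```c
#include <assert.h>

/* Reduced stand-in for FIRSTPASS_STATS with only four of its fields. */
typedef struct {
  double intra_error;
  double coded_error;
  double duration;
  double count; /* 1.0 for a single frame, N for a collection of N frames */
} MiniStats;

/* Adds one frame's statistics to the running total, field by field. */
static void mini_accumulate(MiniStats *total, const MiniStats *frame) {
  assert(total != 0 && frame != 0);
  total->intra_error += frame->intra_error;
  total->coded_error += frame->coded_error;
  total->duration += frame->duration;
  total->count += frame->count;
}
```

After accumulating two single-frame records, count in the total is 2.0, which matches the meaning of the count field for a collection of frames.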

The first-pass statistics are stored in the following structure:

typedef struct {
  // Frame number in display order, if stats are for a single frame.
  // No real meaning for a collection of frames.
  double frame;
  // Weight assigned to this frame (or total weight for the collection of
  // frames) currently based on intra factor and brightness factor. This is used
  // to distribute bits between easier and harder frames.
  double weight;
  // Intra prediction error (sum of squared intra prediction residuals).
  double intra_error;
  // Average wavelet energy computed using Discrete Wavelet Transform (DWT).
  double frame_avg_wavelet_energy;
  // Best of intra pred error and inter pred error using last frame as ref.
  double coded_error;
  // Best of intra pred error and inter pred error using golden frame as ref.
  double sr_coded_error;
  // Best of intra pred error and inter pred error using altref frame as ref.
  double tr_coded_error;
  // Percentage of blocks with inter pred error < intra pred error.
  double pcnt_inter;
  // Percentage of blocks using (inter prediction and) non-zero motion vectors.
  double pcnt_motion;
  // Percentage of blocks where golden frame was better than last or intra:
  // inter pred error using golden frame < inter pred error using last frame and
  // inter pred error using golden frame < intra pred error
  double pcnt_second_ref;
  // Percentage of blocks where altref frame was better than intra, last, golden
  double pcnt_third_ref;
  // Percentage of blocks where intra and inter prediction errors were very
  // close. Note that this is a 'weighted count', that is, such blocks may be
  // weighted by how close the two errors were.
  double pcnt_neutral;
  // Percentage of blocks that have almost no intra error residual
  // (i.e. are in effect completely flat and untextured in the intra
  // domain). In natural videos this is uncommon, but it is much more
  // common in animations, graphics and screen content, so may be used
  // as a signal to detect these types of content.
  double intra_skip_pct;
  // Image mask rows top and bottom.
  double inactive_zone_rows;
  // Image mask columns at left and right edges.
  double inactive_zone_cols;
  // Average of row motion vectors.
  double MVr;
  // Mean of absolute value of row motion vectors.
  double mvr_abs;
  // Mean of column motion vectors.
  double MVc;
  // Mean of absolute value of column motion vectors.
  double mvc_abs;
  // Variance of row motion vectors.
  double MVrv;
  // Variance of column motion vectors.
  double MVcv;
  // Value in range [-1,1] indicating fraction of row and column motion vectors
  // that point inwards (negative MV value) or outwards (positive MV value).
  // For example, value of 1 indicates all row/column MVs are inwards.
  double mv_in_out_count;
  // Count of unique non-zero motion vectors.
  double new_mv_count;
  // Duration of the frame / collection of frames.
  double duration;
  // 1.0 if stats are for a single frame, OR
  // Number of frames in this collection for which the stats are accumulated.
  double count;
  // Standard deviation for (0, 0) motion prediction error.
  double raw_error_stdev;
} FIRSTPASS_STATS;

The code, with comments, is as follows:

#define FIRST_PASS_ALT_REF_DISTANCE 16
void av1_first_pass(AV1_COMP *cpi, const int64_t ts_duration) {
  MACROBLOCK *const x = &cpi->td.mb;
  AV1_COMMON *const cm = &cpi->common;
  const CommonModeInfoParams *const mi_params = &cm->mi_params;
  CurrentFrame *const current_frame = &cm->current_frame;
  const SequenceHeader *const seq_params = &cm->seq_params;
  const int num_planes = av1_num_planes(cm);
  MACROBLOCKD *const xd = &x->e_mbd;
  const PICK_MODE_CONTEXT *ctx = &cpi->td.pc_root->none;
  MV last_mv = kZeroMv;
  const int qindex = find_fp_qindex(seq_params->bit_depth);
  // Detect if the key frame is screen content type.
  if (frame_is_intra_only(cm)) {
    FeatureFlags *const features = &cm->features;
    av1_set_screen_content_options(cpi, features);
    cpi->is_screen_content_type = features->allow_screen_content_tools;
  }
  // First pass coding proceeds in raster scan order with unit size of 16x16.
  const BLOCK_SIZE fp_block_size = BLOCK_16X16;
  // Note: the width is read from block_size_high and the height from
  // block_size_wide; this is harmless here because the block is square.
  const int fp_block_size_width = block_size_high[fp_block_size];
  const int fp_block_size_height = block_size_wide[fp_block_size];
  int *raw_motion_err_list;
  int raw_motion_err_counts = 0;
  CHECK_MEM_ERROR(cm, raw_motion_err_list,
                  aom_calloc(mi_params->mb_rows * mi_params->mb_cols,
                             sizeof(*raw_motion_err_list)));
  // Tiling is ignored in the first pass.
  TileInfo tile;
  av1_tile_init(&tile, cm, 0, 0);
  FRAME_STATS stats = { 0 };
  stats.image_data_start_row = INVALID_ROW;

  const YV12_BUFFER_CONFIG *const last_frame =
      get_ref_frame_yv12_buf(cm, LAST_FRAME);  // most recent reference frame
  const YV12_BUFFER_CONFIG *golden_frame =
      get_ref_frame_yv12_buf(cm, GOLDEN_FRAME);  // golden frame
  const YV12_BUFFER_CONFIG *alt_ref_frame = NULL;
  const int alt_ref_offset =
      FIRST_PASS_ALT_REF_DISTANCE -
      (current_frame->frame_number % FIRST_PASS_ALT_REF_DISTANCE);
  if (alt_ref_offset < FIRST_PASS_ALT_REF_DISTANCE) {
    const struct lookahead_entry *const alt_ref_frame_buffer =
        av1_lookahead_peek(cpi->lookahead, alt_ref_offset,
                           cpi->compressor_stage);
    if (alt_ref_frame_buffer != NULL) {
      alt_ref_frame = &alt_ref_frame_buffer->img;
    }
  }
  YV12_BUFFER_CONFIG *const this_frame = &cm->cur_frame->buf;  // current frame
  // First pass code requires valid last and new frame buffers.
  assert(this_frame != NULL);
  assert(frame_is_intra_only(cm) || (last_frame != NULL));

  av1_setup_frame_size(cpi);
  aom_clear_system_state();

  set_mi_offsets(mi_params, xd, 0, 0);
  xd->mi[0]->sb_type = fp_block_size;

  // Do not use periodic key frames.
  cpi->rc.frames_to_key = INT_MAX;

  av1_set_quantizer(cm, cpi->oxcf.qm_minlevel, cpi->oxcf.qm_maxlevel, qindex);

  av1_setup_block_planes(xd, seq_params->subsampling_x,
                         seq_params->subsampling_y, num_planes);

  av1_setup_src_planes(x, cpi->source, 0, 0, num_planes, fp_block_size);
  av1_setup_dst_planes(xd->plane, seq_params->sb_size, this_frame, 0, 0, 0,
                       num_planes);

  if (!frame_is_intra_only(cm)) {
    av1_setup_pre_planes(xd, 0, last_frame, 0, 0, NULL, num_planes);
  }

  set_mi_offsets(mi_params, xd, 0, 0);

  // Don't store luma on the first pass since chroma is not computed
  xd->cfl.store_y = 0;
  av1_frame_init_quantizer(cpi);

  for (int i = 0; i < num_planes; ++i) {
    x->plane[i].coeff = ctx->coeff[i];
    x->plane[i].qcoeff = ctx->qcoeff[i];
    x->plane[i].eobs = ctx->eobs[i];
    x->plane[i].txb_entropy_ctx = ctx->txb_entropy_ctx[i];
    xd->plane[i].dqcoeff = ctx->dqcoeff[i];
  }

  av1_init_mv_probs(cm);
  av1_initialize_rd_consts(cpi);

  const int src_y_stride = cpi->source->y_stride;
  const int recon_y_stride = this_frame->y_stride;
  const int recon_uv_stride = this_frame->uv_stride;
  const int uv_mb_height =
      fp_block_size_height >> (this_frame->y_height > this_frame->uv_height);

  for (int mb_row = 0; mb_row < mi_params->mb_rows; ++mb_row) {
    MV best_ref_mv = kZeroMv;

    // Reset above block coeffs.
    xd->up_available = (mb_row != 0);
    int recon_yoffset = (mb_row * recon_y_stride * fp_block_size_height);
    int src_yoffset = (mb_row * src_y_stride * fp_block_size_height);
    int recon_uvoffset = (mb_row * recon_uv_stride * uv_mb_height);
    int alt_ref_frame_yoffset =
        (alt_ref_frame != NULL)
            ? mb_row * alt_ref_frame->y_stride * fp_block_size_height
            : -1;

    // Set up limit values for motion vectors to prevent them extending
    // outside the UMV borders.
    av1_set_mv_row_limits(mi_params, &x->mv_limits, (mb_row << 2),
                          (fp_block_size_height >> MI_SIZE_LOG2),
                          cpi->oxcf.border_in_pixels);

    for (int mb_col = 0; mb_col < mi_params->mb_cols; ++mb_col) {
      // Returns the intra prediction error of this block.
      int this_intra_error = firstpass_intra_prediction(
          cpi, this_frame, &tile, mb_row, mb_col, recon_yoffset, recon_uvoffset,
          fp_block_size, qindex, &stats);

      if (!frame_is_intra_only(cm)) {
        const int this_inter_error = firstpass_inter_prediction(
            cpi, last_frame, golden_frame, alt_ref_frame, mb_row, mb_col,
            recon_yoffset, recon_uvoffset, src_yoffset, alt_ref_frame_yoffset,
            fp_block_size, this_intra_error, raw_motion_err_counts,
            raw_motion_err_list, &best_ref_mv, &last_mv, &stats);
        stats.coded_error += this_inter_error;
        ++raw_motion_err_counts;
      } else {
        stats.sr_coded_error += this_intra_error;
        stats.tr_coded_error += this_intra_error;
        stats.coded_error += this_intra_error;
      }

      // Adjust to the next column of MBs.
      x->plane[0].src.buf += fp_block_size_width;
      x->plane[1].src.buf += uv_mb_height;
      x->plane[2].src.buf += uv_mb_height;

      recon_yoffset += fp_block_size_width;
      src_yoffset += fp_block_size_width;
      recon_uvoffset += uv_mb_height;
      alt_ref_frame_yoffset += fp_block_size_width;
    }
    // Adjust to the next row of MBs.
    x->plane[0].src.buf += fp_block_size_height * x->plane[0].src.stride -
                           fp_block_size_width * mi_params->mb_cols;
    x->plane[1].src.buf += uv_mb_height * x->plane[1].src.stride -
                           uv_mb_height * mi_params->mb_cols;
    x->plane[2].src.buf += uv_mb_height * x->plane[1].src.stride -
                           uv_mb_height * mi_params->mb_cols;
  }
  const double raw_err_stdev =
      raw_motion_error_stdev(raw_motion_err_list, raw_motion_err_counts);
  aom_free(raw_motion_err_list);

  // Clamp the image start to rows/2. This number of rows is discarded top
  // and bottom as dead data so rows / 2 means the frame is blank.
  if ((stats.image_data_start_row > mi_params->mb_rows / 2) ||
      (stats.image_data_start_row == INVALID_ROW)) {
    stats.image_data_start_row = mi_params->mb_rows / 2;
  }
  // Exclude any image dead zone
  if (stats.image_data_start_row > 0) {
    stats.intra_skip_count =
        AOMMAX(0, stats.intra_skip_count -
                      (stats.image_data_start_row * mi_params->mb_cols * 2));
  }

  TWO_PASS *twopass = &cpi->twopass;
  const int num_mbs = (cpi->oxcf.resize_mode != RESIZE_NONE) ? cpi->initial_mbs
                                                             : mi_params->MBs;
  stats.intra_factor = stats.intra_factor / (double)num_mbs;
  stats.brightness_factor = stats.brightness_factor / (double)num_mbs;
  FIRSTPASS_STATS *this_frame_stats = twopass->stats_buf_ctx->stats_in_end;
  update_firstpass_stats(cpi, &stats, raw_err_stdev,
                         current_frame->frame_number, ts_duration);

  // Copy the previous Last Frame back into gf buffer if the prediction is good
  // enough... but also don't allow it to lag too far.
  if ((twopass->sr_update_lag > 3) ||
      ((current_frame->frame_number > 0) &&
       (this_frame_stats->pcnt_inter > 0.20) &&
       ((this_frame_stats->intra_error /
         DOUBLE_DIVIDE_CHECK(this_frame_stats->coded_error)) > 2.0))) {
    if (golden_frame != NULL) {
      assign_frame_buffer_p(
          &cm->ref_frame_map[get_ref_frame_map_idx(cm, GOLDEN_FRAME)],
          cm->ref_frame_map[get_ref_frame_map_idx(cm, LAST_FRAME)]);
    }
    twopass->sr_update_lag = 1;
  } else {
    ++twopass->sr_update_lag;
  }

  aom_extend_frame_borders(this_frame, num_planes);

  // The frame we just compressed now becomes the last frame.
  assign_frame_buffer_p(
      &cm->ref_frame_map[get_ref_frame_map_idx(cm, LAST_FRAME)], cm->cur_frame);

  // Special case for the first frame. Copy into the GF buffer as a second
  // reference.
  if (current_frame->frame_number == 0 &&
      get_ref_frame_map_idx(cm, GOLDEN_FRAME) != INVALID_IDX) {
    assign_frame_buffer_p(
        &cm->ref_frame_map[get_ref_frame_map_idx(cm, GOLDEN_FRAME)],
        cm->ref_frame_map[get_ref_frame_map_idx(cm, LAST_FRAME)]);
  }

  print_reconstruction_frame(last_frame, current_frame->frame_number,
                             /*do_print=*/0);

  ++current_frame->frame_number;
}
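
Two small decisions in the listing are easier to follow when isolated as pure functions: when the golden slot is refreshed from the previous LAST_FRAME, and how far ahead in the lookahead buffer the alt-ref candidate is taken. The sketch below restates both conditions. The function names are hypothetical, and since the definition of DOUBLE_DIVIDE_CHECK is not shown in the listing, the double_divide_check helper is an assumption about its behaviour.

```c
#include <assert.h>

#define FIRST_PASS_ALT_REF_DISTANCE 16

/* Assumed stand-in for DOUBLE_DIVIDE_CHECK: nudges the divisor away from
 * zero so the ratio below is always defined. */
static double double_divide_check(double x) {
  return (x < 0.0) ? x - 0.000001 : x + 0.000001;
}

/* Returns 1 when the golden slot should be overwritten with the previous
 * LAST_FRAME, 0 when only the lag counter should be incremented. */
static int should_refresh_golden(int sr_update_lag, int frame_number,
                                 double pcnt_inter, double intra_error,
                                 double coded_error) {
  assert(sr_update_lag >= 0 && frame_number >= 0);
  if (sr_update_lag > 3) return 1; /* do not let golden lag too far */
  return frame_number > 0 && pcnt_inter > 0.20 &&
         intra_error / double_divide_check(coded_error) > 2.0;
}

/* Lookahead distance of the alt-ref candidate, or -1 when the offset
 * equals the full distance and no candidate is requested. */
static int alt_ref_lookahead_offset(int frame_number) {
  assert(frame_number >= 0);
  const int offset = FIRST_PASS_ALT_REF_DISTANCE -
                     (frame_number % FIRST_PASS_ALT_REF_DISTANCE);
  return (offset < FIRST_PASS_ALT_REF_DISTANCE) ? offset : -1;
}
```

Read this way, the golden frame is refreshed either when it has not been updated for more than three frames, or when more than 20% of the blocks preferred inter prediction and the intra error is more than twice the coded error. The alt-ref candidate is always the next frame whose number is a multiple of 16, and no candidate is requested when the current frame number is itself a multiple of 16.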
