H.266/VVC code study: the predIntraMip function in the MIP code

The predIntraMip function is the entry point for MIP prediction. Its main job is to perform the matrix multiplication and then, where required, obtain the predicted pixels of the whole block by upsampling. The implementation steps are shown in the following figure:

The code and comments are as follows (the function listed is MatrixIntraPrediction::predBlock, which performs these steps):

#if JVET_R0350_MIP_CHROMA_444_SINGLETREE
void MatrixIntraPrediction::predBlock(int *const result, const int modeIdx, const bool transpose, const int bitDepth,
                                      const ComponentID compId)
{
  CHECK(m_component != compId, "Boundary has not been prepared for this component.");
#else
void MatrixIntraPrediction::predBlock(int* const result, const int modeIdx, const bool transpose, const int bitDepth)
{
#endif
  // Whether upsampling is needed
  const bool needUpsampling = ( m_upsmpFactorHor > 1 ) || ( m_upsmpFactorVer > 1 );

  // Get the MIP matrix according to mipSizeId
  const uint8_t* matrix = getMatrixData(modeIdx);
  // Buffer for the reduced prediction pixels
  static_vector<int, MIP_MAX_REDUCED_OUTPUT_SAMPLES> bufReducedPred( m_reducedPredSize * m_reducedPredSize );
  int* const       reducedPred     = needUpsampling ? bufReducedPred.data() : result;
  // Select the reduced boundary vector depending on whether transposition is used
  const int* const reducedBoundary = transpose ? m_reducedBoundaryTransposed.data() : m_reducedBoundary.data();
  // Compute the reduced prediction pixels by matrix multiplication
  computeReducedPred(reducedPred, reducedBoundary, matrix, transpose, bitDepth);
  // If upsampling is needed
  if( needUpsampling )
  {
    // Upsampling: derive the predicted pixels of the whole block from the reduced prediction pixels
    predictionUpsampling( result, reducedPred );
  }
}

| mipSizeId | Block size | Boundary length after downsampling (m_reducedBdrySize) | Matrix multiplication output side length (m_reducedPredSize) | Number of MIP matrices | MIP matrix dimensions |
|---|---|---|---|---|---|
| 0 | 4x4 | 2 | 4 | 16 | 16x4 |
| 1 | 4xN, Nx4, 8x8 | 4 | 4 | 8 | 16x8 |
| 2 | All other blocks | 4 | 8 | 6 | 64x8 |

1. Get the MIP matrix

The MIP matrix is selected according to mipSizeId. As the table above shows, the number and the dimensions of the MIP matrices differ for each mipSizeId.
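The rule in the table can be written as a small lookup. The following is a minimal sketch, not VTM code: MipParams and mipParamsForBlock are hypothetical names introduced here, and the values are copied from the table above.

```cpp
#include <cassert>

// Hypothetical container for one row of the table (not a VTM type).
struct MipParams
{
  int sizeId;           // mipSizeId
  int reducedBdrySize;  // boundary length after downsampling
  int reducedPredSize;  // side length of the matrix multiplication output
  int numMatrices;      // number of MIP matrices for this mipSizeId
};

// Hypothetical helper: map a block size to the table row it falls into.
MipParams mipParamsForBlock(int width, int height)
{
  if (width == 4 && height == 4)
  {
    return { 0, 2, 4, 16 };   // 4x4
  }
  if (width == 4 || height == 4 || (width == 8 && height == 8))
  {
    return { 1, 4, 4, 8 };    // 4xN, Nx4, 8x8
  }
  return { 2, 4, 8, 6 };      // all other blocks
}
```

For example, a 16x4 block falls into mipSizeId = 1, while a 16x16 block falls into mipSizeId = 2 and therefore produces an 8x8 reduced prediction.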

2. Calculate predicted pixels

The computeReducedPred function calculates the downsampled (reduced) predicted pixels by matrix multiplication, as shown in the figure below. Here mode k is the MIP mode number, and $A_{k}$ is the MIP matrix corresponding to mode k, which is obtained by calling the getMatrixData function. $b_{k}$ is the corresponding offset vector, calculated as follows, where p[i] represents the matrix multiplication input vector and inSize represents 2 * m_reducedBdrySize.

The formula for MIP matrix multiplication is shown below, where mWeight represents the MIP matrix and p represents the input vector of the MIP matrix multiplication.
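Both formulas can be reconstructed from the computeReducedPred listing that follows. This is a reading of that code rather than a quotation of the specification: sW and fO are the values 6 and 32 named in the code comments, predSize stands for m_reducedPredSize, inputOffset is the variable of the same name in the code, and pred and oW are symbol names assumed here.

$$
oW = 2^{sW-1} - fO \cdot \sum_{i=0}^{inSize-1} p[i]
$$

$$
pred[y \cdot predSize + x] = \mathrm{Clip}\left( \left( \left( \sum_{i=0}^{inSize-1} mWeight[y \cdot predSize + x][i] \cdot p[i] + oW \right) \gg sW \right) + inputOffset \right)
$$

Clip limits the value to the range allowed by bitDepth. For mipSizeId = 2 the code skips the i = 0 term and stores only inSize - 1 weights per output pixel.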

void MatrixIntraPrediction::computeReducedPred( int*const result, const int* const input,
                                                const uint8_t* matrix,
                                                const bool transpose, const int bitDepth )
{
  const int inputSize = 2 * m_reducedBdrySize; // 4 or 8

  // use local buffer for transposed result
  static_vector<int, MIP_MAX_REDUCED_OUTPUT_SAMPLES> resBufTransposed( m_reducedPredSize * m_reducedPredSize );
  int*const resPtr = (transpose) ? resBufTransposed.data() : result;

  int sum = 0;
  for( int i = 0; i < inputSize; i++ ) { sum += input[i]; }
  // MIP_SHIFT_MATRIX: the shift factor sW, fixed at 6
  // MIP_OFFSET_MATRIX: the offset factor fO, fixed at 32
  // Compute the offset (bias)
  const int offset = (1 << (MIP_SHIFT_MATRIX - 1)) - MIP_OFFSET_MATRIX * sum;
  CHECK( inputSize != 4 * (inputSize >> 2), "Error, input size not divisible by four" );

  const uint8_t *weight = matrix; // weight matrix
  // Input offset, taken from input[0], i.e. m_reducedBoundary[0] (or its transposed counterpart)
  const int   inputOffset = transpose ? m_inputOffsetTransp : m_inputOffset;

  const bool redSize = (m_sizeId == 2);
  int posRes = 0;
  for( int y = 0; y < m_reducedPredSize; y++ )
  {
    for( int x = 0; x < m_reducedPredSize; x++ )
    {
      if( redSize ) weight -= 1;
      int tmp0 = redSize ? 0 : (input[0] * weight[0]);
      int tmp1 = input[1] * weight[1];
      int tmp2 = input[2] * weight[2];
      int tmp3 = input[3] * weight[3];
      for (int i = 4; i < inputSize; i += 4)
      {
        tmp0 += input[i]     * weight[i];
        tmp1 += input[i + 1] * weight[i + 1];
        tmp2 += input[i + 2] * weight[i + 2];
        tmp3 += input[i + 3] * weight[i + 3];
      }
      // Clip the matrix multiplication output to the valid range for bitDepth
      resPtr[posRes++] = ClipBD<int>(((tmp0 + tmp1 + tmp2 + tmp3 + offset) >> MIP_SHIFT_MATRIX) + inputOffset, bitDepth);

      weight += inputSize;
    }
  }

  if( transpose )
  {
    // Transpose the matrix multiplication result
    for( int y = 0; y < m_reducedPredSize; y++ )
    {
      for( int x = 0; x < m_reducedPredSize; x++ )
      {
        result[ y * m_reducedPredSize + x ] = resPtr[ x * m_reducedPredSize + y ];
      }
    }
  }
}
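To make the role of the offset easier to see, here is a simplified stand-alone sketch of the same arithmetic. It is not the VTM function: reducedPredSketch, kShift and kOffset are names introduced here, and the transpose path and the mipSizeId = 2 special case are left out on purpose.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Assumed constants, mirroring MIP_SHIFT_MATRIX (6) and MIP_OFFSET_MATRIX (32)
// as stated in the comments of the listing above.
constexpr int kShift  = 6;
constexpr int kOffset = 32;

// Hypothetical helper: weights holds predSize * predSize rows of input.size() entries each.
std::vector<int> reducedPredSketch(const std::vector<int>& input,
                                   const std::vector<uint8_t>& weights,
                                   int predSize, int inputOffset, int bitDepth)
{
  const int inputSize = static_cast<int>(input.size());

  int sum = 0;
  for (int v : input) { sum += v; }

  // Rounding term minus kOffset * sum: this removes the constant 32 that is
  // baked into every stored (unsigned) weight.
  const int offset = (1 << (kShift - 1)) - kOffset * sum;
  const int maxVal = (1 << bitDepth) - 1;

  std::vector<int> out(predSize * predSize);
  for (int pos = 0; pos < predSize * predSize; pos++)
  {
    int acc = offset;
    for (int i = 0; i < inputSize; i++)
    {
      acc += input[i] * weights[pos * inputSize + i];
    }
    // Like the original code, this relies on an arithmetic right shift when acc is negative.
    const int val = (acc >> kShift) + inputOffset;
    out[pos] = std::min(std::max(val, 0), maxVal);   // clip to [0, 2^bitDepth - 1]
  }
  return out;
}
```

With every stored weight equal to 32 the effective weights are zero, so each output pixel equals inputOffset. Raising a single weight to 96 adds (64 * input[i]) >> 6, i.e. the input value itself, to that one output pixel.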

3. Upsampling

The interpolation order is fixed: if horizontal interpolation is needed, it is performed first, followed by vertical interpolation, as shown in the figure below (taking an 8x8 block as an example). Upsampling is simply linear weighting: the value at each empty position is a position-dependent weighted average of its two neighbours, which are a boundary reference pixel or a predicted pixel on one side and a predicted pixel on the other.
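The weights can be read off the inner loop of the predictionUpsampling1D listing further down. Written as a formula (a reconstruction from that code, with f for upsmpFactor and k for the loop variable pos; dst_k is a symbol assumed here):

$$
dst_k = \left( (f - k) \cdot before + k \cdot behind + \frac{f}{2} \right) \gg \log_2 f, \qquad k = 1, \dots, f
$$

For f = 2 this gives (before + behind + 1) >> 1 at k = 1 and exactly behind at k = 2, so the last position of each interval reproduces the predicted pixel itself.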

The relevant code and comments are as follows:

// dst: upsampling result
// src: matrix multiplication output
void MatrixIntraPrediction::predictionUpsampling( int* const dst, const int* const src ) const
{
  const int* verSrc     = src;
  SizeType   verSrcStep = m_blockSize.width;
  // The interpolation order is fixed: horizontal first, then vertical
  if( m_upsmpFactorHor > 1 ) // if horizontal interpolation is needed
  {
    int* const horDst = dst + (m_upsmpFactorVer - 1) * m_blockSize.width;
    verSrc = horDst;
    verSrcStep *= m_upsmpFactorVer;

    predictionUpsampling1D( horDst, src, m_refSamplesLeft.data(),
                            m_reducedPredSize, m_reducedPredSize,
                            1, m_reducedPredSize, 1, verSrcStep,
                            m_upsmpFactorVer, m_upsmpFactorHor );
  }

  if( m_upsmpFactorVer > 1 )
  {
    predictionUpsampling1D( dst, verSrc, m_refSamplesTop.data(),
                            m_reducedPredSize, m_blockSize.width,
                            verSrcStep, 1, m_blockSize.width, 1,
                            1, m_upsmpFactorVer );
  }
}

How the interpolation code works in detail (taking 8x8 as an example):

Horizontal interpolation: the code processes rows from top to bottom, and only the rows that hold matrix multiplication output are interpolated, i.e. the second, fourth, sixth, and eighth rows.

Vertical interpolation: the code processes columns from left to right, i.e. columns 1, 2, 3, 4, 5, 6, 7, 8.

The interpolation writes both the interpolated pixels and the already predicted pixels to their final positions in the result block.
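The row order follows from how predictionUpsampling sets horDst and verSrcStep. A minimal sketch, assuming the block height equals m_reducedPredSize * m_upsmpFactorVer; rowsWrittenByHorizontalPass is a hypothetical helper, not part of VTM:

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: 0-based rows of the result block that the horizontal pass fills.
// The first row is (upsmpFactorVer - 1), matching horDst = dst + (m_upsmpFactorVer - 1) * width,
// and the row step is upsmpFactorVer, matching verSrcStep = width * m_upsmpFactorVer.
std::vector<int> rowsWrittenByHorizontalPass(int blockHeight, int upsmpFactorVer)
{
  std::vector<int> rows;
  for (int row = upsmpFactorVer - 1; row < blockHeight; row += upsmpFactorVer)
  {
    rows.push_back(row);
  }
  return rows;
}
```

For an 8x8 block with a vertical factor of 2 this yields rows 1, 3, 5, 7 (0-based), i.e. the second, fourth, sixth, and eighth rows.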

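/*
- dst: upsampling result
- src: matrix multiplication output, or the horizontal interpolation result
- bndry: boundary reference pixels
- bndryStep: step between the boundary reference pixels used for interpolation (not every boundary pixel is necessarily used)
- srcSizeUpsmpDim: m_reducedPredSize (4 or 8)
- srcSizeOrthDim: number of lines to interpolate in the current direction
- upsmpFactor: upsampling factor
*/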
void MatrixIntraPrediction::predictionUpsampling1D(int* const dst, const int* const src, const int* const bndry,
                                                   const SizeType srcSizeUpsmpDim, const SizeType srcSizeOrthDim,
                                                   const SizeType srcStep, const SizeType srcStride,
                                                   const SizeType dstStep, const SizeType dstStride,
                                                   const SizeType bndryStep,
                                                   const unsigned int upsmpFactor )
{
  const int log2UpsmpFactor = floorLog2( upsmpFactor );
  CHECKD( upsmpFactor <= 1, "Upsampling factor must be at least 2." );
  const int roundingOffset = 1 << (log2UpsmpFactor - 1);

  SizeType idxOrthDim = 0;
  const int* srcLine = src; // matrix multiplication output or horizontal interpolation result
  int* dstLine = dst;
  const int* bndryLine = bndry + bndryStep - 1; // boundary reference pixel
  while( idxOrthDim < srcSizeOrthDim )
  {
    SizeType idxUpsmpDim = 0;
    const int* before = bndryLine; // previous reference pixel
    const int* behind = srcLine; // next reference pixel
    int* currDst = dstLine;
    while( idxUpsmpDim < srcSizeUpsmpDim )
    {
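      SizeType pos = 1; // position within the current interval; interpolated values and matrix multiplication results are each written to their own positions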
      int scaledBefore = ( *before ) << log2UpsmpFactor;
      int scaledBehind = 0;
      while( pos <= upsmpFactor )
      {
        // The subtraction and addition below control the weights of the two reference pixels
        scaledBefore -= *before;
        scaledBehind += *behind;
        *currDst = (scaledBefore + scaledBehind + roundingOffset) >> log2UpsmpFactor;

        pos++;
        currDst += dstStep;
      }

      idxUpsmpDim++;
      before = behind; // advance the previous reference pixel
      behind += srcStep; // advance the next reference pixel
    }

    idxOrthDim++;
    srcLine += srcStride;
    dstLine += dstStride;
    bndryLine += bndryStep;
  }
}
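The inner loops are easier to follow on a single line of pixels. The sketch below keeps the same running-sum arithmetic but drops the strides and pointers; upsampleLineSketch is a hypothetical helper introduced for illustration, and it takes log2 of the factor directly instead of calling floorLog2.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: upsample one line. boundary is the reference pixel in front of
// the line, src holds the known pixels, and each interval produces (1 << log2Factor)
// output pixels, the last of which equals the known pixel itself.
std::vector<int> upsampleLineSketch(int boundary, const std::vector<int>& src, unsigned log2Factor)
{
  const int factor         = 1 << log2Factor;
  const int roundingOffset = factor >> 1;

  std::vector<int> dst;
  int before = boundary;
  for (int behind : src)
  {
    int scaledBefore = before << log2Factor;  // starts at factor * before
    int scaledBehind = 0;
    for (int pos = 1; pos <= factor; pos++)
    {
      scaledBefore -= before;                 // weight of before: factor - pos
      scaledBehind += behind;                 // weight of behind: pos
      dst.push_back((scaledBefore + scaledBehind + roundingOffset) >> log2Factor);
    }
    before = behind;                          // the known pixel becomes the next left neighbour
  }
  return dst;
}
```

With a boundary pixel of 10, known pixels 20 and 40, and a factor of 2, the line becomes 15, 20, 30, 40: the odd positions are rounded averages and the even positions are the known pixels.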