The role and limitations of the VideoAdapter class in WebRTC

Requirements


The media library needs to change the encoding resolution and frame rate dynamically. The approach is to restart the encoder with the new resolution and frame rate parameters. The resolution and frame rate of the video stream fed to the encoder must therefore match the parameters the encoder was configured with.

This cannot be achieved by changing the capture resolution, because that may restart the camera and produce a black image. Usually, a functional class that adapts resolution and frame rate sits in front of the encoder. Its input is the original capture resolution and frame rate, and its output is the resolution and frame rate that the encoder expects.

VideoAdapter class

Resolution Adaptation

The VideoAdapter class in WebRTC implements this function. The core function that determines the scaling ratio is FindScale:

Fraction FindScale(int input_width,
                   int input_height,
                   int target_pixels,
                   int max_pixels,
                   bool variable_start_scale_factor) {
  // This function only makes sense for a positive target.
  RTC_DCHECK_GT(target_pixels, 0);
  RTC_DCHECK_GT(max_pixels, 0);
  RTC_DCHECK_GE(max_pixels, target_pixels);

  const int input_pixels = input_width * input_height;

  // Don't scale up original.
  if (target_pixels >= input_pixels)
    return Fraction{1, 1};

  Fraction current_scale = Fraction{1, 1};
  Fraction best_scale = Fraction{1, 1};

  if (variable_start_scale_factor) {
    // Start scaling down by 2/3 depending on |input_width| and |input_height|.
    if (input_width % 3 == 0 && input_height % 3 == 0) {
      // 2/3 (then alternates 3/4, 2/3, 3/4,...).
      current_scale = Fraction{6, 6};
    }
    if (input_width % 9 == 0 && input_height % 9 == 0) {
      // 2/3, 2/3 (then alternates 3/4, 2/3, 3/4,...).
      current_scale = Fraction{36, 36};
    }
  }

  // The minimum (absolute) difference between the number of output pixels and
  // the target pixel count.
  int min_pixel_diff = std::numeric_limits<int>::max();
  if (input_pixels <= max_pixels) {
    // Start condition for 1/1 case, if it is less than max.
    min_pixel_diff = std::abs(input_pixels - target_pixels);
  }
  
  // 720p is 16:9. Width and height are each scaled by the same ratio, so the
  // aspect ratio stays 16:9.
  // The algorithm tries 3/4, 1/2, 3/8, 1/4, 3/16, 1/8 in turn and picks a
  // suitable resolution.
  // Alternately scale down by 3/4 and 2/3. This results in fractions which are
  // effectively scalable. For instance, starting at 1280x720 will result in
  // the series (3/4) => 960x540, (1/2) => 640x360, (3/8) => 480x270,
  // (1/4) => 320x180, (3/16) => 240x135, (1/8) => 160x90.
  while (current_scale.scale_pixel_count(input_pixels) > target_pixels) {
    if (current_scale.numerator % 3 == 0 &&
        current_scale.denominator % 2 == 0) {
      // Multiply by 2/3.
      current_scale.numerator /= 3;
      current_scale.denominator /= 2;
    } else {
      // Multiply by 3/4.
      current_scale.numerator *= 3;
      current_scale.denominator *= 4;
    }
    
    // Compute the output pixel count from the width/height scale ratio.
    int output_pixels = current_scale.scale_pixel_count(input_pixels);
    if (output_pixels <= max_pixels) {
      int diff = std::abs(target_pixels - output_pixels);
      if (diff < min_pixel_diff) {
        min_pixel_diff = diff;
        best_scale = current_scale;
      }
    }
  }
  best_scale.DivideByGcd();

  return best_scale;
}
  • Auxiliary class Fraction, representing fractions
struct Fraction {
  // Numerator.
  int numerator;
  // Denominator.
  int denominator;

  void DivideByGcd() {
    // Get the greatest common divisor.
    int g = cricket::GreatestCommonDivisor(numerator, denominator);
    numerator /= g;
    denominator /= g;
  }

  // Determines number of output pixels if both width and height of an input of
  // |input_pixels| pixels is scaled with the fraction numerator / denominator.
  int scale_pixel_count(int input_pixels) {
    // Width and height are each scaled by the fraction, which gives the total
    // pixel count.
    return (numerator * numerator * input_pixels) / (denominator * denominator);
  }
};
  • Calculate the greatest common divisor
int GreatestCommonDivisor(int a, int b) {
  RTC_DCHECK_GE(a, 0);
  RTC_DCHECK_GT(b, 0);
  int c = a % b;
  while (c != 0) {
    a = b;
    b = c;
    c = a % b;
  }
  return b;
}

VideoAdapter does not support scaling up, only scaling down. The main logic is implemented in FindScale, which determines the best scaling ratio for the target pixel count (target_pixels) according to the following two points:

  1. Width and height are reduced by the same ratio
  2. The original aspect ratio does not change; for example, 720P (1280*720, 16:9) is scaled while keeping the 16:9 aspect ratio

Frame Rate Adaptation

The frame rate can only be adapted from high to low. The core logic is in the KeepFrame method of the VideoAdapter class. The implementation idea is relatively simple: calculate the frame interval from the target frame rate, then drop frames according to that interval so that the output reaches the target frame rate.

Limitations

Dynamic resolution changes in WebRTC are decided internally by WebRTC. Factors such as the network environment and machine performance may trigger a resolution change, and the process is a black box to the external business logic. A resolution change also keeps the aspect ratio. For example, if the captured resolution is 720P (16:9) and it needs to be changed to VGA (640*480, 4:3), WebRTC scales to the closest resolution that keeps the 16:9 aspect ratio, which is 640*360.

Many scenarios and business needs require the media library to support dynamic resolution changes. For example, 720P, VGA, and CIF must all be supported and be switchable among one another. Their aspect ratios obviously differ, so the existing VideoAdapter logic cannot meet this requirement. Therefore, the media library modifies VideoAdapter so that it can scale up or down to a specified resolution.

Origin blog.csdn.net/mo4776/article/details/122433236