Table of contents

Introduction
Motivation
Implementation in licode
Implementation in Chrome
Definition in the RFC

Introduction

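The RTCP receiver report (RR) packet carries two fields, fraction lost and cumulative number of packets lost, which report the loss rate and the total number of packets lost. This article covers how these two values are defined and how the Chrome codebase implements them.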

Motivation

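It started when I came across a way of counting lost packets. Roughly: take the seq number of the packet at new time (the arrival time of the most recently received packet), add 1, and subtract the seq number of the packet at new time - window time (window time is the length of the statistics window). This gives the number of packets expected, expected number. Then count the packets actually received between new time - window time and new time as received count. Finally, loss count = expected number - received count. It looks roughly like this: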

Below is how sequence numbers map to arrival times. To keep the counting simple, the times carry no unit.

-> sequence number
1 2 3 4 5 [6 7 9 10 11] seq number
1 2 3 4 5 [6 7 8 9  10] time
-> time

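Assume window time is 5
seq number at new time: 11
seq number at new time - window time: 6
expected number: 11 + 1 - 6 = 6, [6, 7, 8, 9, 10, 11]
received count: 5
loss count: 6 - 5 = 1
So far this looks fine, but it breaks as soon as packets arrive out of order, and reordering is quite common for UDP packets. What goes wrong with reordering? Something like this: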

1 2 3 4 5 [6 7 9 11 10] seq number
1 2 3 4 5 [6 7 8 9  10] time

Assume window time is 5

seq number at new time: 10

seq number at new time - window time: 6

expected number: 10 + 1 - 6 = 5, [6, 7, 8, 9, 10]

received count: 5

loss count: 5 - 5 = 0

This result is wrong: packet 8 really was lost, yet the computed loss count is 0.
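The two worked examples above can be reproduced with a few lines of C++. This is only a minimal sketch of the window-based method as described here: the names `Arrival` and `NaiveWindowLoss` are made up for illustration, the window is taken to be the arrivals with time greater than new time - window time, and sequence-number wraparound is ignored.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// One received packet (hypothetical type, for illustration only).
struct Arrival {
  int64_t time;  // arrival time, unitless as in the example
  int64_t seq;   // sequence number, wraparound ignored
};

// Naive window-based loss count:
//   expected = seq of newest arrival + 1 - seq of first arrival in the window
//   loss     = expected - number of arrivals in the window
// Assumes `arrivals` is sorted by arrival time.
int64_t NaiveWindowLoss(const std::vector<Arrival>& arrivals, int64_t window) {
  assert(!arrivals.empty());
  const int64_t newest_time = arrivals.back().time;
  int64_t first_seq = 0;
  int64_t received = 0;
  for (const Arrival& a : arrivals) {
    if (a.time <= newest_time - window) continue;  // outside the window
    if (received == 0) first_seq = a.seq;          // first arrival in the window
    ++received;
  }
  const int64_t expected = arrivals.back().seq + 1 - first_seq;
  return expected - received;
}
```

With the in-order arrivals from the first example and a window of 5 this returns 1. With the reordered arrivals (11 arriving before 10) it returns 0, even though packet 8 is still missing.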

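Thinking about it more carefully, reordering is not the only problem: the same thing happens when a packet with a duplicate sequence number arrives, or a packet with a bogus sequence number (one that jumps by a very large amount). That made me wonder: how does standard WebRTC implement this?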


Implementation in licode

I looked at the licode implementation first. Why licode first? Never mind, there is a reason, haha. licode counts lost packets as follows:

// Called after an RTP packet is received
bool RtcpRrGenerator::handleRtpPacket(std::shared_ptr<DataPacket> packet) {
  /* ......
    a lot of unrelated code omitted
    ......*/
  uint16_t seq_num = head->getSeqNumber();
  rr_info_.packets_received++;  // count every received packet; remember this
  if (rr_info_.base_seq == -1) {
    rr_info_.base_seq = head->getSeqNumber();  // first RTP packet received; remember this
  }
  if (rr_info_.max_seq == -1) {  // first RTP packet received
    rr_info_.max_seq = seq_num;
  } else if (!RtpUtils::sequenceNumberLessThan(seq_num, rr_info_.max_seq)) {
    // The condition already accounts for wraparound, so reaching this branch
    // means seq_num is logically at or beyond max_seq
    if (seq_num < rr_info_.max_seq) {  // wraparound check
      rr_info_.cycle++;  // a wraparound happened, so increment the cycle count
    }
    rr_info_.max_seq = seq_num;  // so rr_info_.max_seq always holds the highest seq number
  }
  // Because of wraparound, compute the extended highest seq number; remember this
  rr_info_.extended_seq = (rr_info_.cycle << 16) | rr_info_.max_seq;

  return false;
}

// Build the RR packet
std::shared_ptr<DataPacket> RtcpRrGenerator::generateReceiverReport() {
  /* ......
    a lot of unrelated code omitted
    ......*/
  uint64_t now = ClockUtils::timePointToMs(clock_->now());  // current time
  // total packets expected = extended highest seq num - initial seq num + 1
  uint32_t expected = rr_info_.extended_seq - rr_info_.base_seq + 1;

  // expected in this interval = total expected now - total expected at the previous report
  uint32_t expected_interval = expected - rr_info_.expected_prior;
  // remember the total expected for the next report
  rr_info_.expected_prior = expected;
  // received in this interval = total received now - total received at the previous report
  uint32_t received_interval = rr_info_.packets_received - rr_info_.received_prior;
  // remember the total received for the next report
  rr_info_.received_prior = rr_info_.packets_received;
  // lost in this interval = expected in this interval - received in this interval
  int64_t lost_interval = static_cast<int64_t>(expected_interval) - received_interval;

  // TODO(pedro): We're getting closer to packet loss without retransmissions by ignoring negative
  // lost in the interval. This is not perfect but will provide a more "monotonically increasing" behavior
  if (lost_interval > 0) {  // only positive values are added; the TODO above is the licode author's own comment
    // add each interval's loss to lost, giving the total number of lost packets
    rr_info_.lost += lost_interval;
  }
  // write the total number of lost packets into the RTCP header
  rtcp_head.setLostPackets(rr_info_.lost);

  return (std::make_shared<DataPacket>(0, reinterpret_cast<char*>(&packet_), length, type_));
}

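Reading the code and comments above carefully, it is clear that licode does not use the seq number of the most recently received packet. It subtracts the first seq number received from the highest seq number received so far to get the expected number, and the received number is simply every packet received. So licode does handle out-of-order packets. One question remains, though: it does not solve the problem of "a packet with a duplicate sequence number or a bogus sequence number arriving". Let's see how Chrome does it.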
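Before moving on, the licode idea can be condensed into a small standalone sketch. It is not licode's code: `SimpleLossCounter` and `SeqLess` are hypothetical names, the per-report interval bookkeeping is dropped, and the loss is computed directly as expected minus received.

```cpp
#include <cstdint>

// True if `a` precedes `b` in 16-bit wraparound order (hypothetical helper).
inline bool SeqLess(uint16_t a, uint16_t b) {
  return a != b && static_cast<uint16_t>(b - a) < 0x8000;
}

// Tracks the first and the highest sequence number plus a wraparound count.
class SimpleLossCounter {
 public:
  void OnPacket(uint16_t seq) {
    ++received_;  // every arrival counts, including late and duplicate packets
    if (!started_) {
      started_ = true;
      base_seq_ = seq;
      max_seq_ = seq;
      return;
    }
    if (SeqLess(max_seq_, seq)) {     // seq is newer than the current maximum
      if (seq < max_seq_) ++cycles_;  // numerically smaller but newer: wrapped
      max_seq_ = seq;
    }
  }

  // expected - received, where expected = extended max seq - first seq + 1.
  int64_t CumulativeLost() const {
    if (!started_) return 0;
    const int64_t extended_max = (static_cast<int64_t>(cycles_) << 16) | max_seq_;
    const int64_t expected = extended_max - base_seq_ + 1;
    return expected - static_cast<int64_t>(received_);
  }

 private:
  bool started_ = false;
  uint16_t base_seq_ = 0;
  uint16_t max_seq_ = 0;
  uint32_t cycles_ = 0;
  uint64_t received_ = 0;
};
```

Feeding it 6, 7, 9, 11, 10 gives a loss of 1, so reordering no longer hides the missing packet 8. Feeding 10 a second time lowers the result to 0, which is exactly the duplicate-packet concern raised above.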

Implementation in Chrome

// Called after an RTP packet is received
void StreamStatisticianImpl::UpdateCounters(const RtpPacketReceived& packet) {
  /* ......
    a lot of unrelated code omitted
    ......*/
  int64_t now_ms = clock_->TimeInMilliseconds();
  // Decrement the cumulative loss count
  --cumulative_loss_;

  // Unwrap to get the extended sequence_number (handles wraparound)
  int64_t sequence_number =
      seq_unwrapper_.UnwrapWithoutUpdate(packet.SequenceNumber());

  if (!ReceivedRtpPacket()) {  // first RTP packet received
    // Record the seq number of the first RTP packet received; remember this
    received_seq_first_ = sequence_number;

    last_report_seq_max_ = sequence_number - 1;
    // Highest seq number received so far
    received_seq_max_ = sequence_number - 1;
    receive_counters_.first_packet_time_ms = now_ms;
  } else if (UpdateOutOfOrder(packet, sequence_number, now_ms)) {
    // This is the key step: check whether the packet is out of order
    // (implementation below); if it is, return immediately
    return;
  }
  // Handling for in-order packets


  // Count lost packets.
  // Example: sequence_number = 3, received_seq_max_ = 2, so cumulative_loss_ += 1
  // That cancels the decrement at the top. The net value only grows when packets were really lost
  cumulative_loss_ += sequence_number - received_seq_max_;
  // received_seq_max_ becomes the highest seq number
  received_seq_max_ = sequence_number;
}

// Check whether the packet just received is out of order
bool StreamStatisticianImpl::UpdateOutOfOrder(const RtpPacketReceived& packet,
                                              int64_t sequence_number,
                                              int64_t now_ms) {
    /* ......
    a little unrelated code omitted
    ......*/
  // received_seq_out_of_order_ serves two purposes:
  // 1. If it holds a value, the previous packet had a seq number far larger than the packet before it (I will call such a packet a "seq jump packet").
  // 2. The value it holds is the previous packet's seq number.
  // PS: when does a seq jump packet occur? Judging from the source comments, when the stream restarts.
  if (received_seq_out_of_order_) {
    // Decrement the cumulative loss count. As the logic below shows, when a seq jump packet arrives
    // its --cumulative_loss_ is deferred and applied here, on the packet that follows it
    --cumulative_loss_;
    // Predict this packet's seq number
    uint16_t expected_sequence_number = *received_seq_out_of_order_ + 1;
    // Clear the stored value
    received_seq_out_of_order_ = absl::nullopt;
    // If this packet directly follows the seq jump packet, the stream really has changed and the seq numbers must be re-based
    if (packet.SequenceNumber() == expected_sequence_number) {
      // The English comment below comes from the original source
      // Ignore sequence number gap caused by stream restart for packet loss
      // calculation, by setting received_seq_max_ to the sequence number just
      // before the out-of-order seqno. This gives a net zero change of
      // `cumulative_loss_`, for the two packets interpreted as a stream reset.
      //
      // Fraction loss for the next report may get a bit off, since we don't
      // update last_report_seq_max_ and last_report_cumulative_loss_ in a
      // consistent way.

      // Set received_seq_max_ to the packet just before the jump packet,
      // so that the two -- operations on cumulative_loss_ are added back; see the explanation below if this is unclear
      received_seq_max_ = sequence_number - 2;
      // Return false, so the in-order path continues
      return false;
    }
  }

  // Check whether this packet's seq num is too far from the highest one seen, i.e. whether this is a jump packet.
  if (std::abs(sequence_number - received_seq_max_) >
      max_reordering_threshold_) {
    // Sequence number gap looks too large, wait until next packet to check
    // for a stream restart.
    // The gap is too large: mark it and store the seq number.
    received_seq_out_of_order_ = packet.SequenceNumber();
    // Postpone counting this as a received packet until we know how to update
    // `received_seq_max_`, otherwise we temporarily decrement
    // `cumulative_loss_`. The
    // ReceiveStatisticsTest.StreamRestartDoesntCountAsLoss test expects
    // `cumulative_loss_` to be unchanged by the reception of the first packet
    // after stream reset.
    // Undo this packet's count, since cumulative_loss_ was decremented at the start. The comment says
    // this postpones the count; I am not entirely sure why the count needs to be postponed
    ++cumulative_loss_;
    // Return true, so the in-order path is skipped
    return true;
  }
  // A normally increasing packet: return false
  if (sequence_number > received_seq_max_)
    return false;
  // Reaching here means the packet is out of order
  return true;
}

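The Chrome source is a bit harder to follow than licode's.

Chrome's loss calculation works like this:

When a packet arrives, first decrement cumulative_loss_ by 1.

Check whether the packet is in order. If it is out of order (seq <= max seq), return. If it is in order, continue.

Add the gap to the loss count. If the seq number advanced by exactly 1, this cancels the decrement above.

So Chrome does handle out-of-order packets.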
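The three steps above can be sketched in isolation as follows. This is a simplified illustration rather than Chrome's code: `InOrderLossCounter` is a hypothetical name, the sequence numbers are assumed to be already unwrapped to 64 bits, and the seq jump handling is left out.

```cpp
#include <cstdint>

// Decrement-then-add loss counter, without stream-restart detection.
class InOrderLossCounter {
 public:
  // `seq` is assumed to be an already unwrapped sequence number.
  void OnPacket(int64_t seq) {
    --cumulative_loss_;  // step 1: every arrival counts as one received packet
    if (!started_) {
      started_ = true;
      max_seq_ = seq - 1;  // so the first packet contributes a gap of exactly 1
    } else if (seq <= max_seq_) {
      return;  // step 2: late or duplicate packet, nothing more to do
    }
    // step 3: a gap of 1 cancels the decrement; a larger gap adds the missing packets
    cumulative_loss_ += seq - max_seq_;
    max_seq_ = seq;
  }

  int64_t cumulative_loss() const { return cumulative_loss_; }

 private:
  bool started_ = false;
  int64_t max_seq_ = 0;
  int64_t cumulative_loss_ = 0;
};
```

For the arrivals 6, 7, 9, 11 the counter reads 2 (packets 8 and 10 look lost). When 10 then arrives late it drops to 1, and a second copy of 10 would bring it down to 0.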

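Handling of seq jump packets (including the cumulative_loss_ updates would make this confusing, so they are left out here):

The current packet is a seq jump packet: record its seq number (call it the out-of-order seq).

Receive the next packet and check whether its seq number equals the out-of-order seq + 1. If so, the seq numbers really have jumped, and received_seq_max_ must be updated so that later loss counts are correct.

If the seq number does not equal the out-of-order seq + 1, the seq jump packet was a bad packet: received_seq_max_ is left unchanged and counting continues from the current seq number.

This part is a bit convoluted. If it does not click, reread my code comments above or the source a couple more times.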
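To make the jump handling easier to trace, here is a self-contained sketch that mirrors the logic of the listing above, this time including the cumulative_loss_ updates. It is an approximation under stated assumptions, not Chrome's implementation: `RestartAwareLossCounter` is a hypothetical name, plain 64-bit sequence numbers stand in for the 16-bit value plus unwrapper, and a bool plus an integer replace the optional.

```cpp
#include <cstdint>
#include <cstdlib>

class RestartAwareLossCounter {
 public:
  explicit RestartAwareLossCounter(int64_t max_reordering)
      : max_reordering_(max_reordering) {}

  void OnPacket(int64_t seq) {
    --cumulative_loss_;  // count this arrival as received
    if (!started_) {
      started_ = true;
      max_seq_ = seq - 1;
    } else if (IsOutOfOrder(seq)) {
      return;
    }
    cumulative_loss_ += seq - max_seq_;  // add the gap since the highest seq
    max_seq_ = seq;
  }

  int64_t cumulative_loss() const { return cumulative_loss_; }

 private:
  bool IsOutOfOrder(int64_t seq) {
    if (has_pending_jump_) {
      --cumulative_loss_;  // now count the postponed jump packet as received
      const int64_t expected = pending_jump_seq_ + 1;
      has_pending_jump_ = false;
      if (seq == expected) {
        // Restart confirmed: re-base so the two packets cause no net change.
        max_seq_ = seq - 2;
        return false;
      }
    }
    if (std::abs(seq - max_seq_) > max_reordering_) {
      // Suspicious gap: remember it and wait for the next packet.
      has_pending_jump_ = true;
      pending_jump_seq_ = seq;
      ++cumulative_loss_;  // postpone counting this packet
      return true;
    }
    return seq <= max_seq_;  // late or duplicate packet
  }

  const int64_t max_reordering_;
  bool started_ = false;
  bool has_pending_jump_ = false;
  int64_t pending_jump_seq_ = 0;
  int64_t max_seq_ = 0;
  int64_t cumulative_loss_ = 0;
};
```

With a threshold of 50, the sequence 1, 2, 3, 1000, 1001, 1002 ends with a loss of 0, because the jump is accepted as a restart. The sequence 1, 2, 3, 1000, 4, 5 ends with a loss of -1: the stray packet 1000 is never treated as a restart, but it is still counted as a received packet.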

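Chrome's implementation is very careful, yet it still does not deal with duplicate packets in the RR, or with the counting of bogus seq jump packets. Is Chrome's code wrong? So I went and looked up how the RFC describes cumulative number of packets lost.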

Definition in the RFC

The RFC (RFC 3550, section 6.4.1) describes it as follows:

cumulative number of packets lost: 24 bits
      The total number of RTP data packets from source SSRC_n that have
      been lost since the beginning of reception.  This number is
      defined to be the number of packets expected less the number of
      packets actually received, where the number of packets received
      includes any which are late or duplicates.  Thus, packets that
      arrive late are not counted as lost, and the loss may be negative
      if there are duplicates.  The number of packets expected is
      defined to be the extended last sequence number received, as
      defined next, less the initial sequence number received.  This may
      be calculated as shown in Appendix A.3.

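One sentence in this passage is key: This number is defined to be the number of packets expected less the number of packets actually received, where the number of packets received includes any which are late or duplicates

In other words, the value is the number of packets expected minus the number of packets actually received, and the received count includes packets that arrive late or are duplicates.

So cumulative number of packets lost is not really a count of lost packets. It is just a difference, and it can even be negative.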
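As a rough illustration of that definition, the sketch below computes both RR fields from running totals, along the lines of RFC 3550 Appendix A.3. The function names are made up for this example. The 24-bit clamp follows the appendix, while the cap of the fraction at 255 is an extra assumption added here so the value fits in its 8-bit field.

```cpp
#include <algorithm>
#include <cstdint>

// Cumulative number of packets lost: expected - received, clamped to the
// signed 24-bit range of the RR field. Negative values are allowed.
int32_t ClampedCumulativeLost(int64_t expected, int64_t received) {
  const int64_t lost = expected - received;
  const int64_t kMax = 0x7fffff;   // largest positive 24-bit value
  const int64_t kMin = -0x800000;  // most negative 24-bit value
  return static_cast<int32_t>(std::max(kMin, std::min(kMax, lost)));
}

// Fraction lost for one reporting interval, as an 8-bit fixed-point number
// (lost / expected scaled by 256). Non-positive loss is reported as 0.
uint8_t FractionLost(int64_t expected_interval, int64_t received_interval) {
  const int64_t lost_interval = expected_interval - received_interval;
  if (expected_interval <= 0 || lost_interval <= 0) return 0;
  const int64_t scaled = (lost_interval << 8) / expected_interval;
  // Assumption: cap at 255 so a fully lost interval still fits in 8 bits.
  return static_cast<uint8_t>(std::min<int64_t>(scaled, 255));
}
```

With 5 packets expected and 6 received (one duplicate), `ClampedCumulativeLost` returns -1, which is the negative case the RFC text allows for.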

