Detailed explanation of RTCP protocol (SR, RR, SDES, BYE, APP, NACK, TCC, PLI, SLI, FIR)

The RTCP specification defines five types of RTCP packets: receiver report (RR), sender report (SR), source description (SDES), membership management (BYE), and application-defined (APP). Two further packet types, RTPFB and PSFB, carry feedback messages.
SR: payload type = 200
RR: payload type = 201
SDES: payload type = 202
BYE: payload type = 203
APP: payload type = 204
RTPFB: payload type = 205
PSFB: payload type = 206

RTCP_RTP_FB_NACK_FMT(1): NACK (retransmission request), type 205

RTCP_RTP_FB_RTX_FMT(1): RTX retransmission, type 205

RTCP_RTP_FB_CC_FMT(15): Transport-cc bandwidth estimation, type 205

RTCP_PLI_FMT(1): Picture Loss Indication, type 206

RTCP_SLI_FMT(2): Slice Loss Indication, type 206

RTCP_FIR_FMT(4): Full Intra Request (key frame request), type 206

RTCP_REMB_FMT(15): REMB bandwidth estimation, type 206

Version number (V): The current version of RTP is 2 (as of the writing of this book). There are currently no plans for a new version, and earlier versions are not widely used.

Padding (P): The padding bit indicates that the packet has been padded beyond its natural size. If this bit is set to 1, one or more padding octets have been added to the end of the packet, and the last octet contains the number of padding octets that were added.

Item Count (IC): Some packet types contain a list of items, possibly in addition to fixed, type-specific information. The item count field gives the number of items in the packet (the field has a different name in different packet types, depending on its use). Each RTCP packet can hold up to 31 items, subject also to the MTU (maximum transmission unit) of the network. If more than 31 items are needed, the application must generate multiple RTCP packets. An item count of 0 means the list of items is empty (which does not necessarily mean the packet is empty). Packet types that do not need an item count may use this field for other purposes.

Packet Type (PT): This field identifies the type of information carried in the packet. Five standard packet types are defined in the RTP specification, and others may be defined in the future (for example, to report additional statistics or to convey other source-specific information).

Length: This field gives the length of the packet contents following the common header (equivalently, the total packet length in 32-bit words minus one). It is measured in 32-bit words because every RTCP packet is a multiple of 32 bits long, so counting octets would only make inconsistent values possible. 0 is a valid length, indicating that the packet consists only of the 4-octet header (the header field IC is also 0 in this case).
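The header fields described above can be pulled out of the first four bytes of any RTCP packet. The following is a minimal sketch, not a reference implementation; the function name `parse_rtcp_header` and the dictionary keys are illustrative choices, not part of any library.

```python
import struct

def parse_rtcp_header(data: bytes) -> dict:
    """Parse the 4-byte common RTCP header (hypothetical helper)."""
    if len(data) < 4:
        raise ValueError("an RTCP header needs at least 4 bytes")
    first, packet_type, length_words = struct.unpack("!BBH", data[:4])
    return {
        "version": first >> 6,                   # V: top 2 bits
        "padding": (first >> 5) & 0x01,          # P: 1 bit
        "item_count": first & 0x1F,              # IC (RC/SC/FMT): low 5 bits
        "packet_type": packet_type,              # PT: 200 = SR, 201 = RR, ...
        "length_words": length_words,            # 32-bit words after the header
        "total_bytes": (length_words + 1) * 4,   # header included
    }

# Header of an RR carrying one report block: V=2, P=0, RC=1, PT=201, length=7
header = parse_rtcp_header(bytes([0x81, 0xC9, 0x00, 0x07]))
print(header)
```

A length of 7 words plus the 4-byte header gives 32 bytes, which matches one reporter SSRC (4 bytes) and one 24-byte report block.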

The structure of the compound packet in the RTCP packet

The format of RR in RTCP :

Each report block describes the reception quality of a single synchronization source from which the reporter received RTP packets during the current reporting interval. An RTCP RR packet can hold at most 31 report blocks. If there are more than 31 active senders, the receiver SHOULD send multiple RR packets in one compound packet. Each report block has 7 fields, for a total of 24 bytes.

Reportee SSRC: identifies the participant to which this report block relates. The statistics in the report block describe the quality with which the participant that generated the RR packet received packets from that synchronization source.

The loss fraction is defined as the number of packets lost during this reporting interval, divided by the number expected to arrive. It is expressed as a fixed-point number with the binary point at the left edge of the field, that is, the integer part of the loss rate multiplied by 256 (for example, if 1/4 of the packets are lost, the loss fraction is 1/4 * 256 = 64). If more packets were received than expected (because of duplicate packets), so that the number of lost packets is negative, the loss fraction is set to 0.

Cumulative packet loss is a 24-bit signed integer representing the number of packets that were expected to arrive, minus the number of packets actually received. The expected number of packets is defined as the extended highest sequence number received, minus the initial sequence number received. The number of packets received includes any late arrivals or retransmissions, so it may be larger than expected, and the cumulative number of lost packets may therefore be negative. Cumulative packet loss is counted over the entire session, not per reporting interval. If the total number of packets lost during the session exceeds 0x7FFFFF, the field saturates at its maximum value of 0x7FFFFF.

In theory: packets lost = expected number of packets - number of packets actually received.

In practice, the expected number of packets is derived from sequence numbers: expected = extended highest sequence number received - sequence number of the first packet received (the sample code in RFC 3550 adds 1 so that both endpoints are counted).
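As a rough illustration of the two loss fields, the sketch below computes the loss fraction for one interval and the cumulative loss for the session. The function name and the four counters passed in are assumptions made for this example; a real receiver would maintain them per source.

```python
def loss_stats(expected, received, expected_prior, received_prior):
    """Return (fraction_lost, cumulative_lost) for one report block.

    expected/received are session totals; the *_prior values are the
    totals recorded when the previous report was sent (hypothetical names).
    """
    # Cumulative loss over the whole session, saturating at 0x7FFFFF.
    cumulative = min(expected - received, 0x7FFFFF)

    # Loss fraction over the last reporting interval only.
    expected_interval = expected - expected_prior
    received_interval = received - received_prior
    lost_interval = expected_interval - received_interval
    if expected_interval <= 0 or lost_interval <= 0:
        fraction = 0  # duplicates can make the loss negative
    else:
        # Fixed point with the binary point at the left edge (x 256).
        fraction = min(255, (lost_interval * 256) // expected_interval)
    return fraction, cumulative

# 100 packets expected in the interval, 75 received: 1/4 lost.
print(loss_stats(expected=200, received=175, expected_prior=100, received_prior=100))
```

The clamp to 255 is a choice made here so that the value always fits in 8 bits when every packet in the interval is lost.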

The extended highest sequence number received is calculated from the RTP data packets of the synchronization source. Because packets may be reordered, it is not necessarily the extended sequence number of the last RTP packet received. The extended sequence number is computed over the whole session, not per reporting interval.

extended_seq_num = seq_num + (65536 * wrap_around_count)

where wrap_around_count is the number of times the 16-bit sequence number has wrapped around.
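One way to maintain wrap_around_count is to compare each incoming sequence number with the highest one seen so far, modulo 2^16. The class below is a simplified sketch under that assumption; the name `SeqExtender` is hypothetical, and it omits the probation and restart handling found in production receivers.

```python
class SeqExtender:
    """Track 16-bit RTP sequence numbers and count wrap-arounds (sketch)."""

    def __init__(self, first_seq):
        self.max_seq = first_seq
        self.wrap_around_count = 0

    def update(self, seq):
        """Feed one sequence number; return the extended highest sequence number."""
        # Forward distance modulo 2^16: a small value means seq is newer.
        forward = (seq - self.max_seq) & 0xFFFF
        if 0 < forward < 0x8000:
            if seq < self.max_seq:
                # Newer packet with a numerically smaller value: wrapped.
                self.wrap_around_count += 1
            self.max_seq = seq
        # Old or reordered packets leave the highest value unchanged.
        return self.max_seq + 65536 * self.wrap_around_count

ext = SeqExtender(65534)
results = [ext.update(s) for s in (65535, 0, 65533, 1)]
print(results)
```

The third packet (65533) arrives late, after the wrap, so it does not change the extended highest sequence number.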

Interarrival jitter is an estimate of the statistical variance of the network transit time of the data packets sent by the reportee synchronization source. It is measured in timestamp units, so it is represented as a 32-bit unsigned integer, like an RTP timestamp.

 J(i) = J(i-1) + (|D(i-1,i)| - J(i-1))/16

D(i,j) = (Rj - Ri) - (Sj - Si) = (Rj - Sj) - (Ri - Si)

i    Si    Ri    D(i-1,i)    J(i)
1    0     10    0           0
2    20    30    0           0
3    40    49    -1          0.0625
4    60    74    5           0.3711
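The table can be reproduced with a few lines of code. This is a minimal sketch of the two formulas above; the function name is illustrative, and send and receive times are assumed to be in the same timestamp units.

```python
def interarrival_jitter(send_times, recv_times):
    """Return the running jitter J(i) for each packet, starting from J = 0."""
    jitter = 0.0
    history = [jitter]
    for i in range(1, len(send_times)):
        # D(i-1,i) = (Ri - Ri-1) - (Si - Si-1)
        d = (recv_times[i] - recv_times[i - 1]) - (send_times[i] - send_times[i - 1])
        # J(i) = J(i-1) + (|D(i-1,i)| - J(i-1)) / 16
        jitter += (abs(d) - jitter) / 16
        history.append(jitter)
    return history

history = interarrival_jitter([0, 20, 40, 60], [10, 30, 49, 74])
print(history)
```

The gain of 1/16 smooths the estimate, so a single late packet moves the reported jitter only slightly.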

The last sender report (LSR) timestamp is the middle 32 bits of the 64-bit NTP (Network Time Protocol) timestamp carried in the most recent RTCP SR packet received from the reportee SSRC. If no SR has been received yet, this field is set to 0.

The delay since last sender report (DLSR) is the delay between receiving the last SR packet from the reportee SSRC and sending this report block, expressed in units of 1/65,536 seconds. If no SR has been received from the reportee, the DLSR field is set to 0.

The sender can use the LSR and DLSR fields to calculate the round-trip time (RTT) between itself and each receiver. When an RR packet arrives, the sender subtracts the LSR field from the current time to obtain the delay between sending the SR and receiving the RR. The sender then subtracts the DLSR field to remove the time the receiver held the report, leaving the network round-trip time.

RTT = NTP - LSR - DLSR, where NTP is the arrival time of the RR expressed as the middle 32 bits of the 64-bit NTP timestamp.
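Because all three quantities share the same 32-bit format (16 bits of seconds, 16 bits of fraction), the subtraction can be done directly on the raw values. The sketch below assumes the caller has already reduced the arrival time to its middle 32 bits; the function name is hypothetical.

```python
def rtt_seconds(arrival_ntp_mid32, lsr, dlsr):
    """Round-trip time from an RR report block, in seconds (sketch)."""
    # All values are in units of 1/65536 s; arithmetic wraps modulo 2^32.
    rtt_units = (arrival_ntp_mid32 - lsr - dlsr) & 0xFFFFFFFF
    return rtt_units / 65536.0

# SR sent at 1.0 s, receiver held it for 0.5 s, RR arrived at 2.0 s.
rtt = rtt_seconds(arrival_ntp_mid32=0x00020000, lsr=0x00010000, dlsr=0x00008000)
print(rtt)
```

A report block whose LSR is 0 means no SR was received, so no RTT should be computed from it.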

Format of SR packet in RTCP

The packet type of the sender report is 200. The payload contains 24 bytes of sender information (the sender's SSRC followed by the NTP timestamp, RTP timestamp, packet count and byte count), followed by zero or more receiver report blocks, whose number is given by the RC field, as in a receiver report. Receiver report blocks are present when the sender is also a receiver.

The NTP timestamp is a 64-bit unsigned value representing the time when this RTCP SR packet was sent. It is in NTP timestamp format: the upper 32 bits count seconds since January 1, 1900, and the lower 32 bits represent fractions of a second (that is, a 64-bit fixed-point value with the binary point after 32 bits). To convert a UNIX timestamp (seconds since January 1, 1970) to NTP time, add 2,208,988,800 seconds.
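The conversion can be sketched as follows. The helper names are illustrative, and the code assumes a non-negative UNIX time given as a float.

```python
NTP_UNIX_OFFSET = 2_208_988_800  # seconds between 1900-01-01 and 1970-01-01

def unix_to_ntp(unix_seconds):
    """Convert UNIX seconds (float) to a 64-bit NTP timestamp."""
    whole = int(unix_seconds)
    seconds = (whole + NTP_UNIX_OFFSET) & 0xFFFFFFFF
    fraction = int((unix_seconds - whole) * (1 << 32)) & 0xFFFFFFFF
    return (seconds << 32) | fraction

def ntp_middle_32(ntp64):
    """Middle 32 bits of an NTP timestamp, as used by the LSR field."""
    return (ntp64 >> 16) & 0xFFFFFFFF

ntp = unix_to_ntp(0.5)
print(hex(ntp), hex(ntp_middle_32(ntp)))
```

The middle 32 bits keep the low 16 bits of the seconds and the high 16 bits of the fraction, which is why LSR and DLSR have a resolution of 1/65,536 seconds.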

The RTP timestamp corresponds to the same instant as the NTP timestamp, but it is expressed in units of the RTP media clock. This value is usually not the same as the RTP timestamp of the previous data packet, because some time has passed since the data in that packet was sampled.

The sender's packet count is the total number of data packets this synchronization source has generated since the start of the session. The sender's byte count is the total number of bytes contained in the payloads of those packets (not including headers or padding). If the sender changes its SSRC (for example, due to a collision), both fields are reset.

RTCP SDES: Source Description

RTCP can also be used to deliver Source Description (SDES) packets, which provide participant identification and supplementary details such as location, email address and phone number.

An application may generate a packet with an empty list of SDES items, in which case the SC and length fields in the RTCP common header are both 0. In normal use SC is 1 (mixers and translators that aggregate forwarded information generate packets with larger lists of SDES items).

 

The items in each SDES chunk are packed into the packet contiguously, with no separation or padding. The list of items is terminated by one or more null bytes, the first of which is interpreted as an item of type 0 and marks the end of the list. No length byte follows the type 0 byte, but additional null bytes must be included if padding is needed to reach the next 32-bit boundary. This padding is separate from the padding indicated by the P bit in the RTCP header. A list with zero items (four null bytes) is valid, but useless.

The CNAME item (type=1) provides each participant with a canonical name (CNAME). It is a stable and persistent identifier independent of the synchronization source (the SSRC changes if the application is restarted or an SSRC collision occurs). The CNAME can be used to associate multiple media streams of one participant across different RTP sessions (for example, voice and video that need to be synchronized), and to name a participant across restarts of a media tool. It is the only mandatory SDES item: all implementations are required to send the SDES CNAME item.
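The packing rules for SDES items can be illustrated with a single chunk that carries only a CNAME. This is a simplified sketch: the helper names are hypothetical, only one item is written, and the parser does no bounds checking on malformed input.

```python
import struct

SDES_CNAME = 1  # item type of the CNAME item

def build_sdes_chunk(ssrc, cname):
    """Build one SDES chunk: SSRC, a CNAME item, terminator and padding."""
    text = cname.encode("utf-8")
    if len(text) > 255:
        raise ValueError("an SDES item is limited to 255 bytes")
    chunk = struct.pack("!I", ssrc) + bytes([SDES_CNAME, len(text)]) + text
    chunk += b"\x00"                      # type 0 ends the item list
    chunk += b"\x00" * (-len(chunk) % 4)  # pad to a 32-bit boundary
    return chunk

def parse_sdes_items(chunk):
    """Return [(item_type, text), ...] for one chunk, skipping the SSRC."""
    items, pos = [], 4
    while pos < len(chunk) and chunk[pos] != 0:
        item_type, length = chunk[pos], chunk[pos + 1]
        items.append((item_type, chunk[pos + 2:pos + 2 + length].decode("utf-8")))
        pos += 2 + length
    return items

chunk = build_sdes_chunk(0x12345678, "a@b")
print(len(chunk), parse_sdes_items(chunk))
```

Here the chunk is 10 bytes before padding, so two extra null bytes bring it to 12.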

RTCP BYE: membership control

RTCP provides loose membership control through RTCP BYE packets, which indicate that some participants have left the session. A BYE packet is generated when a participant leaves the session, or when a participant changes its SSRC, for example because of a collision. BYE packets may be lost in transit, and some applications do not generate them. Therefore a receiver should time out participants that have not been heard from for some time, even if no BYE packet has been received.

An RTCP BYE packet does not terminate any other association between participants. BYE packets are identified by packet type 203, and the RC field in the common RTCP header indicates the number of SSRC identifiers in the packet. A value of 0 is valid but useless. When a BYE packet is received, implementations SHOULD assume that the listed sources have left the session and ignore any subsequent RTP and RTCP packets from them. It is important to keep state for a departing participant for some time after the BYE packet is received, to allow for delayed data packets.

The BYE packet may also contain text indicating the reason for leaving the session, suitable for display in the user interface. This text is optional, but an implementation must be prepared to receive it (even though the text may be ignored).

RTCP APP: Application-defined RTCP packets

The last category of RTCP packet (APP) allows applications to define their own extensions. Its packet type is 204. The packet carries a four-character name that is intended to identify the extension uniquely. Each character is chosen from the ASCII character set, and uppercase and lowercase are treated as distinct. It is recommended that the packet name match the application it represents, with the choice of subtype values coordinated by the application. The remainder of the packet is application-specific.

Application-defined packets are used for non-standard extensions to RTCP and for experimenting with new features. The intent is that experimenters first try new features using APP, and then register a new packet type if the features see wider use. Several applications generate APP packets, and implementations should be prepared to ignore APP packets they do not recognize.

Packing issues

RTCP packets are never sent individually; they are grouped into a compound packet for transmission. If the participant generating the compound RTCP packet is an active data sender, the compound packet must start with an RTCP SR packet. Otherwise it must start with an RTCP RR packet. This holds even if no data has been sent or received, in which case the SR/RR packet contains no receiver report blocks (header field RC is 0). On the other hand, if data is received from many sources and there are too many reports to fit into one SR/RR packet, the compound packet should begin with one SR/RR packet followed by several RR packets. The SR/RR packets are followed by an SDES packet. This packet must contain a CNAME item and may contain other items. How often other (non-CNAME) SDES items are included is determined by the RTP profile in use.

If a BYE packet is sent, it must be the last packet in the compound packet. Other RTCP packets to be sent may appear in any order. These strict ordering rules are intended to make packet validation easier, since misdirected packets will most likely not satisfy these constraints.

A potential problem when generating compound RTCP packets is how to handle sessions with a large number of active senders. If there are more than 31 active senders, additional RR packets must be added to the compound packet. This can be repeated as needed until the MTU limit is reached. If there are so many senders that the receiver reports cannot fit within the MTU, the reception reports for some senders must be omitted. In that case, the omitted reports should be included in the next compound packet generated (which requires the receiver to keep track of the sources reported on in each interval).

Sometimes it is necessary to pad a compound RTCP packet beyond its natural size. In that case the padding is added only to the last RTCP packet in the compound packet, and the P bit is set in that last packet.
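A receiver can walk a compound packet using only the length field of each common header, then check the ordering rule for the first packet. The sketch below is illustrative; the function names are hypothetical and only the first-packet rule is validated.

```python
import struct

def split_compound(data):
    """Split a compound RTCP packet into (packet_type, raw_bytes) tuples."""
    packets, pos = [], 0
    while pos + 4 <= len(data):
        _, packet_type, length_words = struct.unpack("!BBH", data[pos:pos + 4])
        size = (length_words + 1) * 4  # length counts words after the header
        if pos + size > len(data):
            raise ValueError("truncated RTCP packet")
        packets.append((packet_type, data[pos:pos + size]))
        pos += size
    return packets

def starts_with_report(packets):
    """A valid compound packet begins with SR (200) or RR (201)."""
    return bool(packets) and packets[0][0] in (200, 201)

# An empty RR (RC=0) followed by an SDES packet with one 16-byte chunk.
rr = bytes([0x80, 201, 0, 1]) + b"\x00\x00\x00\x01"
sdes = bytes([0x81, 202, 0, 4]) + b"\x00" * 16
packets = split_compound(rr + sdes)
print([pt for pt, _ in packets], starts_with_report(packets))
```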

Retransmission request NACK (RTPFB-FMT(1))

Supporting retransmission requests takes two steps: a packet format must be defined for the request, and the RTCP timing rules must be modified to allow immediate feedback.

The format of a negative acknowledgment (NACK) is shown in Figure 9.11. A NACK contains a packet identifier (PID) naming a lost packet and a 16-bit bitmap (BLP) showing which of the following 16 packets were also lost, with a value of 1 indicating loss.

Feedback packets are sent as part of a compound RTCP packet in the same way as all other RTCP packets. They are placed at the end of the compound packet, after the SR/RR and SDES packets.
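To make the identifier-plus-bitmap layout concrete, the sketch below packs a list of lost sequence numbers into NACK entries of 4 bytes each. It assumes the bit numbering of RFC 4585, where the least significant bit of the bitmap refers to the packet immediately after the identifier. The function name is hypothetical, and sequence number wrap-around is deliberately ignored to keep the example short.

```python
import struct

def build_nack_fci(lost_seqs):
    """Pack lost sequence numbers into (PID, BLP) pairs, 4 bytes each."""
    fci = b""
    pending = sorted(set(lost_seqs))
    while pending:
        pid = pending[0]   # first lost packet of this entry
        blp = 0            # bitmap for the 16 packets after pid
        rest = []
        for seq in pending[1:]:
            offset = seq - pid
            if 1 <= offset <= 16:
                blp |= 1 << (offset - 1)
            else:
                rest.append(seq)  # needs its own entry
        fci += struct.pack("!HH", pid, blp)
        pending = rest
    return fci

fci = build_nack_fci([100, 101, 103, 116, 117])
print(fci.hex())
```

Packets 101, 103 and 116 fit in the bitmap of the entry for 100, while 117 is 17 packets away and starts a second entry.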

RTX retransmission

Retransmissions are requested with the same feedback message as NACK (PT=205, FMT=1).

1. How RTX appears in SDP:

a=rtpmap:97 rtx/90000
a=fmtp:97 apt=96

a=ssrc-group:FID ssrc rtxssrc

97 is the RTX payload type, and 90000 is the clock rate, which is generally the same as the clock rate of the stream being retransmitted.

a=fmtp:97 apt=96 means that RTX packets with payload type 97 retransmit packets whose payload type is 96.

a=ssrc-group:FID ssrc rtxssrc associates the SSRC of the original stream with the SSRC of its RTX stream.

In addition, the RTX payload type 97 must appear in the corresponding m line of the SDP.

In an RTX packet, the Original RTP Packet Payload is the payload of the packet being retransmitted and OSN is the sequence number of the original packet. The RTP header is the same as that of the original packet except for the SSRC, payload type and sequence number, which belong to the RTX stream.

RTX principle: the packet to be retransmitted is encapsulated in an RTX packet and sent. The RTX packet has a different SSRC and a different RTP sequence number from the original RTP packet, but its timestamp is the same as the timestamp of the lost packet.

RTX advantage: because retransmissions travel on their own SSRC, they can be kept out of the bandwidth estimation and loss statistics of the original stream. Without RTX, retransmitted packets are counted as received packets of the original stream, so the measured packet loss sometimes becomes negative.

RTX payload: the first two bytes carry the RTP sequence number of the lost packet, so an RTX packet is 2 bytes longer than the RTP packet it retransmits.
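The 2-byte prefix can be shown in a few lines. The helpers below are a sketch that handles only the payload; the RTP header with the RTX SSRC, payload type and sequence number is assumed to be written elsewhere, and the function names are illustrative.

```python
import struct

def build_rtx_payload(original_seq, original_payload):
    """Prefix the original payload with the 2-byte OSN."""
    return struct.pack("!H", original_seq) + original_payload

def parse_rtx_payload(rtx_payload):
    """Return (original sequence number, original payload)."""
    (osn,) = struct.unpack("!H", rtx_payload[:2])
    return osn, rtx_payload[2:]

rtx = build_rtx_payload(4660, b"media-bytes")
print(len(rtx), parse_rtx_payload(rtx))
```

On the receiving side, the OSN lets the retransmitted payload be placed back at its original position in the stream.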

Temporary maximum media bit rate request TMMBR (RTPFB-FMT(3))

Deprecated in WebRTC

Temporary maximum media bit rate notification TMMBN (RTPFB-FMT(4))

Response to TMMBR

Deprecated in WebRTC

PLI(PSFB-FMT(1))

The FCI is empty.

The decoder sends a PLI message to tell the encoder that encoded data for the picture it is trying to decode has been lost. With video codecs based on inter-frame prediction, the encoder learns from the PLI message that video data was lost. Because a predicted frame can only be decoded when the frames it references are complete (for example, an H.264 B-frame references both earlier and later frames), losing earlier data means that the following frames cannot be decoded into a correct image. The encoder can then generate a key frame and send it to the decoder.

SLI(PSFB-FMT(2))

。。。 

FIR (PSFB-FMT(4))

When the decoder needs to be refreshed, it can send a FIR message to the encoder, and the encoder responds by sending a key frame. This is similar to the PLI message, but PLI is used to report packet loss, whereas FIR is used in situations that do not involve packet loss. Two examples:
1) When the decoder needs to switch to a different video, new decoding parameters are required. It can send a FIR message asking the encoder to generate a key frame, obtain the new decoding parameters, and refresh the video decoder.
2) In a video conference, new users join at arbitrary times, and the frames being sent by each encoder at that moment are not necessarily key frames, so a new user may be unable to decode. The newly joined user sends a FIR message asking each encoder for a key frame, and can decode normally once the key frame arrives.

REMB(PSFB-FMT(15)) 

REMB describes an absolute-value option for bandwidth estimation. This feedback message is used to notify a sender of multiple media streams in the same RTP session of the total estimated available bit rate on the path to the receiver of that RTP session.

In the Common Packet Header for Feedback messages (as defined in Section 6.1 of [RFC4585]), the "SSRC of Packet Sender" field indicates the source of the notification. "SSRC of Media Source" is not used and should be set to 0. This use of the value zero also appears in other RFCs.

When a media sender conforming to this specification receives a REMB message, the total bit rate it sends on the RTP session the message applies to must be equal to or lower than the bit rate in the message. The new bit rate limit should be applied as soon as reasonably possible. Senders are free to apply additional bandwidth limits according to their own constraints and estimates.

How to implement REMB?

  1. The SDP contains the following attributes
a=rtcp-fb:<payload type> goog-remb
a=extmap:3 http://www.webrtc.org/experiments/rtp-hdrext/abs-send-time

V=2, P=0, FMT=15, PT=206, SSRC of media source=0

Unique identifier: 'R' 'E' 'M' 'B' (four ASCII characters)

Num SSRC (8 bits): number of SSRC entries that follow.

BR Exp (6 bits): exponent used to scale the bit rate mantissa.

BR Mantissa (18 bits): mantissa of the estimated bit rate.

SSRC feedback (32 bits each): one or more SSRC entries to which this feedback message applies.

receiver-bit-rate = mantissa * 2^exp

example:
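As a worked illustration of receiver-bit-rate = mantissa * 2^exp, the sketch below converts a bit rate to an exponent and mantissa and back. It assumes the field widths given in the REMB draft (6-bit exponent, 18-bit mantissa); the function names are hypothetical.

```python
MAX_MANTISSA = 0x3FFFF  # largest value that fits in 18 bits

def encode_remb_bitrate(bitrate_bps):
    """Return (exp, mantissa) such that mantissa * 2**exp <= bitrate_bps."""
    exp, mantissa = 0, bitrate_bps
    while mantissa > MAX_MANTISSA:
        mantissa >>= 1  # drop one low bit, losing a little precision
        exp += 1
    return exp, mantissa

def decode_remb_bitrate(exp, mantissa):
    """receiver-bit-rate = mantissa * 2^exp"""
    return mantissa << exp

exp, mantissa = encode_remb_bitrate(1_000_000)
print(exp, mantissa, decode_remb_bitrate(exp, mantissa))
```

For 1,000,000 bit/s the value is halved twice before it fits in 18 bits, giving exp = 2 and mantissa = 250000.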

Transport-CC(RTPFB-FMT(15))

Transport-cc stands for Transport-wide Congestion Control. WebRTC's latest congestion control algorithm (send-side BWE) is based on Transport-cc. The receiver records the arrival time of each data packet, builds the corresponding RTCP packets and feeds them back to the sender, and the sender estimates the bandwidth and performs congestion control. Using Transport-cc in WebRTC requires an RTP header extension and a new RTCP message type. This section introduces the RTP and RTCP parts of Transport-cc.

message format

In Transport-cc, the receiving client feeds back the arrival time of each received RTP packet to the sender through TransportFeedback RTCP packets. The fields of the TransportFeedback packet are as follows:

  • base sequence number: 2 bytes, the transport sequence number of the first RTP packet recorded in this TransportFeedback packet. Across successive TransportFeedback packets this field does not necessarily increase; it may be smaller than in the previous RTCP packet
  • packet status count: 2 bytes, the number of RTP packets recorded in this TransportFeedback packet. Their transport sequence numbers are counted from the base sequence number: the first recorded RTP packet has transport sequence number equal to base sequence number, the second has base sequence number + 1, and so on
  • reference time: 3 bytes, the reference time in units of 64 ms. The arrival times of the RTP packets recorded in this RTCP packet are calculated relative to the reference time
  • feedback packet count: 1 byte, a counter incremented for each TransportFeedback packet sent, equivalent to a sequence number for these RTCP packets. It can be used to detect loss of TransportFeedback packets
  • packet chunk: 2 bytes, records the arrival status of RTP packets. The transport sequence numbers of the recorded packets are calculated from the base sequence number
  • recv delta: 8 bits (or 16 bits for a large or negative delta). For each packet in a "packet received" state, an arrival time interval is appended to the recv delta list. Together with the reference time, the recv delta values give the arrival time of each RTP packet

packet chunk

Each RTP packet has a status. Four status values are currently defined. Each is 2 bits wide and describes whether the packet arrived and how its time interval relative to the previous RTP packet is encoded:

  • 00-Packet not received
  • 01-Packet received, small delta
  • 10-Packet received, large or negative delta
  • 11-[Reserved]

There are two types of packet chunks, Run length chunk (run length encoded data block) and Status vector chunk (state vector encoded data block), corresponding to the two encoding methods of the packet chunk structure. The first bit of the packet chunk identifies the chunk type.

Let's first look at run length encoding, a simple data compression algorithm. Its basic idea is to describe a character that repeats several times in a row as "number of consecutive occurrences + character". For example, aaabbbcdddd can be compressed into 3a3bc4d. In a Run length chunk, run length encoding is used to record several consecutive packets that share the same status.

The first bit of the Run length chunk is 0, followed by packet status and run length. The format is as follows:

       0                   1
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |T| S |       Run Length        |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

  • chunk type (T): 1 bit, the value is 0
  • packet status symbol (S): 2 bits, identifies the status of the packets
  • run length (L): 13 bits, the number of consecutive packets that have this status

The following example illustrates.

       0                   1
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |0|0 0|0 0 0 0 0 1 1 0 1 1 1 0 1|
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

The packet status is 00, which is "Packet not received" in the status list above, and the run length is 221 (binary 11011101), so 221 consecutive packets are in the "Packet not received" state.
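The example chunk can be decoded with a few shifts and masks. This is a minimal sketch; the function name is hypothetical, and the chunk is passed as a 16-bit integer already read in network byte order.

```python
def decode_run_length_chunk(chunk):
    """Return (packet_status, run_length) for a 16-bit run length chunk."""
    if chunk >> 15 != 0:
        raise ValueError("chunk type bit is 1: not a run length chunk")
    status = (chunk >> 13) & 0x3   # S: 2 bits after the type bit
    run_length = chunk & 0x1FFF    # L: low 13 bits
    return status, run_length

# T=0, S=00, run length = 0000011011101
decoded = decode_run_length_chunk(0b0000000011011101)
print(decoded)
```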

Status Vector Chunk

The first bit is 1, followed by symbol size and symbol list. The format is as follows:

        0                   1
        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |T|S|       symbol list         |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

  • chunk type (T): 1 bit, the value is 1
  • symbol size (S): 1 bit. 0 means that only the "packet not received" (0) and "packet received" (1) states are used and each state takes 1 bit, so the 14-bit symbol list holds the states of 14 packets. 1 means that each state takes 2 bits, so the symbol list holds the states of only 7 packets
  • symbol list: 14 bits, the states of a series of packets, 7 or 14 in total

        0                   1
        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |1|0|0 1 1 1 1 1 0 0 0 1 1 1 0 0|
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

The symbol size is 0, which can identify the status of 14 packets. The status of the first packet is "packet not received" (0), then the status of the next 5 packets is "packet received" (1), the status of the next three packets is "packet not received", and the status of the next three packets is "packet received", and the last two packets have a status of "packet not received".

        0                   1
        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |1|1|0 0 1 1 0 1 0 1 0 1 0 0 0 0|
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

The symbol size is 1, so the chunk holds the states of only 7 packets. The first packet is in the "packet not received" (00) state, the second is in the "packet received, w/o timestamp" (11) state (a value that the status list above marks as reserved), the next three are in the "packet received" (01) state, and the last two are in the "packet not received" (00) state.
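Both status vector examples can be checked with a small decoder. The sketch below returns the raw symbol values and leaves their interpretation to the caller; the function name is hypothetical, and the chunk is passed as a 16-bit integer.

```python
def decode_status_vector_chunk(chunk):
    """Return the list of status symbols in a 16-bit status vector chunk."""
    if chunk >> 15 != 1:
        raise ValueError("chunk type bit is 0: not a status vector chunk")
    symbol_bits = 2 if (chunk >> 14) & 1 else 1   # S bit selects the width
    count = 14 // symbol_bits                     # 14 or 7 symbols
    mask = (1 << symbol_bits) - 1
    # Symbols are stored most significant first in the low 14 bits.
    return [(chunk >> (14 - symbol_bits * (i + 1))) & mask for i in range(count)]

one_bit = decode_status_vector_chunk(0b1001111100011100)
two_bit = decode_status_vector_chunk(0b1100110101010000)
print(one_bit)
print(two_bit)
```

With 1-bit symbols, 0 and 1 mean "not received" and "received"; with 2-bit symbols, the values are the four status codes listed earlier.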

Receive Delta

A receive delta is expressed in units of 250 us (0.25 ms) and gives the interval between the arrival time of an RTP packet and the arrival time of the previous RTP packet. For the first recorded RTP packet, the interval is relative to the reference time.

  • If a packet chunk records a packet in the "Packet received, small delta" state, an unsigned 1-byte receive delta is added to the receive delta list. An unsigned byte has the range [0, 255], and since the unit is 0.25 ms, the receive delta covers [0, 63.75] ms
  • If a packet chunk records a packet in the "Packet received, large or negative delta" state, a signed 2-byte receive delta is added to the receive delta list, covering [-8192.0, 8191.75] ms
  • If the interval exceeds even this larger range, a new TransportFeedback RTCP packet must be constructed. Because the reference time is 3 bytes long, it can cover very large gaps

In summary: for each received RTP packet, the arrival time is recorded in the receive delta list of the TransportFeedback RTCP packet as a time interval. If the interval from the previous packet is small, it takes 1 byte; otherwise it takes 2 bytes; and if it exceeds the largest range, a new RTCP packet is started.
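The 1-byte and 2-byte encodings can be decoded by walking the status list and the delta bytes together. This is an illustrative sketch with a hypothetical function name; statuses use the 2-bit codes listed earlier (0 = not received, 1 = small delta, 2 = large or negative delta), and a lost packet is reported as None.

```python
import struct

DELTA_UNIT_MS = 0.25  # one receive delta tick is 250 us

def decode_recv_deltas(statuses, delta_bytes):
    """Return the receive delta in milliseconds for each packet status."""
    deltas_ms, pos = [], 0
    for status in statuses:
        if status == 1:    # small delta: unsigned 1 byte
            deltas_ms.append(delta_bytes[pos] * DELTA_UNIT_MS)
            pos += 1
        elif status == 2:  # large or negative delta: signed 2 bytes
            (raw,) = struct.unpack("!h", delta_bytes[pos:pos + 2])
            deltas_ms.append(raw * DELTA_UNIT_MS)
            pos += 2
        else:              # not received: no delta is stored
            deltas_ms.append(None)
    return deltas_ms

deltas = decode_recv_deltas([1, 0, 2, 1], bytes([0x04, 0xFF, 0xFC, 0xFF]))
print(deltas)
```

The third packet has a negative delta, meaning it arrived out of order relative to the previously recorded packet.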

For packets in the "Packet received, small delta" state, the maximum receive delta is 63.75 ms, so the 1-byte encoding is usable from about 1000/63.75 ~= 16 packets per second. Since the receive delta is a multiple of 250 us, at most 4000 packets per second can be distinguished.

Packet chunks and receive deltas are designed to keep the RTCP packet as small as possible. Packet chunks use different encodings depending on the pattern of packet states, and the arrival time of each received RTP packet is recorded as a time interval rather than as an absolute value.

Origin blog.csdn.net/Doubao93/article/details/121622858