RTP/RTCP protocol detailed explanation

RTP/RTCP application background

With the growth of multimedia network applications, two transport protocols for general-purpose, real-time interactive multimedia have emerged: the Real-time Transport Protocol (RTP) and the RTP Control Protocol (RTCP).

1. Characteristics of RTP protocol

RTP runs on top of the TCP/IP protocol suite
RTP usually runs in user space on top of UDP (or, less commonly, TCP). From a workflow perspective, because RTP runs in user space and is linked into the application, it looks like an application-layer protocol. On the other hand, it is a general-purpose protocol that is independent of any specific application: it encapsulates the multimedia data handed down by the application and then relies on UDP, IP, and the lower-layer protocols to carry that data.

When an application starts an RTP session, it uses two ports: one for RTP and one for RTCP. Unlike application-layer protocols that are assigned a well-known port number, an RTP session selects an unused even UDP port from the non-reserved range (1025 to 65535), and the RTCP stream of the same session uses the next higher odd port. For example, if RTP uses port 1210, RTCP uses port 1211. From the perspective of the TCP/IP protocol stack, RTP therefore sits below the application and above UDP, and can be viewed as a transport protocol dedicated to network applications with real-time requirements.
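The even/odd port convention can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names `rtcp_port_for` and `open_rtp_rtcp_pair`, the loopback host, and the retry count are all assumptions made for the example.

```python
import random
import socket


def rtcp_port_for(rtp_port: int) -> int:
    """Return the RTCP port paired with an even RTP port (RTP port + 1)."""
    if rtp_port % 2 != 0:
        raise ValueError("RTP port should be even")
    return rtp_port + 1


def open_rtp_rtcp_pair(host: str = "127.0.0.1", attempts: int = 50):
    """Bind an even/odd UDP port pair and return (rtp_socket, rtcp_socket).

    Hypothetical helper: it simply retries random even ports until both
    the even port and the following odd port can be bound.
    """
    for _ in range(attempts):
        # Stepping by 2 from an even start yields only even candidates.
        rtp_port = random.randrange(1026, 65534, 2)
        rtp_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        rtcp_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        try:
            rtp_sock.bind((host, rtp_port))
            rtcp_sock.bind((host, rtcp_port_for(rtp_port)))
            return rtp_sock, rtcp_sock
        except OSError:
            # One of the two ports is already in use; release both and retry.
            rtp_sock.close()
            rtcp_sock.close()
    raise RuntimeError("no free even/odd UDP port pair found")
```

With the example from the text, `rtcp_port_for(1210)` gives 1211. Real applications usually learn the ports through a signalling protocol rather than choosing them blindly, so treat the random search here as a simplification.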

RTP protocol provides end-to-end transmission services
Multimedia data consists of audio, video, text, and possibly other data streams. These streams are handed to the RTP library, which takes the compressed and encoded audio, video, and text data and packs it into RTP packets (this article also calls them RTP messages; RFC 3550 uses the term "RTP packet"). The packets are then passed through a socket to UDP, where each one is encapsulated in a UDP datagram. The destination host extracts the multimedia data from the received RTP packets and passes it up to the application layer, where the player is responsible for playback.
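The sender-side encapsulation step can be sketched as follows. The sketch assumes the 12-byte fixed header defined in RFC 3550 (described in section 2 of this article), no CSRC list, no padding, and no extension. The function name `build_rtp_packet` and the default payload type 96 (a dynamic payload type) are illustrative choices, not part of the original text.

```python
import struct

RTP_VERSION = 2


def build_rtp_packet(payload: bytes, seq: int, timestamp: int, ssrc: int,
                     payload_type: int = 96, marker: bool = False) -> bytes:
    """Prefix an already-encoded media payload with a minimal RTP header."""
    if not 0 <= payload_type <= 127:
        raise ValueError("payload type must fit in 7 bits")
    byte0 = RTP_VERSION << 6                      # V=2, P=0, X=0, CC=0
    byte1 = (int(marker) << 7) | payload_type     # M bit followed by PT
    header = struct.pack(
        "!BBHII",                                 # network byte order
        byte0,
        byte1,
        seq & 0xFFFF,                             # 16-bit sequence number
        timestamp & 0xFFFFFFFF,                   # 32-bit RTP timestamp
        ssrc & 0xFFFFFFFF,                        # 32-bit SSRC
    )
    return header + payload
```

The resulting bytes can be handed to a UDP socket, for example with `sock.sendto(packet, (host, port))`, at which point UDP and IP take over the delivery.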

UDP datagrams are carried in ordinary IP packets, and the routers along the path give them no special treatment. RTP does not require Resource Reservation Protocol (RSVP) support. RTP provides end-to-end transmission services for real-time applications and does not, by itself, provide any QoS guarantees.

2. Structure of RTP protocol

The RTP header consists of a 12-byte fixed header followed by an optional list of contributing source (CSRC) identifiers. The 12-byte fixed header contains the following fields.

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 | V |P|X|  CC   |M|    PT       |               SN              |
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |                        Time stamp                             |
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |                           SSRC                                |
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |                           CSRC                                |
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+


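+ Version (V)  
The version field is 2 bits long. The version currently in use is 2.

+ Padding (P)  
The padding (P) field is 1 bit long. In some cases, for example when the application data is encrypted, every data block must have a definite length that is an integer multiple of 4 bytes. When padding bytes are present, P=1, and the last byte of the data part gives the number of padding bytes (including itself).

+ Extension (X)  
The extension (X) field is 1 bit long. X=1 means that an extension header follows the RTP header. In practice, RTP rarely uses extension headers.

+ CSRC count (CC)  
The CSRC count (CC) field is 4 bits long and gives the number of CSRC identifiers that follow the fixed header. At its maximum value, it allows at most 15 contributing sources.

+ Marker (M)  
The marker (M) field is 1 bit long. M=1 indicates that the RTP packet has a special meaning. For example, an application can use this bit to mark the start of each frame of a video stream, or to mark the end of the video stream.

+ Payload type (PT)  
The payload type (PT) field is 7 bits long. It identifies the format of the payload carried in the packet.

+ Sequence number (SN)  
The sequence number field is 16 bits long and numbers the RTP packets. Within an RTP session, the sequence number of the first RTP packet is chosen at random, and it increases by 1 for each subsequent packet. The receiver can use the sequence number to detect lost or out-of-order RTP packets.

+ Timestamp  
The timestamp field is 32 bits long and indicates the timing relationship between RTP packets. At the start of a session, the initial timestamp of the first RTP packet is chosen at random. RTP does not specify the granularity of the timestamp; it depends on the payload type.

+ Synchronization source identifier (SSRC)  
The synchronization source identifier (SSRC) field is 32 bits long and identifies the source of an RTP stream. If a session has only one source, the SSRC value identifies that source.

+ Contributing source identifier (CSRC)  
Each contributing source identifier (CSRC) is 32 bits long and identifies one source that contributed to the packet. Because the CC field is 4 bits long, there can be at most 15 contributing sources.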
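The field layout above can be decoded with Python's `struct` module. The following is a minimal sketch, assuming the RFC 3550 bit layout (2-bit V, 1-bit P, 1-bit X, 4-bit CC, 1-bit M, 7-bit PT). The `RtpHeader` type and `parse_rtp_header` function are names invented for this example, and header extensions are not decoded.

```python
import struct
from typing import NamedTuple, Tuple


class RtpHeader(NamedTuple):
    version: int
    padding: bool
    extension: bool
    csrc_count: int
    marker: bool
    payload_type: int
    sequence_number: int
    timestamp: int
    ssrc: int
    csrcs: Tuple[int, ...]


def parse_rtp_header(packet: bytes) -> RtpHeader:
    """Decode the 12-byte fixed header and the optional CSRC list."""
    if len(packet) < 12:
        raise ValueError("packet shorter than the 12-byte fixed header")
    byte0, byte1, seq, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    csrc_count = byte0 & 0x0F                 # low 4 bits of the first byte
    end = 12 + 4 * csrc_count                 # each CSRC is 32 bits
    if len(packet) < end:
        raise ValueError("packet too short for the announced CSRC list")
    csrcs = struct.unpack("!%dI" % csrc_count, packet[12:end])
    return RtpHeader(
        version=byte0 >> 6,                   # top 2 bits
        padding=bool(byte0 & 0x20),
        extension=bool(byte0 & 0x10),
        csrc_count=csrc_count,
        marker=bool(byte1 & 0x80),            # top bit of the second byte
        payload_type=byte1 & 0x7F,            # remaining 7 bits
        sequence_number=seq,
        timestamp=timestamp,
        ssrc=ssrc,
        csrcs=csrcs,
    )
```

A receiver would typically check that `version` is 2 before trusting the remaining fields, and then compare consecutive `sequence_number` values to detect loss or reordering.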

3. The relationship between RTCP protocol and RTP protocol

A source uses RTCP packets to let receivers synchronize the different media streams of a session. For example, in a video conferencing application, each source generates two independent media streams, one carrying video and one carrying audio. The timestamps in the RTP packet headers of these streams are tied to the video and audio sampling clocks respectively, so they cannot be compared directly. Because the RTCP packets sent by the source contain both an RTP timestamp of the associated stream and the corresponding real (wall-clock) time, the receiver can use this association to synchronize the playback of video and audio.
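The timestamp association can be illustrated with a small calculation. This is a sketch under stated assumptions: the function name `rtp_to_wallclock` is invented here, the wall-clock time is given as plain seconds instead of the 64-bit NTP format, and the clock rates (8000 Hz for audio, 90000 Hz for video) are common values chosen for the example.

```python
def rtp_to_wallclock(rtp_ts: int, sr_rtp_ts: int, sr_wallclock: float,
                     clock_rate: int) -> float:
    """Map an RTP timestamp to wall-clock seconds using one RTCP reference.

    sr_rtp_ts and sr_wallclock are the RTP timestamp and the wall-clock
    time that the sender reported together for the same instant.
    """
    # RTP timestamps are 32-bit and wrap around, so take a signed difference.
    diff = (rtp_ts - sr_rtp_ts) & 0xFFFFFFFF
    if diff >= 0x80000000:
        diff -= 0x100000000
    return sr_wallclock + diff / clock_rate


# Audio and video use different clocks, but both map to the same time line.
audio_time = rtp_to_wallclock(24000, 16000, 100.0, 8000)
video_time = rtp_to_wallclock(945000, 900000, 100.5, 90000)
```

In this example both packets map to the same wall-clock instant, so the audio and video samples they carry should be played together.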

In fact, RFC 3550 defines two parts: the protocol that carries the multimedia data streams (RTP) and its companion control protocol (RTCP).

RTP and RTCP work together. Both can be used in the same multimedia application, and both are encapsulated in UDP datagrams for transmission.

Audio and video data streams are carried in the payload of RTP packets, while RTCP packets carry no audio or video data at all.

4. RTCP messages

The RTCP header contains an 8-bit packet type field, and different values of this field identify different types of RTCP packets. For example, a packet type value of 200 identifies a Sender Report (SR), whose format is shown below.

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 | V |P|    RC   |   PT=SR=200   |              length           |  header
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |                        SSRC of sender                         |
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |                  NTP Time stamp(most significant word)        | sender info
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |                  NTP Time stamp(least significant word)       |
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |                        RTP timestamp                          |
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |                  sender's packet count                        |
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |                   sender's octet count                        |
 +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
 |                   SSRC_1(SSRC of first source)                | report block 1
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 | fraction lost |    cumulative number of packets lost          |
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |           extended highest sequence number received           |
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |                      interarrival jitter                      |
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |                         last SR (LSR)                         |
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |                  delay since last SR (DLSR)                   |
 +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
 |                SSRC_2 (SSRC of second source)                 | report block 2
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |                 ................                              |
 +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
 |                  profile-specific extensions                  |
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 
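+ Version (V)
  Same as the field in the RTP header.
+ Padding (P)
  Same as the field in the RTP header.
+ Reception report count (RC)
  The number of reception report blocks in this SR packet. It may be 0.
+ Packet type (PT)
  200 for an SR packet.
+ Length
  The total length of this SR packet in 32-bit words, minus one.
+ Synchronization source (SSRC)
  The synchronization source identifier of the sender of this SR packet. It is the same as the SSRC in the corresponding RTP packets.
+ NTP timestamp (Network Time Protocol)
  The absolute time at which this SR packet was sent. The NTP timestamp is used to synchronize different RTP media streams.
+ RTP timestamp
  Corresponds to the NTP timestamp, but uses the same units and the same random initial value as the RTP timestamps in the RTP data packets.
+ Sender's packet count
  The total number of RTP data packets the sender has sent from the start of transmission until this SR packet was generated. The field is reset to zero when the SSRC changes.
+ Sender's octet count
  The total number of payload bytes (excluding headers and padding) the sender has sent from the start of transmission until this SR packet was generated. The field is reset to zero when the sender changes its SSRC.
+ SSRC_n (SSRC identifier of source n)
  The report block contains statistics about the packets received from this source.
+ Fraction lost
  The fraction of RTP data packets from synchronization source n (SSRC_n) lost since the previous SR or RR packet was sent.
+ Cumulative number of packets lost
  The total number of RTP data packets from SSRC_n lost between the start of reception and the sending of this SR.
+ Extended highest sequence number received
  The highest sequence number among the RTP data packets received from SSRC_n. The low 16 bits hold that sequence number, and the high 16 bits count how many times the sequence number has wrapped around.
+ Interarrival jitter
  An estimate of the statistical variance of the RTP data packet interarrival time.
+ Last SR timestamp (LSR)
  The middle 32 bits of the NTP timestamp in the most recent SR packet received from SSRC_n. If no SR packet has been received yet, the field is zero.
+ Delay since last SR (DLSR)
  The delay between receiving the last SR packet from SSRC_n and sending this report, expressed in units of 1/65536 seconds.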
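The SR layout can be decoded along the following lines. This is a sketch that assumes the RFC 3550 Sender Report format with 24-byte report blocks (including the LSR and DLSR words). The function name `parse_sender_report` and the dictionary keys are invented for the example, and profile-specific extensions are ignored.

```python
import struct


def parse_sender_report(data: bytes) -> dict:
    """Decode one RTCP Sender Report (PT=200) into a plain dictionary."""
    if len(data) < 28:
        raise ValueError("too short for an SR packet")
    byte0, packet_type, length_words = struct.unpack("!BBH", data[:4])
    if packet_type != 200:
        raise ValueError("not a Sender Report")
    report_count = byte0 & 0x1F               # RC: low 5 bits of first byte
    total_bytes = (length_words + 1) * 4      # length is in 32-bit words minus one
    if len(data) < total_bytes or total_bytes < 28 + 24 * report_count:
        raise ValueError("inconsistent SR length")
    ssrc, ntp_msw, ntp_lsw, rtp_ts, packets, octets = struct.unpack(
        "!IIIIII", data[4:28])
    reports = []
    for i in range(report_count):
        offset = 28 + 24 * i
        source, lost_word, ext_seq, jitter, lsr, dlsr = struct.unpack(
            "!IIIIII", data[offset:offset + 24])
        cumulative = lost_word & 0xFFFFFF     # 24-bit signed packet loss count
        if cumulative & 0x800000:
            cumulative -= 0x1000000
        reports.append({
            "ssrc": source,
            "fraction_lost": lost_word >> 24,  # 8-bit fixed point, x/256
            "cumulative_lost": cumulative,
            "extended_highest_seq": ext_seq,
            "jitter": jitter,
            "lsr": lsr,
            "dlsr_seconds": dlsr / 65536.0,
        })
    return {
        "version": byte0 >> 6,
        "padding": bool(byte0 & 0x20),
        "ssrc": ssrc,
        "ntp_seconds": ntp_msw + ntp_lsw / 2 ** 32,  # seconds since 1900
        "rtp_timestamp": rtp_ts,
        "sender_packet_count": packets,
        "sender_octet_count": octets,
        "reports": reports,
    }
```

In practice RTCP packets are sent as compound packets, so a real parser would walk the datagram using each packet's length field and dispatch on the packet type; this sketch handles a single SR only.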

5. RTCP message types

  • Sender Report (SR)
    A session between a sender and its receivers can contain several RTP streams. For each RTP stream it is sending, the sender periodically sends SR packets. Absolute time is very important for multimedia transmission: transmitting a video program actually means transmitting an audio stream and an image stream at the same time, and at playback the two can be synchronized using the RTP timestamps together with the absolute time carried in the SR.

  • Receiver Report (RR)
    For each RTP stream it receives, the receiver periodically sends RR packets. Through them the receiver feeds QoS-related data back to the sender. From this feedback the sender can learn the current delay, jitter, and packet loss rate of the network and adjust its transmission rate accordingly. If network conditions are good, the sender can dynamically change the encoding algorithm to improve playback quality.

  • Source Description (SDES)
    The sender periodically sends SDES packets by multicast, giving the canonical name (CNAME) of the session participant. The canonical name is a string that identifies the participant, typically in a form resembling an email address (user@host).

  • Goodbye (BYE)
    A BYE packet is used to close a data stream. In video conferencing applications, a sender announces that it is leaving the conference by sending a BYE packet.

  • Application-defined (APP)
    APP packets let an application define new RTCP packet types for its own use.
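The loss and jitter figures that a Receiver Report feeds back can be estimated roughly as follows. This is a simplified sketch: the class name `ReceiverStats` is invented, the jitter update uses the running estimate J = J + (|D| - J)/16 from RFC 3550, sequence number wraparound is ignored, and the loss fraction is computed over the whole stream rather than per reporting interval as a real RR would do.

```python
class ReceiverStats:
    """Track loss and interarrival jitter for one received RTP stream."""

    def __init__(self, clock_rate: int):
        self.clock_rate = clock_rate      # RTP timestamp units per second
        self.jitter = 0.0                 # in RTP timestamp units
        self.received = 0
        self.base_seq = None
        self.max_seq = None
        self._prev_transit = None

    def on_packet(self, seq: int, rtp_ts: int, arrival_seconds: float) -> None:
        # Transit time: arrival time (in RTP units) minus the RTP timestamp.
        transit = arrival_seconds * self.clock_rate - rtp_ts
        if self._prev_transit is not None:
            d = abs(transit - self._prev_transit)
            self.jitter += (d - self.jitter) / 16.0
        self._prev_transit = transit
        if self.base_seq is None:
            self.base_seq = self.max_seq = seq
        elif seq > self.max_seq:          # assumption: no 16-bit wraparound
            self.max_seq = seq
        self.received += 1

    def loss_summary(self):
        """Return (cumulative_lost, fraction_lost as an 8-bit x/256 value)."""
        expected = self.max_seq - self.base_seq + 1
        lost = max(expected - self.received, 0)
        return lost, (lost << 8) // expected


stats = ReceiverStats(clock_rate=8000)
stats.on_packet(1, 0, 0.000)
stats.on_packet(2, 160, 0.020)
stats.on_packet(4, 480, 0.065)            # packet 3 never arrives
lost, fraction = stats.loss_summary()
```

Here one of four expected packets is missing, so `fraction` comes out as 64 (64/256 = 25%), which is the form in which the fraction lost field is carried in a report block.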

    Under normal circumstances, RTCP packets should take no more than 5% of the session bandwidth. If a sender is sending a video stream at 2 Mbps, the RTCP traffic for this session should stay below 100 kbps. In a typical implementation, 75% of this bandwidth (75 kbps) is allocated to the receivers, and the remaining 25% (25 kbps) is reserved for the sender. If there are n receivers in a multicast session, each receiver should limit its RTCP traffic to 75/n kbps.
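The bandwidth arithmetic above can be written out as a short calculation. The function name `rtcp_bandwidth_shares` and its parameters are illustrative. The 5% and 75%/25% figures are the ones given in the text.

```python
def rtcp_bandwidth_shares(session_kbps: float, receivers: int,
                          rtcp_fraction: float = 0.05,
                          receiver_share: float = 0.75) -> dict:
    """Split the RTCP budget of a session between the sender and n receivers."""
    if receivers < 1:
        raise ValueError("need at least one receiver")
    rtcp_kbps = session_kbps * rtcp_fraction          # total RTCP budget
    receivers_kbps = rtcp_kbps * receiver_share       # shared by all receivers
    return {
        "rtcp_kbps": rtcp_kbps,
        "sender_kbps": rtcp_kbps - receivers_kbps,
        "receivers_kbps": receivers_kbps,
        "per_receiver_kbps": receivers_kbps / receivers,
    }


# The 2 Mbps example from the text, with 10 receivers.
shares = rtcp_bandwidth_shares(2000, receivers=10)
```

For the 2 Mbps stream this gives an RTCP budget of about 100 kbps, of which about 75 kbps is shared by the receivers (7.5 kbps each with 10 receivers) and about 25 kbps is left for the sender. As more receivers join, each one has to send its reports less often to stay within its share.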

Origin blog.csdn.net/EBDSoftware/article/details/127617034