Introduction to RTP packets

RTP (Real-time Transport Protocol) is a protocol for real-time audio and video communication. It defines a standard packet format, the RTP data packet, in which audio and video data is encapsulated for transmission.

Format introduction

The following is the format of the RTP packet:

 0               1               2               3              
 0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7 
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|V=2|P|X|  CC   |M|     PT      |       sequence number         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                           timestamp                           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|           synchronization source (SSRC) identifier            |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|            contributing source (CSRC) identifiers             |
|                             ....                              |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|     RTP extension ID        |           length                |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                        RTP extension data                     |
|                             ....                              |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                           payload                             |
|                             ....                              |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

The above packet format consists of a 12-byte fixed header, followed by an optional CSRC list, an optional extension header, and the payload. The following is an explanation of each field:

  • V: 2 bits, protocol version number. Currently 2.

  • P: 1 bit, padding bit flag. Set to 1 when padding is present, 0 otherwise.

  • X: 1 bit, extension bit flag. Set to 1 when extension header is present, 0 otherwise.

  • CC: 4 bits, CSRC count. The number of CSRC (contributing source) identifiers that follow the fixed header.

  • M: 1 bit, marker bit. Its exact meaning depends on the payload format; for video it typically marks the last packet of a frame.

  • PT: 7 bits, payload type. Identifies the format of the data carried in the RTP packet, for example which audio or video codec is used.

  • sequence number: 16 bits. Increases by one for each RTP packet sent, so the receiver can detect packet loss and restore packet order.

  • timestamp: 32 bits, timestamp. Used to synchronize the playback time of audio and video data, and calculate delay and jitter.

  • SSRC: 32 bits, synchronization source identifier. Identifies the source of the RTP packet.

  • CSRC: 0 to 15 contributor identifiers, 32 bits each. They identify the contributors to the RTP packet, i.e. the original sources whose audio was combined by a mixer.

  • RTP extension ID: 16 bits, RTP extension header identifier. Optional.

  • length: 16 bits, length of the RTP extension data in 32-bit words, not counting the 4-byte ID and length fields themselves. Optional.

  • RTP extension data: RTP extension header data. Optional.

  • payload: RTP payload. It contains audio and video data.
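
The field layout above can be sketched as a small parser. This is only a minimal illustration under stated assumptions: the names `RtpPacket` and `parse_rtp` are hypothetical and not part of any library, and error handling is kept to simple length checks.

```python
import struct
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RtpPacket:  # hypothetical container, for illustration only
    version: int
    padding: bool
    extension: bool
    marker: bool
    payload_type: int
    sequence_number: int
    timestamp: int
    ssrc: int
    csrcs: List[int]
    extension_id: Optional[int]
    extension_data: bytes
    payload: bytes


def parse_rtp(data: bytes) -> RtpPacket:
    """Split a raw RTP packet into the fields described above."""
    if len(data) < 12:
        raise ValueError("shorter than the 12-byte fixed header")
    # Two flag bytes, 16-bit sequence number, 32-bit timestamp, 32-bit SSRC,
    # all in network byte order.
    b0, b1, seq, ts, ssrc = struct.unpack_from("!BBHII", data, 0)
    cc = b0 & 0x0F                      # CC: number of CSRC identifiers
    offset = 12
    if len(data) < offset + 4 * cc:
        raise ValueError("truncated CSRC list")
    csrcs = list(struct.unpack_from(f"!{cc}I", data, offset)) if cc else []
    offset += 4 * cc

    ext_id, ext_data = None, b""
    if b0 & 0x10:                       # X bit: extension header present
        if len(data) < offset + 4:
            raise ValueError("truncated extension header")
        ext_id, ext_words = struct.unpack_from("!HH", data, offset)
        offset += 4
        ext_data = data[offset:offset + 4 * ext_words]  # length is in 32-bit words
        offset += 4 * ext_words

    end = len(data)
    if b0 & 0x20:                       # P bit: last octet holds the padding size
        end -= data[-1]

    return RtpPacket(
        version=b0 >> 6,
        padding=bool(b0 & 0x20),
        extension=bool(b0 & 0x10),
        marker=bool(b1 & 0x80),
        payload_type=b1 & 0x7F,
        sequence_number=seq,
        timestamp=ts,
        ssrc=ssrc,
        csrcs=csrcs,
        extension_id=ext_id,
        extension_data=ext_data,
        payload=data[offset:end],
    )
```

Feeding it a datagram received from a UDP socket returns the header fields and the payload bytes, which can then be handed to the decoder selected by the payload type.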

Payload introduction

The payload contains the valid audio and video data that is actually to be transmitted.

The audio data in an RTP packet can use a variety of encoding formats, such as PCM, MP3, AAC, and G.711.

Video data in formats such as H.264, H.265, VP8, VP9, and AV1 can likewise be carried as the payload of RTP packets.

For example, MP3 is a compressed audio encoding format, while PCM is raw (uncompressed) digital audio; both can be placed in RTP packets for transmission. The format of the payload is indicated by the PT field in the RTP packet header.
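
To illustrate how the PT field identifies the payload format, here is a minimal lookup sketch. The table lists a few of the static payload type numbers assigned in the RTP audio/video profile (RFC 3551); newer codecs such as H.264 or VP8 have no fixed number and use the dynamic range 96 to 127, agreed on out of band (for example in SDP). The function name `describe_payload_type` and the `dynamic_map` parameter are assumptions made for this example.

```python
from typing import Dict, Optional

# A few static assignments from the RTP/AVP profile (not the full table).
STATIC_PAYLOAD_TYPES = {
    0: "PCMU (G.711 mu-law)",
    8: "PCMA (G.711 A-law)",
    9: "G722",
    14: "MPA (MPEG audio, e.g. MP3)",
    26: "JPEG",
    31: "H261",
    34: "H263",
}


def describe_payload_type(pt: int, dynamic_map: Optional[Dict[int, str]] = None) -> str:
    """Return a human-readable name for a 7-bit RTP payload type."""
    if not 0 <= pt <= 127:
        raise ValueError("payload type must fit in 7 bits")
    if pt in STATIC_PAYLOAD_TYPES:
        return STATIC_PAYLOAD_TYPES[pt]
    if 96 <= pt <= 127:
        # Dynamic types only have meaning once the session has negotiated them.
        name = (dynamic_map or {}).get(pt)
        return name if name else "dynamic (negotiated out of band)"
    return "unassigned or reserved"
```

For instance, a session that negotiated H.264 on payload type 96 would call `describe_payload_type(96, {96: "H264"})`.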

RTP and RTCP

RTP and RTCP are a pair of associated protocols that are used together during audio and video transmission.

The RTP protocol is used to transmit audio and video data in real time, while the RTCP protocol is used to transmit control information about the audio and video stream.

RTCP contains statistical data about audio and video streams, such as transmission rate, packet loss rate, jitter, etc., and also contains timing information used to synchronize multiple media streams.
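
As one illustration of the statistics mentioned above, interarrival jitter can be estimated as a running average of how much the transit time changes from one packet to the next. The sketch below follows the estimator described in RFC 3550 (each new difference contributes with a weight of 1/16); the function name `update_jitter` and its calling convention are assumptions for this example, and both the arrival time and the RTP timestamp are expected in the same RTP timestamp units.

```python
from typing import Optional, Tuple


def update_jitter(
    jitter: float,
    prev_transit: Optional[int],
    arrival: int,
    rtp_timestamp: int,
) -> Tuple[float, int]:
    """Update the running jitter estimate with one received packet.

    Returns the new jitter and the transit value to pass in for the next packet.
    """
    transit = arrival - rtp_timestamp      # relative transit time of this packet
    if prev_transit is None:
        return jitter, transit             # first packet: nothing to compare with
    d = abs(transit - prev_transit)        # change in transit time between packets
    jitter += (d - jitter) / 16.0          # smoothed average with gain 1/16
    return jitter, transit
```

The receiver calls this once per packet and reports the current value in its RTCP reception reports.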

RTCP packets follow conventions similar to RTP packets (the same version field and SSRC identifiers), but they are sent on a different port number.

Why are the RTP and RTCP port numbers different by 1?

Typically, RTP data streams are transmitted using even port numbers, while RTCP uses odd port numbers.

For example, if RTP uses UDP port number 5004, then RTCP uses UDP port number 5005.
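
The even/odd convention can be sketched as a small helper. This is a minimal illustration, not a required implementation: the name `rtp_rtcp_port_pair` is hypothetical, and rounding an odd number down to the next lower even port follows the recommendation in RFC 3550 for applications that derive the pair from a single port number.

```python
from typing import Tuple


def rtp_rtcp_port_pair(port: int) -> Tuple[int, int]:
    """Derive the (RTP, RTCP) port pair from a single configured port."""
    if not 2 <= port <= 65535:
        raise ValueError("port out of range")
    # RTP takes the even port; an odd input is rounded down to the even one below it.
    rtp_port = port if port % 2 == 0 else port - 1
    # RTCP takes the next higher (odd) port.
    return rtp_port, rtp_port + 1


print(rtp_rtcp_port_pair(5004))  # (5004, 5005)
```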

During a session, RTP and RTCP packets flow in parallel. While RTP packets carry the media data, RTCP periodically sends the corresponding control information so that the receiving end can understand the status of the current audio and video streams and adjust and synchronize them.

This design is intended to simplify the implementation and configuration of the protocol. Using adjacent port numbers makes it easier for network devices such as firewalls and routers to identify and forward RTP and RTCP traffic. It can also help the two flows receive the same treatment in the network, which supports the stability and quality of audio and video transmission.

RTCP packet format

The format of an RTCP packet is as follows, using the Sender Report (SR) as the example:

       0                   1                   2                   3
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |V=2|P|    RC   |   PT=SR=200   |             length            |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                         SSRC of sender                        |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |              NTP timestamp, most significant word             |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |             NTP timestamp, least significant word             |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                         RTP timestamp                         |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                     sender's packet count                     |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                      sender's octet count                     |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                           report block                        |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                           report block                        |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      .                               .                               .
      .                               .                               .
      .                               .                               .
      |                           report block                        |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

The version field carries the same value as in RTP packets, which is V=2 in current use. The meanings of each field in this format are as follows:

  • V (2 bits): Version number, used to specify the version of the RTCP packet, currently 2.

  • P (1 bit): Padding flag, if 1, additional padding bytes are included at the end of the RTCP packet.

  • RC (5 bits): Reception report count, indicating the number of report blocks contained in this RTCP packet.

  • PT (8 bits): Packet type, used to indicate the type of RTCP packet, such as 200 indicating SR (Sender Report).

  • length (16 bits): The length of the RTCP packet in 32-bit words minus one, counting the header and any padding.

  • SSRC of sender (32 bits): The synchronization source (SSRC) identifier of the sender, used to uniquely identify the sender sending RTCP packets.

  • NTP timestamp (64 bits): The sender's wall-clock time when this report was sent. It can be used to synchronize multiple media streams.

  • RTP timestamp (32 bits): The same instant as the NTP timestamp, expressed in the units of the timestamps in the sender's RTP packets.

  • sender's packet count (32 bits): Indicates the number of RTP packets sent by the sender since the beginning of this session.

  • sender's octet count (32 bits): Indicates the total number of payload bytes (excluding headers and padding) sent in RTP packets by the sender since the beginning of the session.

  • report block: Contains reception statistics for one source that this sender has received RTP packets from. The number of report blocks is given by RC.
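
The Sender Report fields above can be decoded in a few lines. The following is a minimal sketch that reads only the header and sender information and skips the report blocks; the function name `parse_sender_report` and the returned dictionary keys are assumptions for this example. The NTP timestamp counts seconds since 1 January 1900, so 2208988800 seconds are subtracted to obtain Unix time.

```python
import struct

NTP_UNIX_EPOCH_DELTA = 2208988800  # seconds between 1900-01-01 and 1970-01-01


def parse_sender_report(data: bytes) -> dict:
    """Decode the header and sender info of an RTCP Sender Report (PT=200)."""
    if len(data) < 28:
        raise ValueError("shorter than a Sender Report without report blocks")
    b0, packet_type, length = struct.unpack_from("!BBH", data, 0)
    if packet_type != 200:
        raise ValueError("not a Sender Report")
    ssrc, ntp_msw, ntp_lsw, rtp_ts, packet_count, octet_count = struct.unpack_from(
        "!IIIIII", data, 4
    )
    return {
        "version": b0 >> 6,
        "padding": bool(b0 & 0x20),
        "report_count": b0 & 0x1F,          # RC: number of report blocks
        "packet_bytes": (length + 1) * 4,   # length is in 32-bit words minus one
        "ssrc": ssrc,
        # Most significant word: whole seconds; least significant word: fraction.
        "unix_time": ntp_msw - NTP_UNIX_EPOCH_DELTA + ntp_lsw / 2**32,
        "rtp_timestamp": rtp_ts,
        "sender_packet_count": packet_count,
        "sender_octet_count": octet_count,
    }
```

Pairing `unix_time` with `rtp_timestamp` from the reports of two streams is what lets a receiver map their RTP timestamps onto a common clock and play audio and video in sync.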

Origin blog.csdn.net/csdn_zmf/article/details/129301091