RTMP protocol analysis (1) - basic understanding

1. Introduction

Many video apps use RTMP, a protocol in wide international use. Although I have built a live-streaming app myself, I never had time to study the fundamentals of this protocol in depth. This article starts to uncover how RTMP works, moving step by step from the application layer down to the underlying principles.

2. Definition

The following content is partly from Baidu Encyclopedia

RTMP is an acronym for Real Time Messaging Protocol. The protocol is based on TCP and is a protocol family that includes the basic RTMP protocol and variants such as RTMPT, RTMPS, and RTMPE. RTMP is a network protocol designed for real-time data communication. It is mainly used for audio, video, and data communication between the Flash/AIR platform and streaming media or interactive servers that support RTMP. Software supporting this protocol includes Adobe Media Server, Ultrant Media Server, and red5.

RTMP is also the abbreviation of Routing Table Maintenance Protocol, an unrelated protocol that shares the acronym. In the AppleTalk protocol suite, the Routing Table Maintenance Protocol is a transport layer protocol that establishes and maintains routing tables in AppleTalk routers. It is based on the Routing Information Protocol (RIP) and, like RIP, uses hop count as its routing metric. The hop count is the number of routers or other intermediate nodes that a data packet must pass through on its way from the source network to the destination network.

Let's take a look at the two schematic diagrams to understand.

3. Protocol overview

RTMP (Real Time Messaging Protocol) is an open protocol developed by Adobe Systems for audio, video and data transmission between Flash players and servers. It has many variants:

  • RTMP works on top of TCP and uses port 1935 by default;

  • RTMPE adds encryption on top of RTMP;

  • RTMPT encapsulates RTMP in HTTP requests so that it can pass through firewalls;

  • RTMPS is similar to RTMPT but adds the security of TLS/SSL.

4. Protocol details

The RTMP protocol (Real Time Messaging Protocol) is used by Flash to transmit objects, video, and audio. It is built on top of TCP or polled HTTP.

The RTMP protocol is like a container used to hold data packets. These data can be either AMF format data or video/audio data in FLV.

A single connection can carry multiple network streams through different channels, and the data in these channels is transmitted in fixed-size packets.

Network connection (Connection)

A simple piece of ActionScript code that connects and plays a stream:

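// ActionScript 2 style: attachVideo() is a method of the AS2 Video class
var videoInstance:Video = your_video_instance;
var nc:NetConnection = new NetConnection();
// Note the double slash after the scheme: rtmp://host/app
var connected:Boolean = nc.connect("rtmp://localhost/myapp");
var ns:NetStream = new NetStream(nc);
videoInstance.attachVideo(ns);  // render the stream in the Video object
ns.play("flvName");             // name of the stream to play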

The default port is 1935
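To make the address format concrete, here is a minimal Python sketch that splits an rtmp:// URL into host, port, and application name and falls back to port 1935 when none is given. The function name parse_rtmp_url is hypothetical and only serves this illustration; it is not part of any RTMP library.

```python
from urllib.parse import urlparse

DEFAULT_RTMP_PORT = 1935  # default port stated in the text


def parse_rtmp_url(url):
    """Split an rtmp:// URL into (host, port, app).

    Hypothetical helper for illustration only.
    """
    parts = urlparse(url)
    if parts.scheme != "rtmp":
        raise ValueError("not an rtmp:// URL: " + url)
    # Use the explicit port if present, otherwise the RTMP default.
    port = parts.port or DEFAULT_RTMP_PORT
    # The first path component is the application name.
    app = parts.path.lstrip("/")
    return parts.hostname, port, app


if __name__ == "__main__":
    print(parse_rtmp_url("rtmp://localhost/myapp"))
```

With a single slash after the scheme, as in rtmp:/localhost/myapp, the host part is empty and "localhost" ends up in the path, which is why the double slash matters.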

5. Handshake request and response

5.1 Handshake process

Client → Server: The client sends a handshake request to the server. This is not part of the RTMP packet stream. The first byte of the handshake request is 0x03, followed by 1536 bytes. Although the content of these bytes does not seem crucial to the RTMP protocol, it should not be treated arbitrarily.

Server → Client: The server responds to the handshake request. This data is still not part of the RTMP packet stream. The first byte of the response is again 0x03, but it is followed by two blocks of 1536 bytes each (3072 bytes in total). The first 1536-byte block can apparently be anything, even all null bytes. The second 1536-byte block is the content of the handshake request that the client sent in the previous step.

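Client → Server: The client sends back one 1536-byte block. The RTMP specification describes this final block (C2) as an echo of the server's own block (S1), that is, the first of the two 1536-byte blocks in the server's response, not the second one, which merely repeats the client's own data.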

At this point the handshake between the client and the server is complete, and RTMP packets are exchanged from here on.
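The three handshake steps can be sketched in Python on plain byte strings, without any networking. This is only an illustration under stated assumptions: the function names are hypothetical, the 1536-byte blocks are filled with random bytes (the real blocks begin with time fields that are ignored here), and the final client block is taken to be an echo of the server's own block, as the RTMP specification describes for C2.

```python
import os

RTMP_VERSION = 0x03      # first byte of request and response
HANDSHAKE_SIZE = 1536    # size of each handshake block


def make_client_hello():
    """Step 1: version byte followed by 1536 bytes (random in this sketch)."""
    return bytes([RTMP_VERSION]) + os.urandom(HANDSHAKE_SIZE)


def make_server_reply(client_hello):
    """Step 2: version byte + server block + echo of the client block."""
    if len(client_hello) != 1 + HANDSHAKE_SIZE or client_hello[0] != RTMP_VERSION:
        raise ValueError("malformed handshake request")
    client_block = client_hello[1:]
    server_block = os.urandom(HANDSHAKE_SIZE)
    return bytes([RTMP_VERSION]) + server_block + client_block


def make_client_ack(client_hello, server_reply):
    """Step 3: validate the reply and return the block the client sends back.

    Assumption: the client echoes the server's own block (the first one).
    """
    if len(server_reply) != 1 + 2 * HANDSHAKE_SIZE or server_reply[0] != RTMP_VERSION:
        raise ValueError("malformed handshake response")
    server_block = server_reply[1:1 + HANDSHAKE_SIZE]
    echoed_client_block = server_reply[1 + HANDSHAKE_SIZE:]
    if echoed_client_block != client_hello[1:]:
        raise ValueError("server did not echo the client block")
    return server_block


if __name__ == "__main__":
    hello = make_client_hello()
    reply = make_server_reply(hello)
    ack = make_client_ack(hello, reply)
    print(len(hello), len(reply), len(ack))
```

Keeping the steps as pure functions makes the sizes (1 + 1536, 1 + 3072, 1536) easy to check before wiring them to a socket.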

Client → Server: Send a connection packet to the server.

Server → Client: The server responds.

... and so on ...

5.2 RTMP data type

Type          Name            Description
0x01          Chunk Size      changes the chunk size for packets
0x02          Unknown         not identified in this list (section 5.4 describes type 2 as Abort Message)
0x03          Bytes Read      sent every x bytes read by both sides
0x04          Ping            a stream control message, has subtypes
0x05          Server BW       the server's downstream bandwidth
0x06          Client BW       the client's upstream bandwidth
0x07          Unknown         not identified in this list
0x08          Audio Data      packet containing audio
0x09          Video Data      packet containing video data
0x0A - 0x11   Unknown         not identified in this list (section 5.5 describes 0x0F, 0x10, and 0x11 as the AMF3 data, shared object, and command messages)
0x12          Notify          an invoke that does not expect a reply
0x13          Shared Object   has subtypes
0x14          Invoke          like a remoting call, also used for stream actions

Shared Object data type

Type   Name
0x01   Connect
0x02   Disconnect
0x03   Set Attribute
0x04   Update Data
0x05   Update Attribute
0x06   Send Message
0x07   Status
0x08   Clear Data
0x09   Delete Data
0x0A   Delete Attribute
0x0B   Initial Data

5.3 RTMP packet structure

An RTMP packet consists of a header and a body of at most 128 bytes by default. The header has one of four lengths: 12, 8, 4, or 1 byte(s). The top two bits of the first byte are very important because they determine the header length; they can be extracted by ANDing the byte with the mask 0xC0. The possible header lengths are listed below as bit pattern followed by header length.

00 12 bytes
01 8 bytes
10 4 bytes
11 1 byte
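The lookup in this table can be written as a few lines of Python. This is a minimal sketch; the function name header_length is hypothetical.

```python
# Header length in bytes, keyed by the top two bits of the first byte.
HEADER_LENGTHS = {0b00: 12, 0b01: 8, 0b10: 4, 0b11: 1}


def header_length(first_byte):
    """Return the packet header length selected by the first header byte."""
    # Mask with 0xC0 to keep the top two bits, then shift them down.
    bits = (first_byte & 0xC0) >> 6
    return HEADER_LENGTHS[bits]


if __name__ == "__main__":
    for byte in (0x03, 0x43, 0x83, 0xC3):
        print(hex(byte), header_length(byte))
```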

In fact, the payloads of RTMP command packets are encoded in the AMF format.

The following is a process of sending a stream from the client to the server:

  • Client → Server : Send a request to create a stream

  • Server → Client : returns an index number representing the stream

  • Client → Server : start sending

  • Client → Server: Send video and audio data packets (these packets travel in the same channel and are uniquely identified by the index number of the stream)

5.4 RTMP Chunk Stream - RTMP block stream

A Chunk Stream is a logical abstraction of the stream that carries RTMP Chunks. All RTMP information exchanged between the client and the server travels on this stream, so operations on it are the focus of our study of the RTMP protocol.

Message

Message refers to a message that meets the protocol format and can be divided into Chunks and sent. The fields contained in the message are as follows.

  • Timestamp: The timestamp of the message (not necessarily the current time, as explained later), 4 bytes.

  • Length: The length of the Message Payload, that is, of the audio, video, or other data carried by the message, 3 bytes.

  • TypeId (type ID): The type ID of the message, 1 byte.

  • Message Stream ID: The identifier of the message stream. When a Message is divided into Chunks and when Chunks are restored into a Message, this ID is used to tell whether Chunks belong to the same Message. It occupies 4 bytes and is stored in little-endian format.

Chunking (Message chunking)

RTMP does not send and receive data in units of Messages. Instead, it splits each Message into Chunks, and the next Chunk can be sent only after the previous Chunk has been sent completely. The Message Stream ID carried with each Chunk indicates which Message it belongs to, and the receiving end uses this ID to reassemble the Chunks into a Message.

Why does RTMP split Messages into Chunks? Splitting turns a Message with a large amount of data into smaller pieces, which prevents a low-priority message from occupying the connection and blocking high-priority data. For example, a video transmission includes video frames, audio frames, and RTMP control information. If audio data or control data were sent continuously, video frames could be blocked, which causes the stuttering that is so annoying when watching video. At the same time, for messages with a small amount of data, the fields of the Chunk Header allow the header information to be compressed, which reduces the amount of data transmitted.

The default Chunk size is 128 bytes. During transmission, the maximum Chunk payload size can be changed with a control message called Set Chunk Size. The sending end and the receiving end each maintain their own Chunk Size, and each side sets it independently to change the maximum size of the Chunks it sends.

A larger Chunk reduces the per-chunk processing time and thus CPU usage, but each Chunk takes longer to send, especially on a low-bandwidth network, and may block more important information queued behind it. A smaller Chunk reduces this blocking, but it adds more overhead (the Header in every Chunk), and sending many small pieces may also interrupt transmission so that high bandwidth cannot be fully used. Small Chunks are therefore not suitable for high-bit-rate streams.

In practice, try different Chunk Sizes for the data to be sent, find a suitable size through packet capture analysis and similar means, and adjust the Chunk size dynamically during transmission according to the current bandwidth and the size of the actual messages, so as to use the CPU efficiently and reduce the probability of blocking.
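The splitting and the header overhead trade-off described above can be sketched in Python. The function names are hypothetical, and the overhead estimate assumes the simplest case: the first chunk carries a 12-byte header and every following chunk of the same message carries a 1-byte header.

```python
DEFAULT_CHUNK_SIZE = 128  # default maximum chunk payload size, in bytes


def split_payload(payload, chunk_size=DEFAULT_CHUNK_SIZE):
    """Split a message payload into pieces of at most chunk_size bytes."""
    if chunk_size < 1:
        raise ValueError("chunk size must be at least 1 byte")
    return [payload[i:i + chunk_size] for i in range(0, len(payload), chunk_size)]


def join_chunks(chunks):
    """Reassemble chunk payloads into the original message payload."""
    return b"".join(chunks)


def header_overhead(payload_length, chunk_size=DEFAULT_CHUNK_SIZE):
    """Estimate header bytes: 12 for the first chunk, 1 for each later chunk."""
    chunk_count = max(1, -(-payload_length // chunk_size))  # ceiling division
    return 12 + (chunk_count - 1)


if __name__ == "__main__":
    payload = bytes(300)
    print([len(c) for c in split_payload(payload)])
    print(header_overhead(300), header_overhead(300, 4096))
```

A 300-byte payload becomes three chunks at the default size, while a 4096-byte chunk size sends it in one piece with less header overhead but a longer uninterrupted write.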

Chunk Format - Chunk format

Let's look at the parts that make up a chunk.

  • Basic Header: It is the basic header information.

It contains the chunk stream ID and the chunk type. The chunk stream ID is usually abbreviated as CSID and uniquely identifies a specific stream channel. The chunk type determines the format of the Message Header that follows. The Basic Header may be 1, 2, or 3 bytes long, while the chunk type has a fixed length of 2 bits (note that the unit is bits). The length of the Basic Header depends on the size of the CSID. As long as both fields can be represented, it is best to use as few bytes as possible to reduce the overhead introduced by the Header.

The RTMP protocol supports user-defined CSIDs in the range [3, 65599]; the values 0, 1, and 2 are reserved by the protocol to represent special information.

A value of 0 means that the Basic Header occupies 2 bytes in total and the CSID is in [64, 319]; 1 means that it occupies 3 bytes and the CSID is in [64, 65599]; 2 means that the chunk carries control information and some commands, which will be described in detail later.

The chunk type is fixed at 2 bits, which leaves 6 (8-2), 14 (16-2), or 22 (24-2) bits in a Basic Header of 1, 2, or 3 bytes. When the Basic Header is 1 byte, the CSID occupies those 6 bits, and 6 bits can represent at most 64 values. In this case the CSID is in [0, 63], and the user-defined range is [3, 63]. In the 2-byte and 3-byte forms, the 6 bits of the first byte hold the marker value 0 or 1, and the CSID is stored as (CSID - 64) in the following 8 or 16 bits, which gives the ranges [64, 319] and [64, 65599].

Let's take a look at the byte diagram of the different bytes of the Basic Header.

When the Basic Header is 1 byte

When the Basic Header is 2 or 3 bytes

Note that the CSID bytes of the Basic Header are stored in little-endian order, so later bytes are more significant. The CSIDs that can be represented by the 2-byte and 3-byte Basic Headers overlap in [64, 319]; in an actual implementation, follow the principle of using the fewest bytes and represent CSIDs in [64, 319] with the 2-byte form.
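The three Basic Header forms can be sketched as a pair of Python functions. The names encode_basic_header and decode_basic_header are hypothetical; the encoder always picks the shortest form, and the decoder returns the chunk type (fmt), the CSID, and the number of header bytes consumed.

```python
def encode_basic_header(fmt, csid):
    """Encode chunk type (2 bits) and CSID using the shortest Basic Header."""
    if not 0 <= fmt <= 3:
        raise ValueError("chunk type must fit in 2 bits")
    if 2 <= csid <= 63:
        # 1-byte form: the CSID sits in the low 6 bits.
        return bytes([(fmt << 6) | csid])
    if 64 <= csid <= 319:
        # 2-byte form: marker 0, then (CSID - 64) in one byte.
        return bytes([fmt << 6, csid - 64])
    if 320 <= csid <= 65599:
        # 3-byte form: marker 1, then (CSID - 64) in two little-endian bytes.
        value = csid - 64
        return bytes([(fmt << 6) | 1, value & 0xFF, value >> 8])
    raise ValueError("CSID out of range")


def decode_basic_header(data):
    """Return (fmt, csid, header_length) for the Basic Header at data[0:]."""
    fmt = data[0] >> 6
    low = data[0] & 0x3F
    if low == 0:
        return fmt, data[1] + 64, 2
    if low == 1:
        return fmt, data[2] * 256 + data[1] + 64, 3
    return fmt, low, 1


if __name__ == "__main__":
    for csid in (3, 64, 319, 320, 65599):
        header = encode_basic_header(0, csid)
        print(csid, header.hex(), decode_basic_header(header))
```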

  • Message Header

It describes the actual message (complete or partial) being sent. The format and length of the Message Header depend on the chunk type in the Basic Header. There are 4 formats, selected by the fmt field of the Basic Header mentioned above. The first format can represent everything the other three can, but because the other three encode only the differences from the previous chunk, they represent the same data more concisely. In practice, use the format that represents the same meaning in the fewest bytes. The four Message Header formats are introduced below in descending order of size.

(1) When type=0, the Message Header occupies 11 bytes and can represent everything that the other three types can. This format must be used for the first chunk of a chunk stream and whenever the timestamp in the header goes backward (that is, its value is smaller than that of the previous chunk, which usually happens when playing backward).

Message Header type 0

(2) When type=1, the Message Header occupies 7 bytes. The 4 bytes of the message stream ID are omitted, which indicates that this chunk is in the same stream as the previous chunk. If the sender has only one stream open with the peer, it should use this format as much as possible.

Message Header type 1

(3) When type=2, the Message Header occupies 3 bytes. Compared with type=1, the 3 bytes for the message length and the 1 byte for the message type are also omitted, which indicates that the stream, the message length, and the message type are all the same as in the previous chunk. The remaining 3 bytes hold the timestamp delta, which is used in the same way as in type=1.

Message Header type 2

(4) When type=3, the Message Header occupies 0 bytes. The Message Header of this chunk is exactly the same as that of the previous chunk, so there is no need to transmit it again. When it follows a chunk of type=0, the timestamp is the same as that of the previous chunk. When are timestamps the same? When a Message is split into multiple chunks and this chunk belongs to the same Message as the previous one. When it follows a chunk of type=1 or type=2, the timestamp delta is the same as that of the previous chunk. For example, if the first chunk has type=0 and timestamp=100, and the second chunk has type=2 and timestamp delta=20, the second timestamp is 100+20=120; a third chunk of type=3 then implies timestamp delta=20, so its timestamp is 120+20=140.
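The timestamp rules for the four Message Header types can be traced with a short Python sketch. The function name resolve_timestamps and the (type, value) tuple layout are assumptions made for this illustration; it follows the description in the text, where a type=3 chunk after a type=0 chunk keeps the same timestamp.

```python
# Message Header size in bytes for each chunk type, as listed in the text.
MESSAGE_HEADER_SIZES = {0: 11, 1: 7, 2: 3, 3: 0}


def resolve_timestamps(headers):
    """Compute absolute timestamps for a sequence of chunks on one chunk stream.

    headers is a list of (chunk_type, value): value is the absolute timestamp
    for type 0, the timestamp delta for types 1 and 2, and ignored for type 3.
    """
    timestamp = None
    delta = 0
    resolved = []
    for chunk_type, value in headers:
        if chunk_type == 0:
            timestamp = value
            delta = 0  # a following type 3 chunk repeats this timestamp
        elif chunk_type in (1, 2):
            if timestamp is None:
                raise ValueError("first chunk of a chunk stream must be type 0")
            delta = value
            timestamp += delta
        elif chunk_type == 3:
            if timestamp is None:
                raise ValueError("first chunk of a chunk stream must be type 0")
            timestamp += delta  # reuse the previous delta
        else:
            raise ValueError("unknown chunk type")
        resolved.append(timestamp)
    return resolved


if __name__ == "__main__":
    # The example from the text: 100, then delta 20, then type 3.
    print(resolve_timestamps([(0, 100), (2, 20), (3, None)]))
```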

  • Extended Timestamp

As mentioned above, a chunk carries either a timestamp or a timestamp delta, never both. Only when that value reaches or exceeds the maximum that 3 bytes can represent, 0xFFFFFF=16777215, is this field used to carry the real value; otherwise this field is absent (0 bytes). The extended timestamp occupies 4 bytes, and the maximum value it can represent is 0xFFFFFFFF=4294967295. When the extended timestamp is used, the 3-byte timestamp or timestamp delta field must be set to all 1s (0xFFFFFF), which tells the receiver to read the extended timestamp field to obtain the real timestamp or timestamp delta. Note that the extended timestamp stores the full value, not the part that remains after subtracting 0xFFFFFF.

  • Chunk Data (block data):

The protocol-independent data that the user really wants to send, the length is between [0, chunkSize].
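The extended timestamp rule described above can be sketched in Python as an encode/decode pair. The function names are hypothetical, and big-endian byte order for both fields is taken from the RTMP specification rather than from this article.

```python
MAX_3_BYTE = 0xFFFFFF  # marker value and largest 3-byte number


def encode_timestamp(value):
    """Return (3-byte field, extended field); the latter is empty if unused."""
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("timestamp must fit in 4 bytes")
    if value >= MAX_3_BYTE:
        # The 3-byte field is all 1s; the extended field holds the full value.
        return MAX_3_BYTE.to_bytes(3, "big"), value.to_bytes(4, "big")
    return value.to_bytes(3, "big"), b""


def decode_timestamp(field, extended):
    """Recover the timestamp (or delta) from the two fields."""
    value = int.from_bytes(field, "big")
    if value == MAX_3_BYTE:
        return int.from_bytes(extended, "big")
    return value


if __name__ == "__main__":
    for ts in (100, 0xFFFFFE, 0xFFFFFF, 0x1000000):
        field, extended = encode_timestamp(ts)
        print(ts, field.hex(), extended.hex(), decode_timestamp(field, extended))
```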

Protocol Control Message

The RTMP chunk stream uses some special values to represent the control messages of the protocol. Their Message Stream ID must be 0 (which marks control information), their CSID must be 2, and their Message Type ID is 1, 2, 3, 5, or 6. The individual messages are explained in turn below. The receiver of a control message ignores the timestamp in the chunk, and the message takes effect as soon as it is received.

  • Set Chunk Size (Message Type ID=1): Sets the maximum number of bytes that the Data field of a chunk can carry. The default is 128 bytes. The chunk size can be changed during communication by sending this message (it should not be less than 128 bytes). Each side of the connection maintains its own chunk size, and the two values are independent.

  • Abort Message (Message Type ID=2): When a Message has been divided into multiple chunks and the receiving end has received only some of them, the sender sends this control message to indicate that it will not transmit the remaining chunks of that Message. After receiving it, the receiving end should discard the incomplete chunks. The Data field only needs to contain a CSID, which means that all chunks received so far for that CSID are discarded.

  • Acknowledgment (Message Type ID=3): When the amount of data received from the other end reaches the window size (Window Size), the receiving end sends an ACK back to the sender to tell it that it can continue to send data. The window size is the maximum number of bytes that can be sent before an ACK from the receiving end is required. In the RTMP specification, the ACK carries the number of bytes received so far.

  • Window Acknowledgment Size(Message Type ID=5): The maximum number of bytes that the sender can send between receiving two ACKs returned by the receiver.

  • Set Peer Bandwidth (Message Type ID=6): Limits the output bandwidth of the peer. After receiving this message, the receiving end applies the Window ACK Size carried in the message, which caps the amount of data it may have sent without acknowledgment and thereby limits its sending bandwidth. If that Window ACK Size differs from the size it last sent to the sender of this message, it should reply with a Window Acknowledgment Size control message. The message also carries a Limit Type:

(1) Hard (Limit Type=0): The receiver should set the Window Ack Size to the value in the message.
(2) Soft (Limit Type=1): The receiver can set the Window Ack Size to the value in the message, or keep the original value (provided that the original size is smaller than the Window Ack Size in the control message).
(3) Dynamic (Limit Type=2): If the Limit Type in the previous Set Peer Bandwidth message was 0, treat this message as Hard as well; otherwise ignore this message and do not change the Window Ack Size.
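The three Limit Type rules can be expressed as a small Python function. This is a sketch under assumptions: the function name and parameters are hypothetical, and the Soft case is implemented as "keep whichever value is smaller", which matches the condition given above.

```python
HARD, SOFT, DYNAMIC = 0, 1, 2  # Limit Type values from the text


def apply_peer_bandwidth(current_window, new_window, limit_type, last_limit_type):
    """Return the Window Ack Size in effect after a Set Peer Bandwidth message.

    last_limit_type is the Limit Type of the previous Set Peer Bandwidth
    message, or None if there was none.
    """
    if limit_type == HARD:
        return new_window
    if limit_type == SOFT:
        # Keep the original value only if it is already the smaller one.
        return min(current_window, new_window)
    if limit_type == DYNAMIC:
        # Treated as Hard only if the previous message was Hard.
        if last_limit_type == HARD:
            return new_window
        return current_window
    raise ValueError("unknown limit type")


if __name__ == "__main__":
    print(apply_peer_bandwidth(5000, 2500, HARD, None))
    print(apply_peer_bandwidth(2500, 5000, SOFT, None))
    print(apply_peer_bandwidth(5000, 2500, DYNAMIC, SOFT))
```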

5.5 Different types of RTMP Messages

  • Command Message (command message, Message Type ID=17 or 20): A command passed between the client and the server to make the peer perform an operation. For example, connect asks to connect to the peer; if the peer accepts the connection, it records the sender's information and returns a connection-success message. publish starts pushing a stream to the peer, and the receiving end prepares to accept the stream data after receiving the command. The more common Command Messages will be introduced in detail later. When the message is encoded with AMF0, the Message Type ID is 20; when it is encoded with AMF3, the Message Type ID is 17.

  • Data Message (data message, Message Type ID=15 or 18): Carries metadata (MetaData, such as the video name and resolution) or user-defined messages. When the message is encoded with AMF0, the Message Type ID is 18; when it is encoded with AMF3, the Message Type ID is 15.

  • Shared Object Message (shared message, Message Type ID=16 or 19): Represents a Flash object composed of a collection of key-value pairs and shared among multiple clients and multiple instances. When the message is encoded with AMF0, the Message Type ID is 19; when it is encoded with AMF3, the Message Type ID is 16.

  • Audio Message (audio information, Message Type ID=8): audio data.

  • Video Message (video information, Message Type ID=9): video data.

  • Aggregate Message (aggregate information, Message Type ID=22): a collection of multiple RTMP sub-messages.

  • User Control Message Events (user control message, Message Type ID=4): Tells the peer to execute the user control event contained in the message. For example, the Stream Begin event tells the peer that transmission of stream data is starting. Unlike the Protocol Control Messages described above, this message belongs to the RTMP protocol layer, not to the RTMP chunk stream layer, and the two are easy to confuse. When the message is sent in the chunk stream, Message Stream ID=0, Chunk Stream ID=2, and Message Type ID=4.
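The Message Type IDs listed in sections 5.4 and 5.5 can be collected in a Python enumeration, together with a sketch of how a control message is wrapped in a type 0 chunk with CSID 2 and Message Stream ID 0. The names below are hypothetical. Two points are assumptions taken from the RTMP specification rather than from this article: the Set Chunk Size payload is a 4-byte big-endian integer, and the payload is small enough to fit in a single chunk.

```python
import struct
from enum import IntEnum


class MessageType(IntEnum):
    SET_CHUNK_SIZE = 1
    ABORT = 2
    ACKNOWLEDGMENT = 3
    USER_CONTROL = 4
    WINDOW_ACK_SIZE = 5
    SET_PEER_BANDWIDTH = 6
    AUDIO = 8
    VIDEO = 9
    DATA_AMF3 = 15
    SHARED_OBJECT_AMF3 = 16
    COMMAND_AMF3 = 17
    DATA_AMF0 = 18
    SHARED_OBJECT_AMF0 = 19
    COMMAND_AMF0 = 20
    AGGREGATE = 22


CONTROL_CSID = 2        # chunk stream ID used for control messages
CONTROL_STREAM_ID = 0   # message stream ID used for control messages


def build_control_chunk(type_id, payload, timestamp=0):
    """Wrap a small control payload in one chunk with a type 0 header."""
    basic_header = bytes([(0 << 6) | CONTROL_CSID])
    message_header = (
        timestamp.to_bytes(3, "big")                 # timestamp
        + len(payload).to_bytes(3, "big")            # message length
        + bytes([type_id])                           # message type ID
        + CONTROL_STREAM_ID.to_bytes(4, "little")    # message stream ID
    )
    return basic_header + message_header + payload


def set_chunk_size_chunk(size):
    """Build a Set Chunk Size message (payload format is an assumption)."""
    return build_control_chunk(MessageType.SET_CHUNK_SIZE, struct.pack(">I", size))


if __name__ == "__main__":
    print(set_chunk_size_chunk(4096).hex())
    print(MessageType(20).name)
```

The result is 16 bytes: a 1-byte Basic Header, an 11-byte Message Header, and the 4-byte payload.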

5.6 RTMP-based stream publishing and stream playback process

Let's take a look at the stream publishing (push) process first.

Stream publishing process

Now let's look at the stream playback (pull) process.

Stream playback process


postscript

To be continued~~~

Original link: RTMP protocol analysis (1) - basic understanding - short book


Origin blog.csdn.net/yinshipin007/article/details/130029829