16. H.264 network transmission: RTP protocol analysis

One: RTP protocol:

1. The RTP protocol actually consists of two parts: the Real-time Transport Protocol (RTP) and the Real-time Transport Control Protocol (RTCP);

2. RTP provides real-time transport of continuous media data over unicast or multicast networks;
RTCP is the control part of the protocol: it monitors the quality of data transmission in real time and supplies the feedback the system uses for congestion control and flow control;

3. An RTP data packet consists of two parts: the RTP fixed header (Header) and the payload (Payload);
the first 12 bytes of the header have a fixed meaning, and the payload can be audio or video data.

Two: Analysis of the RTP header in H.264 video transmission:

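#include <stdint.h>

/*
	RTP fixed header: 12 bytes.
	The bit-fields are listed low bit first, which matches the wire layout only
	on compilers that allocate bit-fields from the least significant bit
	(for example GCC on little-endian targets).
	__PACKED__ is assumed to be defined elsewhere, e.g. as __attribute__((packed)).
*/
typedef struct _RTP_FIXED_HEADER
{
    /* byte 0 */
    unsigned char csrc_len:4;        /* CC: expect 0 */
    unsigned char extension:1;       /* X: left at 0 by the code below */
    unsigned char padding:1;         /* P: expect 0 */
    unsigned char version:2;         /* V: expect 2 */

    /* byte 1 */
    unsigned char payload:7;         /* PT: RTP_H264 (96) in the code below */
    unsigned char marker:1;          /* M: 1 on the last packet of a frame */

    /* bytes 2, 3 */
    uint16_t seq_no;                 /* network byte order */

    /* bytes 4-7 (fixed-width type: unsigned long is 8 bytes on many 64-bit systems) */
    uint32_t timestamp;              /* network byte order */

    /* bytes 8-11 */
    uint32_t ssrc;                   /* stream number is used here */
} __PACKED__ RTP_FIXED_HEADER;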

1. Parameter description:
** Byte 0
V:
RTP protocol version number, 2 bits;
the current protocol version is 2, so these two bits are set to binary 10 (with P, X and CC all 0, byte 0 is 0x80);

P:
Padding flag, 1 bit;
if P=1, the packet ends with one or more additional padding octets that are not part of the payload;

X:
Extension flag, 1 bit;
if X=1, the fixed RTP header is followed by exactly one header extension;

CC:
CSRC count, 4 bits;
the number of CSRC identifiers that follow the fixed header;

** Byte 1
M:
Marker bit, 1 bit.
For H.264, M is set to 1 on the last RTP packet of an access unit (frame): either the packet that carries the last NALU of the access unit, or the last fragment of that NALU when it is sent in fragments.
In all other cases, M remains 0.

PT:
Payload type, 7 bits;
H.264 has no statically assigned PT value, so a value from the dynamic range 96-127 is used. Here it is set to 0x60 (decimal 96).

** Bytes 2-3
SQ:
Sequence number, 16 bits;
the initial value should be random; here it is simply set to 0. The sequence number increases by 1 for every RTP packet sent.

** Bytes 4-7
TS:
Timestamp, 32 bits (4 bytes);
like the sequence number, the initial value of the timestamp should be random; here it is set to 0.
According to RFC3984, the timestamp clock rate for H.264 must be 90000 Hz.

** Bytes 8-11
SSRC:
Synchronization source identifier, 32 bits (4 bytes);
the SSRC should be generated randomly so that no two synchronization sources in the same RTP session share an identifier.
There is only one synchronization source here, so a fixed value such as 0x12345678 can be used (the code below uses 10);

Note: The first 12 bytes are the fixed part of the RTP header; the CSRC list that follows is optional;

CSRC:
Contributing source identifier: 32 bits (4 bytes) each;
CC counter: gives the number of CSRC identifiers (0 to 15) that follow the fixed header; if CC is 0, there are none;
extension flag X: does not relate to the CSRC list; it only indicates whether a header extension follows the CSRC list.
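
The field layout described above can also be written with explicit shifts instead of bit-fields, which avoids any dependence on compiler bit-field order or host byte order. The following is a minimal sketch, not the code used later in this article; the helper names rtp_write_header and rtp_header_len are made up for illustration, and it assumes P = 0, X = 0 and CC = 0 when writing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: write the 12-byte RTP fixed header into out[]
 * with V = 2, P = 0, X = 0, CC = 0. Returns the number of bytes written. */
static size_t rtp_write_header(uint8_t *out, int marker, uint8_t payload_type,
                               uint16_t seq, uint32_t timestamp, uint32_t ssrc)
{
    assert(out != NULL);
    out[0]  = (uint8_t)(2u << 6);                            /* V=2 -> 0x80 */
    out[1]  = (uint8_t)((marker ? 0x80u : 0u) | (payload_type & 0x7Fu));
    out[2]  = (uint8_t)(seq >> 8);                           /* big endian */
    out[3]  = (uint8_t)(seq & 0xFFu);
    out[4]  = (uint8_t)(timestamp >> 24);
    out[5]  = (uint8_t)((timestamp >> 16) & 0xFFu);
    out[6]  = (uint8_t)((timestamp >> 8) & 0xFFu);
    out[7]  = (uint8_t)(timestamp & 0xFFu);
    out[8]  = (uint8_t)(ssrc >> 24);
    out[9]  = (uint8_t)((ssrc >> 16) & 0xFFu);
    out[10] = (uint8_t)((ssrc >> 8) & 0xFFu);
    out[11] = (uint8_t)(ssrc & 0xFFu);
    return 12;
}

/* Hypothetical helper: length of the header up to the end of the CSRC list,
 * i.e. 12 bytes plus 4 bytes per CSRC identifier counted by CC. */
static size_t rtp_header_len(const uint8_t *pkt)
{
    assert(pkt != NULL);
    return 12u + 4u * (size_t)(pkt[0] & 0x0Fu);
}
```

With M = 1 and PT = 96, byte 1 comes out as 0xE0, and a header with CC = 2 is 20 bytes long before any extension.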

Three: NALU transmission mode in the network:

1. The size of each NALU varies with the amount of data it contains.
When a message sent over the network is larger than the maximum transmission unit (MTU), it is fragmented at the IP layer.

2. In an Ethernet environment the MTU is 1500 bytes;

3. A packet larger than the MTU is split up for transmission,
which produces many IP fragments, raises the packet loss rate (losing one fragment loses the whole packet) and lowers throughput.

4. For video transmission, if an RTP packet is larger than the MTU and is split arbitrarily by the underlying protocol, the player at the receiving end may play with delay or fail to play at all.
Therefore, NALUs larger than the MTU must be split by the sender at the RTP layer.
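
The size budget behind these points can be worked out with a few constants. This is a rough sketch that assumes IPv4 without options, UDP transport and an RTP header without CSRC entries; the helper names are illustrative only.

```c
#include <assert.h>

enum {
    IPV4_HEADER_LEN = 20,   /* assumption: no IP options */
    UDP_HEADER_LEN  = 8,
    RTP_HEADER_LEN  = 12,   /* fixed header, CC = 0 */
    FU_A_OVERHEAD   = 2     /* FU indicator + FU header */
};

/* Largest NALU (header byte included) that fits in one RTP packet. */
static int max_single_nalu_len(int mtu)
{
    assert(mtu > IPV4_HEADER_LEN + UDP_HEADER_LEN + RTP_HEADER_LEN + FU_A_OVERHEAD);
    return mtu - IPV4_HEADER_LEN - UDP_HEADER_LEN - RTP_HEADER_LEN;
}

/* Largest piece of NALU data that fits in one fragmentation-unit packet. */
static int max_fu_a_fragment_len(int mtu)
{
    return max_single_nalu_len(mtu) - FU_A_OVERHEAD;
}

/* Number of fragment packets needed for len bytes of NALU data. */
static int fu_a_packet_count(int len, int fragment_len)
{
    assert(len > 0 && fragment_len > 0);
    return (len + fragment_len - 1) / fragment_len;
}
```

For a 1500-byte Ethernet MTU this leaves 1460 bytes for a single NALU and 1458 bytes per fragment, so a per-packet limit of 950 bytes stays well inside the MTU.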

5. The RFC3984 protocol provides 3 different RTP packetization schemes:

a: Single NALU Packet:
Exactly one NALU is carried in one RTP packet. The code in this article uses this scheme for NALUs of at most 950 bytes;

b: Aggregation Packet:
Several NALUs are carried in one RTP packet. This scheme suits small NALUs and improves transmission efficiency;

c: Fragmentation Unit:
One NALU is split across several RTP packets. The code in this article uses this scheme for NALUs larger than 950 bytes.

Note: In practice schemes a and c are generally used; scheme b is rarely used;
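
Scheme b is not implemented in this article. For comparison, a single-time aggregation packet (STAP-A, type 24) can be built roughly as follows: one STAP-A header byte, then each NALU preceded by its 16-bit size in network byte order. The function name stap_a_build and its parameters are hypothetical, and the NALUs are assumed to have their start codes already removed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: aggregate count NALUs into one STAP-A payload.
 * Returns the payload length, or 0 if a NALU is empty, too large for the
 * 16-bit size field, or does not fit into cap bytes. */
static size_t stap_a_build(uint8_t *out, size_t cap,
                           const uint8_t *const nalus[], const size_t lens[],
                           size_t count)
{
    size_t pos = 1;          /* byte 0 is the STAP-A header, filled in last */
    uint8_t f_bit = 0;
    uint8_t max_nri = 0;

    assert(out != NULL && cap >= 1);
    for (size_t i = 0; i < count; i++) {
        if (lens[i] == 0 || lens[i] > 0xFFFFu || pos + 2 + lens[i] > cap)
            return 0;
        out[pos++] = (uint8_t)(lens[i] >> 8);       /* NALU size, big endian */
        out[pos++] = (uint8_t)(lens[i] & 0xFFu);
        memcpy(out + pos, nalus[i], lens[i]);
        pos += lens[i];

        /* F is set if any aggregated NALU has it set; NRI is the maximum */
        f_bit = (uint8_t)(f_bit | (nalus[i][0] & 0x80u));
        if ((nalus[i][0] & 0x60u) > max_nri)
            max_nri = (uint8_t)(nalus[i][0] & 0x60u);
    }
    out[0] = (uint8_t)(f_bit | max_nri | 24u);       /* type 24 = STAP-A */
    return pos;
}
```

A typical use would be to send a short SPS and PPS together in one packet ahead of an I-frame.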

Four: RTP payload (Payload) description:

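typedef struct _FU_INDICATOR
{
    /* byte 0 */
    unsigned char TYPE:5;    /* NAL unit type; 28 (FU-A) when fragmenting */
    unsigned char NRI:2;     /* nal_ref_idc, copied from the NAL header */
    unsigned char F:1;       /* forbidden_zero_bit, must be 0 */
} __PACKED__ FU_INDICATOR;   /* 1 byte */


typedef struct _FU_HEADER
{
    /* byte 0 */
    unsigned char TYPE:5;    /* type of the NALU being fragmented */
    unsigned char R:1;       /* reserved, must be 0 */
    unsigned char E:1;       /* 1 on the last fragment */
    unsigned char S:1;       /* 1 on the first fragment */
} __PACKED__ FU_HEADER;      /* 1 byte */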

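1. When one RTP packet carries exactly one NALU:

	RTP payload: payload header (Indicator) + NALU data. The Indicator is the NALU's own 1-byte header, and the NALU data is what follows the 0x000001 start code in the H.264 stream; the start code itself is not sent;

Complete RTP packet = RTP header (12 bytes) + Indicator (1 byte) + NALU data (length of the NALU)

2. When one NALU is split across several RTP packets:

	RTP payload: payload header (Indicator) + fragment information fu_header + NALU data (here the NALU data is one fragment of the H.264 NALU);

Complete RTP packet = RTP header (12 bytes) + Indicator (1 byte) + fragment information fu_header (1 byte) + NALU data (length of this fragment)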
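
The two payload header bytes used for fragments can be derived directly from the original NAL header byte. The sketch below does this with masks and shifts rather than the FU_INDICATOR and FU_HEADER bit-field structs above; the function names are illustrative.

```c
#include <assert.h>
#include <stdint.h>

/* FU indicator: F and NRI are copied from the NAL header, type becomes 28 (FU-A). */
static uint8_t fu_indicator_byte(uint8_t nal_header)
{
    return (uint8_t)((nal_header & 0xE0u) | 28u);
}

/* FU header: S (bit 7), E (bit 6), R (bit 5, always 0), original NAL type (bits 4-0). */
static uint8_t fu_header_byte(uint8_t nal_header, int is_first, int is_last)
{
    assert(!(is_first && is_last));   /* S and E are never both set in one packet */
    return (uint8_t)((is_first ? 0x80u : 0u) |
                     (is_last  ? 0x40u : 0u) |
                     (nal_header & 0x1Fu));
}
```

For an IDR slice with NAL header 0x65, the indicator is 0x7C and the FU header is 0x85 on the first fragment, 0x05 on middle fragments and 0x45 on the last one.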

Five: RTP packetization code:


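#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

#define nalu_sent_len        	950		/* largest amount of NALU data per RTP packet */
#define RTP_H264             	96
#define RTP_AUDIO            	97
#define timestamp_increse        	(90000 / 25)		/* 90 kHz clock at 25 frames per second == 3600 */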

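/* Send one H.264 NALU to every client in the RTSP_SENDING state.
 * g_rtspClients, MAX_RTSP_CLIENT, RTSP_SENDING and udpfd are defined elsewhere in the project.
 * Assumes buffer holds exactly one NALU, optionally preceded by a start code. */
int VENC_Sent(char *buffer,int buflen)
{
	int is=0;

	RTP_FIXED_HEADER *rtp_hdr;
	FU_INDICATOR	 *fu_ind;
	FU_HEADER		 *fu_hdr;

	char *nalu_payload = NULL;
	int nAvFrmLen = 0;
	int nIsIFrm = 0;		/* NRI bits of the NAL header */
	int nNaluType = 0;		/* type bits of the NAL header */
	char sendbuf[nalu_sent_len + 32] = {0};	/* 12-byte RTP header + up to 2 payload header bytes + data */
	int	bytes = 0;

	/* skip the start code (00 00 00 01 or 00 00 01); it is not sent over RTP */
	if(buflen > 4 && buffer[0] == 0 && buffer[1] == 0 && buffer[2] == 0 && buffer[3] == 1)
	{
		buffer += 4;
		buflen -= 4;
	}
	else if(buflen > 3 && buffer[0] == 0 && buffer[1] == 0 && buffer[2] == 1)
	{
		buffer += 3;
		buflen -= 3;
	}
	if(buflen < 1)
	{
		return -1;
	}

	/* the NAL header byte is rebuilt below as the Indicator / FU header,
	   so read NRI and type from it and take it off the data */
	nIsIFrm = ((unsigned char)buffer[0] >> 5) & 0x03;
	nNaluType = (unsigned char)buffer[0] & 0x1F;
	buffer++;
	buflen--;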
	for(is = 0; is < MAX_RTSP_CLIENT; is ++)
	{
		if(g_rtspClients[is].status != RTSP_SENDING)
		{
		    continue;
		}

		nAvFrmLen = buflen;
		struct sockaddr_in server;
		memset(&server, 0, sizeof(server));
		server.sin_family = AF_INET;
	   	server.sin_port = htons(g_rtspClients[is].rtpport[0]);
	   	server.sin_addr.s_addr = inet_addr(g_rtspClients[is].IP);

		rtp_hdr = (RTP_FIXED_HEADER*)&sendbuf[0];

		rtp_hdr->payload = RTP_H264;
		rtp_hdr->version = 2;
		rtp_hdr->marker  = 0;
		rtp_hdr->ssrc    = htonl(10);	/* fixed SSRC: there is only one source */

		if(nAvFrmLen <= nalu_sent_len)		/* send as a single packet */
		{
			rtp_hdr->marker = 1;
			rtp_hdr->seq_no = htons(g_rtspClients[is].seqnum++);

			fu_ind = (FU_INDICATOR*)&sendbuf[12];
			fu_ind->F = 0;
			fu_ind->NRI = nIsIFrm;
			fu_ind->TYPE = nNaluType;

			nalu_payload = &sendbuf[13];	/* not fragmented: the headers take 13 bytes in total */
			memcpy(nalu_payload, buffer, nAvFrmLen);
			g_rtspClients[is].tsvid = g_rtspClients[is].tsvid + timestamp_increse;
			rtp_hdr->timestamp = htonl(g_rtspClients[is].tsvid);
			bytes = nAvFrmLen + 13;

			sendto(udpfd, sendbuf, bytes, 0, (struct sockaddr *)&server,sizeof(server));
		}
		else	/* send in fragments (FU-A) */
		{
			int offset = 0;		/* bytes of NALU data already sent */

			/* all fragments of one NALU carry the same timestamp */
			g_rtspClients[is].tsvid = g_rtspClients[is].tsvid + timestamp_increse;
			rtp_hdr->timestamp = htonl(g_rtspClients[is].tsvid);

			while(offset < nAvFrmLen)
			{
				int chunk = nAvFrmLen - offset;	/* size of this fragment */
				int last;

				if(chunk > nalu_sent_len)
				{
					chunk = nalu_sent_len;
				}
				last = (offset + chunk == nAvFrmLen);

				rtp_hdr->seq_no = htons(g_rtspClients[is].seqnum++);
				rtp_hdr->marker = last ? 1 : 0;	/* M = 1 only on the last fragment */

				fu_ind = (FU_INDICATOR*)&sendbuf[12];
				fu_ind->F = 0;
				fu_ind->NRI = nIsIFrm;
				fu_ind->TYPE = 28;	/* FU-A */

				fu_hdr = (FU_HEADER*)&sendbuf[13];
				fu_hdr->S = (offset == 0) ? 1 : 0;	/* first fragment */
				fu_hdr->E = last ? 1 : 0;		/* last fragment */
				fu_hdr->R = 0;
				fu_hdr->TYPE = nNaluType;

				nalu_payload = &sendbuf[14];	/* fragmented: the headers take 14 bytes in total */
				memcpy(nalu_payload, buffer + offset, chunk);
				bytes = chunk + 14;
				sendto(udpfd, sendbuf, bytes, 0, (struct sockaddr *)&server,sizeof(server));

				offset += chunk;
			}
		}

	}

	return 0;
}


Explanation:
1. The marker (M) is 1 on a packet sent as a single packet and on the last packet of a fragmented NALU; it is 0 on all other packets;

rtp_hdr->marker = 1;

2. The timestamp increases by a fixed 3600 from one frame to the next, i.e. 90000 Hz / 25 frames per second == 3600:

g_rtspClients[is].tsvid = g_rtspClients[is].tsvid + timestamp_increse;
rtp_hdr->timestamp = htonl(g_rtspClients[is].tsvid); 

Note: The timestamp must be converted to network byte order (big endian) with htonl;
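
The fixed step of 3600 only holds for 25 frames per second. A more general computation, sketched below under the assumption that the frame rate is given as a fraction (for example 30000/1001), derives each timestamp from the frame index so that rounding errors do not accumulate. The function name and parameters are illustrative.

```c
#include <assert.h>
#include <stdint.h>

#define H264_RTP_CLOCK_HZ 90000u

/* Timestamp (host byte order) of frame number frame_index at a frame rate of
 * fps_num / fps_den, starting from the initial timestamp start. */
static uint32_t rtp_video_timestamp(uint32_t start, uint64_t frame_index,
                                    uint32_t fps_num, uint32_t fps_den)
{
    assert(fps_num != 0);
    uint64_t ticks = frame_index * H264_RTP_CLOCK_HZ * fps_den / fps_num;
    return (uint32_t)(start + ticks);   /* wraps modulo 2^32, as RTP expects */
}
```

The result still has to be passed through htonl before it is written into the header.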

3. When sending a NALU in fragments:

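fu_hdr->S = 1;	/* 1 if this is the first fragment of the NALU */
fu_hdr->R = 0;	/* reserved, always 0 */
fu_hdr->E = 0;	/* 1 if this is the last fragment of the NALU */

Note: For the first packet, fu_hdr->S must be 1; for the last packet, fu_hdr->E must be 1;
in every other case these two bits are 0, and R is always 0.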
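
On the receiving side, the S and E bits are what allow the fragments to be joined back into one NALU. The following is a simplified sketch of that step: the fu_a_reassembler type and fu_a_feed function are hypothetical, the caller supplies the output buffer, and packet loss detection via RTP sequence numbers is deliberately left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct {
    uint8_t *buf;      /* caller-provided output buffer */
    size_t   cap;      /* size of buf */
    size_t   len;      /* bytes of the NALU collected so far */
    int      active;   /* 1 after a start fragment has been seen */
} fu_a_reassembler;

/* Feed one FU-A RTP payload, starting at the FU indicator byte.
 * Returns 1 when a complete NALU is in r->buf[0 .. r->len),
 * 0 when more fragments are needed, -1 on malformed or unexpected input. */
static int fu_a_feed(fu_a_reassembler *r, const uint8_t *payload, size_t n)
{
    assert(r != NULL && r->buf != NULL && payload != NULL);
    if (n < 2 || (payload[0] & 0x1Fu) != 28u)
        return -1;

    uint8_t fu_hdr = payload[1];
    if (fu_hdr & 0x80u) {                       /* S bit: start a new NALU */
        if (r->cap < 1)
            return -1;
        /* rebuild the NAL header: F and NRI from the indicator, type from the FU header */
        r->buf[0] = (uint8_t)((payload[0] & 0xE0u) | (fu_hdr & 0x1Fu));
        r->len = 1;
        r->active = 1;
    } else if (!r->active) {
        return -1;                              /* fragment without a start */
    }

    if (r->len + (n - 2) > r->cap) {
        r->active = 0;
        return -1;
    }
    memcpy(r->buf + r->len, payload + 2, n - 2);
    r->len += n - 2;

    if (fu_hdr & 0x40u) {                       /* E bit: NALU complete */
        r->active = 0;
        return 1;
    }
    return 0;
}
```

A real receiver would also check that the sequence numbers of the fragments are consecutive and discard the partial NALU when one is missing.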

4. RTP payload type:
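
For reference, a few of the statically assigned payload type numbers from RFC 3551 can be set next to the dynamic range that RTP_H264 (96) and RTP_AUDIO (97) are taken from. This is only a partial list, and the enum and function names are illustrative.

```c
#include <assert.h>

enum rtp_payload_type {
    RTP_PT_PCMU        = 0,     /* G.711 mu-law audio */
    RTP_PT_PCMA        = 8,     /* G.711 A-law audio */
    RTP_PT_JPEG        = 26,
    RTP_PT_H261        = 31,
    RTP_PT_MPV         = 32,    /* MPEG-1/2 video */
    RTP_PT_MP2T        = 33,    /* MPEG-2 transport stream */
    RTP_PT_H263        = 34,
    RTP_PT_DYNAMIC_MIN = 96,    /* 96-127: assigned per session, e.g. via SDP */
    RTP_PT_DYNAMIC_MAX = 127
};

/* Returns 1 if pt lies in the dynamic range that H.264 has to use. */
static int rtp_pt_is_dynamic(int pt)
{
    assert(pt >= 0 && pt <= 127);   /* PT is a 7-bit field */
    return pt >= RTP_PT_DYNAMIC_MIN && pt <= RTP_PT_DYNAMIC_MAX;
}
```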

Origin blog.csdn.net/yanghangwww/article/details/112006433