hi3516a: Encapsulating H.264 data into RTP packets (with packetization source code and detailed analysis)

Preface

When debugging RTP streaming playback on the HI3516A, you need to know how to encapsulate H.264 data into RTP packets and send them out. This article analyzes in detail the protocol format and the source code used to encapsulate H.264 data into RTP packets.
Hardware platform: hi3516a
Software platform: Hi3516A_SDK_V1.0.5.0

I found a lot of material online about encapsulating H.264 data into RTP packets, but none of it was complete, so I tried to put together a more comprehensive analysis built around concrete examples, which makes it easier to understand. The article draws on several other articles, which are listed at the end, and I would like to thank their authors.
Selfless sharing starts with me!

Source code for encapsulating H.264 data into RTP packets

The following is the source code that encapsulates H.264 data into RTP packets. A lot of debug printing has been added; remove it yourself if you don't need it.

/**************************************************************************************************

 RTSP (Real Time Streaming Protocol), RFC 2326

 RTP: Real-time Transport Protocol, RFC 3550

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|V=2|P|X|  CC   |M|     PT      |       sequence number         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                           timestamp                           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|           synchronization source (SSRC) identifier            |
+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
|            contributing source (CSRC) identifiers             |
|                             ....                              |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
|                                                               |
|                                                               |
|               payload                                         |
|                                                               |
|                               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                               :...OPTIONAL RTP padding        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+



************************************************************************************************/

extern unsigned char sps_tmp[256];
extern unsigned char pps_tmp[256];
extern int sps_len;
extern int pps_len;

static int SendNalu264(HndRtp hRtp, char *pNalBuf, int s32NalBufSize)
	
{

	printf("input H.264 raw data----count=%d------\r\n",s32NalBufSize);
	int i=0;
	printf("0x");
	while(i<100 && i<s32NalBufSize)   //do not read past the end of the input buffer
	{
		printf("%x ",(unsigned char)pNalBuf[i]);
		i++;
	}
	printf("......\r\n");

	

    char *pNaluPayload;
    char *pSendBuf;
    int s32Bytes = 0;
    int s32Ret = 0;
    struct timeval stTimeval;
    char *pNaluCurr;
    int s32NaluRemain;
    unsigned char u8NaluBytes;
    pSendBuf = (char *)calloc(MAX_RTP_PKT_LENGTH + 100, sizeof(char));  //#define MAX_RTP_PKT_LENGTH     1400
    if(NULL == pSendBuf)
    {
        s32Ret = -1;
        goto cleanup;
    }

    hRtp->pRtpFixedHdr = (StRtpFixedHdr *)pSendBuf;
    hRtp->pRtpFixedHdr->u7Payload   = H264;
    hRtp->pRtpFixedHdr->u2Version   = 2;
    hRtp->pRtpFixedHdr->u1Marker    = 0;
    hRtp->pRtpFixedHdr->u32SSrc     = hRtp->u32SSrc;
    //compute the timestamp: scale by 90000/1000 (90 kHz RTP clock, assuming the value is in milliseconds)
    hRtp->pRtpFixedHdr->u32TimeStamp = htonl(hRtp->u32TimeStampCurr * (90000 / 1000));
    printf("timestamp:%llu\n",hRtp->u32TimeStampCurr);
    if(gettimeofday(&stTimeval, NULL) == -1)
    {
        printf("Failed to get os time\n");
        s32Ret = -1;
        goto cleanup;
    }

    //save the first byte of the NALU (the NALU header)
    u8NaluBytes = *(pNalBuf+4);
    //point at the NALU data that has not been sent yet
    pNaluCurr = pNalBuf + 5;
    //number of NALU bytes remaining
    s32NaluRemain = s32NalBufSize - 5;
	if ((u8NaluBytes&0x1f)==0x7&&0)
	{
		printf("(u8NaluBytes&0x1f)==0x7&&0\r\n");
		pNaluPayload = (pSendBuf + 12);
		if(sps_len>0)
		{	        
	        memcpy(pNaluPayload, sps_tmp, sps_len);
	        if(sendto(hRtp->s32Sock, pSendBuf, sps_len+12, 0, (struct sockaddr *)&hRtp->stServAddr, sizeof(hRtp->stServAddr)) < 0)
	        {
	            s32Ret = -1;
	            goto cleanup;
	        }
		}
		if(pps_len>0)
		{	        
	        memcpy(pNaluPayload, pps_tmp, pps_len);
	        if(sendto(hRtp->s32Sock, pSendBuf, pps_len+12, 0, (struct sockaddr *)&hRtp->stServAddr, sizeof(hRtp->stServAddr)) < 0)
	        {
	            s32Ret = -1;
	            goto cleanup;
	        }
		}
	}
    //NALU is no longer than the maximum packet length: send it directly
    if(s32NaluRemain <= MAX_RTP_PKT_LENGTH)
    {
        hRtp->pRtpFixedHdr->u1Marker    = 1;
        hRtp->pRtpFixedHdr->u16SeqNum   = htons(hRtp->u16SeqNum ++);
        hRtp->pNaluHdr                  = (StNaluHdr *)(pSendBuf + 12);
        hRtp->pNaluHdr->u1F             = (u8NaluBytes & 0x80) >> 7;
        hRtp->pNaluHdr->u2Nri           = (u8NaluBytes & 0x60) >> 5;
        hRtp->pNaluHdr->u5Type          = u8NaluBytes & 0x1f;

        pNaluPayload = (pSendBuf + 13);
        memcpy(pNaluPayload, pNaluCurr, s32NaluRemain);

        s32Bytes = s32NaluRemain + 13;
		
		printf("<MAX_RTP_PKT_LENGTH----count=%d\r\n",s32Bytes);
		int i=0;
		printf("send data:0x");
		while(i<50)
		{
			printf("%x ",pSendBuf[i]);
			i++;
		}
		printf("......\r\n");
		
		fflush(stdout);
        if(sendto(hRtp->s32Sock, pSendBuf, s32Bytes, 0, (struct sockaddr *)&hRtp->stServAddr, sizeof(hRtp->stServAddr)) < 0)
        {
            s32Ret = -1;
            goto cleanup;
        }
#ifdef SAVE_NALU
        fwrite(pSendBuf, s32Bytes, 1, hRtp->pNaluFile);
#endif
    }
    //NALU is longer than the maximum packet length: send it in fragments
    else
    {
        //locate the FU indicator
        hRtp->pFuInd            = (StFuIndicator *)(pSendBuf + 12);
        hRtp->pFuInd->u1F       = (u8NaluBytes & 0x80) >> 7;
        hRtp->pFuInd->u2Nri     = (u8NaluBytes & 0x60) >> 5;
        hRtp->pFuInd->u5Type    = 28;

        //locate the FU header
        hRtp->pFuHdr            = (StFuHdr *)(pSendBuf + 13);
        hRtp->pFuHdr->u1R       = 0;
        hRtp->pFuHdr->u5Type    = u8NaluBytes & 0x1f;

        //locate the payload
        pNaluPayload = (pSendBuf + 14);

        //keep sending fragments while NALU data remains
        while(s32NaluRemain > 0)
        {
            /*set up the fixed header*/
            //sequence number increases by 1 for every packet
            hRtp->pRtpFixedHdr->u16SeqNum = htons(hRtp->u16SeqNum ++);
            hRtp->pRtpFixedHdr->u1Marker = (s32NaluRemain <= MAX_RTP_PKT_LENGTH) ? 1 : 0;

            /*set up the FU header*/
            //E is set to 1 for the last fragment
            hRtp->pFuHdr->u1E       = (s32NaluRemain <= MAX_RTP_PKT_LENGTH) ? 1 : 0;
			if(hRtp->pFuHdr->u1E==1)
				printf("***********the last data**************\r\n");
            //S is set to 1 for the first fragment
            hRtp->pFuHdr->u1S       = (s32NaluRemain == (s32NalBufSize - 5)) ? 1 : 0;
			if(hRtp->pFuHdr->u1S==1)
				printf("***********the first data**************\r\n");


            s32Bytes = (s32NaluRemain < MAX_RTP_PKT_LENGTH) ? s32NaluRemain : MAX_RTP_PKT_LENGTH;


            memcpy(pNaluPayload, pNaluCurr, s32Bytes);

            //send this fragment
            s32Bytes = s32Bytes + 14;
			printf("fu ----count=%d\r\n",s32Bytes);
			int i=0;
			printf("send data:0x");
			while(i<50)
			{
				printf("%x ",pSendBuf[i]);
				i++;
			}
			printf("......\r\n");

			fflush(stdout);
            if(sendto(hRtp->s32Sock, pSendBuf, s32Bytes, 0, (struct sockaddr *)&hRtp->stServAddr, sizeof(hRtp->stServAddr)) < 0)
            {
                s32Ret = -1;
                goto cleanup;
            }
#ifdef SAVE_NALU
            fwrite(pSendBuf, s32Bytes, 1, hRtp->pNaluFile);
#endif

            //advance to the next fragment
            pNaluCurr += MAX_RTP_PKT_LENGTH;
            //update the remaining NALU length
            s32NaluRemain -= MAX_RTP_PKT_LENGTH;
        }
    }

cleanup:
    if(pSendBuf)
    {
        free((void *)pSendBuf);
    }
	printf("\n");
	fflush(stdout);

    return s32Ret;
}

Note:
A lot of debug printing has been added to the code above so that the process of packing H.264 data into RTP packets can be followed in detail. In real use this printing should be disabled; otherwise it takes up so much time that the video appears corrupted.

Detailed analysis of encapsulating H.264 data into RTP packets

(1) H.264 raw data analysis

For the H.264 data format itself, it is recommended to first read the H.264 structure article listed in the references at the end. Each NALU in the raw stream begins with the start code 00 00 00 01, and the low 5 bits of the byte that follows give the NALU type:
00 00 00 01 67: 0x67&0x1f = 0x07: SPS
00 00 00 01 68: 0x68&0x1f = 0x08: PPS
00 00 00 01 06: 0x06&0x1f = 0x06: SEI information
00 00 00 01 65: 0x65&0x1f = 0x05: IDR Slice
00 00 00 01 61: 0x61&0x1f = 0x01: P frame
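
The lookup above can be sketched in C as follows. This is a minimal illustration, not part of the SDK: the helper names `start_code_len`, `nal_type_of` and `nal_type_name` are hypothetical, and the 3-byte start code (00 00 01) is handled as well on the assumption that a stream may use either form.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns the length of the start code at the head of buf
 * (4 for 00 00 00 01, 3 for 00 00 01), or 0 if there is none. */
static size_t start_code_len(const uint8_t *buf, size_t len)
{
    assert(buf != NULL);
    if (len >= 4 && buf[0] == 0 && buf[1] == 0 && buf[2] == 0 && buf[3] == 1)
        return 4;
    if (len >= 3 && buf[0] == 0 && buf[1] == 0 && buf[2] == 1)
        return 3;
    return 0;
}

/* Returns the NALU type (low 5 bits of the byte after the start code),
 * or -1 if buf does not begin with a start code plus a header byte. */
static int nal_type_of(const uint8_t *buf, size_t len)
{
    size_t sc = start_code_len(buf, len);
    if (sc == 0 || len <= sc)
        return -1;
    return buf[sc] & 0x1f;
}

/* Maps the types seen in this article to short names. */
static const char *nal_type_name(int type)
{
    switch (type) {
    case 1:  return "P frame (non-IDR slice)";
    case 5:  return "IDR slice";
    case 6:  return "SEI";
    case 7:  return "SPS";
    case 8:  return "PPS";
    default: return "other";
    }
}
```

For example, `nal_type_of` applied to the bytes 00 00 00 01 67 yields 7, which `nal_type_name` maps to "SPS".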

(2) RTP packet header data structure
Let's first look at the related structure

typedef struct _tagStRtpHandle
{
    int                 s32Sock;
    struct sockaddr_in  stServAddr;
    unsigned short      u16SeqNum;
    unsigned long long        u32TimeStampInc;
    unsigned long long        u32TimeStampCurr;
    unsigned long long      u32CurrTime;
    unsigned long long      u32PrevTime;
    unsigned int        u32SSrc;
    StRtpFixedHdr       *pRtpFixedHdr;   //RTP fixed header, 12 bytes
    StNaluHdr           *pNaluHdr;    //NALU header, 1 byte
    StFuIndicator       *pFuInd;     //FU fragmentation: FU indicator
    StFuHdr             *pFuHdr;   //FU fragmentation: FU header
    EmRtpPayload        emPayload;  //payload type
#ifdef SAVE_NALU
    FILE                *pNaluFile;
#endif
} StRtpObj, *HndRtp;

Among them, the StRtpFixedHdr structure is the RTP fixed header, 12 bytes in total (ignoring CSRC for now). It corresponds to the RTP header diagram in the comment at the top of the source listing.

typedef struct
{
    /* byte 0 */
    unsigned char u4CSrcLen:4;      /* expect 0 */
    unsigned char u1Externsion:1;   /* expect 0 here: the code never sets it and the buffer is zeroed by calloc */
    unsigned char u1Padding:1;      /* expect 0 */
    unsigned char u2Version:2;      /* expect 2 */
    /* byte 1 */
    unsigned char u7Payload:7;      /* RTP_PAYLOAD_RTSP */
    unsigned char u1Marker:1;       /* 1 on the last packet of a frame, otherwise 0 */
    /* bytes 2, 3 */
    unsigned short u16SeqNum;
    /* bytes 4-7 (unsigned long must be 32 bits wide, as on the 32-bit hi3516a) */
    unsigned long u32TimeStamp;
    /* bytes 8-11 */
    unsigned long u32SSrc;          /* stream number is used here. */
} StRtpFixedHdr;

The description of each identifier in the RTP header is as follows:

V: RTP protocol version, 2 bits. The current version is 2.
P: Padding flag, 1 bit. If P=1, one or more padding octets that are not part of the payload are appended at the end of the packet.
X: Extension flag, 1 bit. If X=1, an extension header follows the RTP fixed header.
CC: CSRC count, 4 bits, the number of CSRC identifiers.
M: Marker, 1 bit. Its meaning depends on the payload: for video it marks the end of a frame; for audio it marks the start of a talkspurt.
PT: Payload type, 7 bits, describing the type of payload carried in the RTP packet, such as GSM audio or JPEG images. In streaming media it is mostly used to tell audio streams from video streams, which makes parsing easier on the client side.
Sequence number: 16 bits, identifying the order of the RTP packets sent by the sender; it increases by 1 for each packet sent. When the underlying transport is UDP and network conditions are poor, this field can be used to detect packet loss, and it can also be used to restore the order of packets reordered by the network. The initial value is random, and audio and video packets are numbered separately.
Timestamp: 32 bits. For H.264 video the clock rate must be 90 kHz. The timestamp reflects the sampling instant of the first octet in the RTP packet. The receiver uses it to calculate delay and jitter and to perform synchronization.
Synchronization source (SSRC) identifier: 32 bits, identifying the synchronization source. It is chosen randomly, and two synchronization sources in the same video conference must not have the same SSRC.
Contributing source (CSRC) identifiers: 32 bits each, 0 to 15 of them. Each CSRC identifies one contributing source of the payload in the RTP packet.
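
The field layout described above can also be written out byte by byte, which avoids depending on how a given compiler orders bitfields. The following is a minimal sketch under stated assumptions, not the SDK's code: `rtp_write_header` and `ms_to_rtp_ticks` are hypothetical names, the header is built with V=2, P=0, X=0, CC=0 only, and the timestamp helper assumes the input is in milliseconds.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RTP_HDR_LEN 12

/* Converts milliseconds to 90 kHz RTP ticks; the result wraps at 32 bits. */
static uint32_t ms_to_rtp_ticks(uint64_t ms)
{
    return (uint32_t)(ms * 90u);
}

/* Writes a 12-byte RTP fixed header (V=2, P=0, X=0, CC=0) into out.
 * Multi-byte fields are stored big-endian (network byte order). */
static void rtp_write_header(uint8_t *out, int marker, uint8_t payload_type,
                             uint16_t seq, uint32_t timestamp, uint32_t ssrc)
{
    assert(out != NULL);
    out[0]  = 0x80;                                   /* V=2, P=0, X=0, CC=0 */
    out[1]  = (uint8_t)((marker ? 0x80 : 0x00) | (payload_type & 0x7f));
    out[2]  = (uint8_t)(seq >> 8);
    out[3]  = (uint8_t)(seq & 0xff);
    out[4]  = (uint8_t)(timestamp >> 24);
    out[5]  = (uint8_t)(timestamp >> 16);
    out[6]  = (uint8_t)(timestamp >> 8);
    out[7]  = (uint8_t)(timestamp);
    out[8]  = (uint8_t)(ssrc >> 24);
    out[9]  = (uint8_t)(ssrc >> 16);
    out[10] = (uint8_t)(ssrc >> 8);
    out[11] = (uint8_t)(ssrc);
}
```

Writing the bytes explicitly also removes the need for separate htons/htonl calls, since each shift places the most significant byte first.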

Note: The base RTP specification does not define any header extension itself. If X=1, special handling is required.
In the captured packet (screenshot not reproduced here), the first 12 bytes, marked in red, are the RTP header.
Let's analyze these 12 bytes in detail: 0x80 60 0 0 18 37 6f 6e 16 1 a8 c0

0x80 is V_P_X_CC
60 is M_PT
00 00 is SequenceNum
18 37 6f 6e is Timestamp
16 1 a8 c0 is SSRC

Convert the first two bytes into binary as follows:
1000 0000 0110 0000,
in order, explained as follows:
10 is V;
0 is P;
0 is X;
0000 is CC;
0 is M; for video it marks the end of a frame. As explained later, M is set to 1 in the last packet to indicate the end of the frame;
110 0000 is PT (0x60 = 96, the H264 payload type used in the code);

00 00 is SequenceNum; across successive packets it increases by one each time;

18 37 6f 6e is Timestamp; all fragments of the same frame carry the same timestamp, and different frames carry different timestamps;

16 1 a8 c0 is SSRC, which the code sets to the device's own IP address. Reading the bytes from last to first, c0 is 192, a8 is 168, 1 is 1, and 16 is 22, so the local address is 192.168.1.22. Pay attention to the byte order here: the bytes appear in the packet in the reverse of the dotted-decimal order.
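
The manual decoding above can be checked with a small parser. This is an illustrative sketch only; the type `RtpHeaderInfo` and the function `rtp_parse_header` are hypothetical names, and CSRC entries and header extensions are not handled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decoded fields of the 12-byte RTP fixed header. */
typedef struct {
    unsigned version;
    unsigned padding;
    unsigned extension;
    unsigned csrc_count;
    unsigned marker;
    unsigned payload_type;
    uint16_t seq;
    uint32_t timestamp;
    uint32_t ssrc;
} RtpHeaderInfo;

/* Parses the fixed header at p. Returns 0 on success, -1 if len < 12. */
static int rtp_parse_header(const uint8_t *p, size_t len, RtpHeaderInfo *h)
{
    assert(p != NULL && h != NULL);
    if (len < 12)
        return -1;
    h->version      = p[0] >> 6;
    h->padding      = (p[0] >> 5) & 1;
    h->extension    = (p[0] >> 4) & 1;
    h->csrc_count   = p[0] & 0x0f;
    h->marker       = p[1] >> 7;
    h->payload_type = p[1] & 0x7f;
    h->seq          = (uint16_t)((p[2] << 8) | p[3]);
    h->timestamp    = ((uint32_t)p[4] << 24) | ((uint32_t)p[5] << 16) |
                      ((uint32_t)p[6] << 8)  |  (uint32_t)p[7];
    h->ssrc         = ((uint32_t)p[8] << 24) | ((uint32_t)p[9] << 16) |
                      ((uint32_t)p[10] << 8) |  (uint32_t)p[11];
    return 0;
}
```

Feeding it the 12 bytes from the capture gives version 2, marker 0 and payload type 96, matching the bit-by-bit reading above.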

(2) Single NAL unit packet

A NALU shorter than the MTU is generally sent in single NAL unit mode.

In other words, when a single packet is sent, the actual packet is:
12 bytes RTP header + 1 byte NALU header (F, NRI, type) + the payload that follows

The NALU header consists of one byte with the following fields:
F: forbidden_zero_bit, 1 bit. It is 1 if there is a syntax violation. When the network detects a bit error in the unit, it can set this bit to 1 so that the receiver discards the unit.

NRI: nal_ref_idc, 2 bits, indicating the importance of the NALU. The larger the value, the more important the NALU. No specific meaning is defined for the individual values greater than 0.

Type: 5 bits, indicating the NALU type. The values that appear in this article are 1 (non-IDR slice, e.g. a P frame), 5 (IDR slice), 6 (SEI), 7 (SPS) and 8 (PPS).
Note: NALUs with type values 7 and 8 are the sequence parameter set (SPS) and the picture parameter set (PPS), as mentioned above. A parameter set is a group of rarely changing data that provides decoding information for a large number of VCL NALUs. The sequence parameter set applies to a series of consecutive coded pictures, and the picture parameter set applies to one or more individual pictures within a coded video sequence. If the decoder does not receive these two parameter sets correctly, the other NALUs cannot be decoded. They are therefore generally sent before any other NALUs, either over a separate channel or a more reliable transport (such as TCP), or they can be transmitted repeatedly.
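
The three header fields can be separated with the same masks and shifts the source code uses (0x80, 0x60, 0x1f). The sketch below is illustrative; `NaluHeaderFields`, `nalu_header_split`, `nalu_header_join` and `nalu_is_parameter_set` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* The three fields of the one-byte NALU header. */
typedef struct {
    uint8_t f;     /* forbidden_zero_bit, 1 bit */
    uint8_t nri;   /* nal_ref_idc, 2 bits       */
    uint8_t type;  /* NALU type, 5 bits         */
} NaluHeaderFields;

/* Splits a NALU header byte into F, NRI and Type. */
static NaluHeaderFields nalu_header_split(uint8_t b)
{
    NaluHeaderFields h;
    h.f    = (uint8_t)((b & 0x80) >> 7);
    h.nri  = (uint8_t)((b & 0x60) >> 5);
    h.type = (uint8_t)(b & 0x1f);
    return h;
}

/* Packs F, NRI and Type back into one byte. */
static uint8_t nalu_header_join(NaluHeaderFields h)
{
    assert(h.f <= 1 && h.nri <= 3 && h.type <= 31);
    return (uint8_t)((h.f << 7) | (h.nri << 5) | h.type);
}

/* Returns 1 for SPS (type 7) or PPS (type 8), otherwise 0. */
static int nalu_is_parameter_set(uint8_t b)
{
    uint8_t type = (uint8_t)(b & 0x1f);
    return type == 7 || type == 8;
}
```

For the header byte 0x67 this gives F=0, NRI=3 and Type=7, i.e. an SPS with the highest importance level.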

(3) RTP fragmented transmission

When the length of a NALU exceeds the MTU, the NALU must be split into fragments before packetization. These fragments are called Fragmentation Units (FUs).

   0                   1                   2                   3
   0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  | FU indicator  |   FU header   |                               |
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+                               |
  |                                                               |
  |                         FU payload                            |
  |                                                               |
  |                               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  |                               :...OPTIONAL RTP padding        |
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

  Figure 14.  RTP payload format for FU-A

The FU indicator octet has the following format:

  +---------------+
  |0|1|2|3|4|5|6|7|
  +-+-+-+-+-+-+-+-+
  |F|NRI|  Type   |
  +---------------+

Type values 28 and 29 in the FU indicator octet identify FU-A and FU-B respectively. The NRI field must be set according to the NRI field of the fragmented NAL unit. (The Type in the FU indicator is the RTP packet type, which is different from the Type in the FU header.) See the following table:

Type   Packet     Type name

0      undefined  -
1-23   NAL unit   Single NAL unit packet per H.264
24     STAP-A     Single-time aggregation packet
25     STAP-B     Single-time aggregation packet
26     MTAP16     Multi-time aggregation packet
27     MTAP24     Multi-time aggregation packet
28     FU-A       Fragmentation unit
29     FU-B       Fragmentation unit
30-31  undefined  -
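
A receiver can apply the table above to the first byte of the RTP payload to decide how to handle the packet. The following is a minimal sketch; the enum `H264PacketKind` and the function `h264_packet_kind` are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Packet kinds from the type table above. */
typedef enum {
    PKT_UNDEFINED,
    PKT_SINGLE_NAL,
    PKT_STAP_A,
    PKT_STAP_B,
    PKT_MTAP16,
    PKT_MTAP24,
    PKT_FU_A,
    PKT_FU_B
} H264PacketKind;

/* Classifies an RTP payload by the low 5 bits of its first byte. */
static H264PacketKind h264_packet_kind(const uint8_t *payload, size_t len)
{
    assert(payload != NULL);
    if (len == 0)
        return PKT_UNDEFINED;

    unsigned type = payload[0] & 0x1f;
    if (type >= 1 && type <= 23)
        return PKT_SINGLE_NAL;

    switch (type) {
    case 24: return PKT_STAP_A;
    case 25: return PKT_STAP_B;
    case 26: return PKT_MTAP16;
    case 27: return PKT_MTAP24;
    case 28: return PKT_FU_A;
    case 29: return PKT_FU_B;
    default: return PKT_UNDEFINED;   /* 0, 30 and 31 */
    }
}
```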

The FU header has the following format:

  +---------------+
  |0|1|2|3|4|5|6|7|
  +-+-+-+-+-+-+-+-+
  |S|E|R|  Type   |
  +---------------+

S: 1 bit
When set to 1, the start bit indicates the start of a fragmented NAL unit. When the FU payload that follows is not the start of the fragmented NAL unit payload, the start bit is set to 0.

E: 1 bit
When set to 1, the end bit indicates the end of a fragmented NAL unit, that is, the last byte of the payload is also the last byte of the fragmented NAL unit. When the FU payload that follows is not the last fragment of the fragmented NAL unit, the end bit is set to 0.

R: 1 bit
The reserved bit must be set to 0, and the receiver must ignore it.

Type: 5 bits
This Type is the Type from the NALU header, with a value from 1 to 23, as defined for NAL unit payload types.
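
On the receiving side, the original NALU header byte can be rebuilt from the two FU bytes: F and NRI come from the FU indicator, and Type comes from the FU header. The sketch below illustrates this under the assumption that the packet has already been identified as FU-A; the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Rebuilds the original NALU header from an FU-A indicator/header pair:
 * F and NRI from the indicator, Type from the FU header. */
static uint8_t fu_a_rebuild_nalu_header(uint8_t fu_indicator, uint8_t fu_header)
{
    assert((fu_indicator & 0x1f) == 28);   /* caller must pass an FU-A packet */
    return (uint8_t)((fu_indicator & 0xe0) | (fu_header & 0x1f));
}

/* Returns 1 if the S (start) bit is set. */
static int fu_is_start(uint8_t fu_header)
{
    return (fu_header >> 7) & 1;
}

/* Returns 1 if the E (end) bit is set. */
static int fu_is_end(uint8_t fu_header)
{
    return (fu_header >> 6) & 1;
}
```

A depacketizer would typically write the rebuilt header once, when `fu_is_start` is true, then append the payload of every fragment until `fu_is_end` is true.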

In other words, the packet structure for RTP fragmentation is:
12 bytes of RTP header + 1 byte of FU indicator (F, NRI, type) + 1 byte of FU header + the payload that follows
This is explained below using the packets captured in this example (screenshots not reproduced here).
In this example the RTP data is always sent as fragments; no single NAL unit packet is sent.
Look at the first fragment:

The 12-byte RTP header has already been parsed above.
7c is the FU indicator, binary 0111 1100: F is 0 (no syntax violation), NRI is 11 (important, must not be discarded), and type is 11100, that is 28, the FU-A fragment type. FU-B differs from FU-A in that it additionally carries a 16-bit decoding order number (DON) and is only used in interleaved packetization mode.
87 is the FU header, binary 1000 0111: S is 1 (start, the first fragment), E is 0, R is 0, and type is 00111, that is 7, the sequence parameter set (SPS).
What follows is the multi-byte payload, which is the original H.264 data split across multiple packets.
Note that the FU header of the second packet is 07, binary 0000 0111: S is 0, E is 0, R is 0, and type is 00111, that is 7, the sequence parameter set (SPS).
Now let's look at the last packet.

You can see that M_PT has changed from 60 to e0, that is, 0110 0000 becomes 1110 0000: the highest bit M becomes 1, marking the end of the frame and indicating that this is the last packet.
The FU header changed from 07 to 47, binary 0000 0111 to 0100 0111: S is 0, E is 1 (end, the last fragment), R is 0, and type is 00111, that is 7, the sequence parameter set (SPS).
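
The FU indicator and FU header values seen in this capture can be reproduced from the NALU header byte 0x67. The sketch below is an illustration with hypothetical function names; it uses plain masks instead of the bitfield structs in the source, and it asserts that S and E are never set together, which the FU-A format does not allow.

```c
#include <assert.h>
#include <stdint.h>

#define FU_A_TYPE 28

/* FU indicator: F and NRI copied from the NALU header, Type = 28 (FU-A). */
static uint8_t fu_indicator_from(uint8_t nalu_hdr)
{
    return (uint8_t)((nalu_hdr & 0xe0) | FU_A_TYPE);
}

/* FU header: S on the first fragment, E on the last, R = 0,
 * Type copied from the NALU header. */
static uint8_t fu_header_from(uint8_t nalu_hdr, int is_first, int is_last)
{
    assert(!(is_first && is_last));   /* S and E must not both be 1 */
    return (uint8_t)((is_first ? 0x80 : 0x00) |
                     (is_last  ? 0x40 : 0x00) |
                     (nalu_hdr & 0x1f));
}
```

With `nalu_hdr` = 0x67, the indicator is 0x7c and the header is 0x87 for the first fragment, 0x07 for a middle fragment, and 0x47 for the last one.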

(4) Correlation between multiple frames

Let's look at the packets of several consecutive frames (the screenshots for frames 1 through 6 are not reproduced here).

00 00 00 01 67:  0x67&0x1f = 0x07: SPS
00 00 00 01 68:  0x68&0x1f = 0x08: PPS
00 00 00 01 06:  0x06&0x1f = 0x06: SEI information
00 00 00 01 65:  0x65&0x1f = 0x05: IDR Slice
00 00 00 01 61:  0x61&0x1f = 0x01: P frame

As can be seen above, only the first frame starts with type 67 (SPS). The first frame also contains 68 (PPS), 06 (SEI information) and 65 (IDR slice), while subsequent frames contain only 61 (P frame) data. Because the first frame contains the SPS, PPS, SEI and IDR slice, it carries far more data than the subsequent frames. This also reflects how compression works: subsequent frames are coded as differences relative to the IDR frame, which reduces the amount of data and thus achieves compression. Note that SendNalu264 reads only the header byte of the first NALU in the buffer, which is why every fragment of the first frame carries type 7 in its FU header.

(5) Detailed analysis of the packetization source code

With the protocol background above, the code is analyzed in detail below; most of the debug printing has been removed.

static int SendNalu264(HndRtp hRtp, char *pNalBuf, int s32NalBufSize)	
{
    char *pNaluPayload;
    char *pSendBuf;
    int s32Bytes = 0;
    int s32Ret = 0;
    struct timeval stTimeval;
    char *pNaluCurr;
    int s32NaluRemain;
    unsigned char u8NaluBytes;
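    pSendBuf = (char *)calloc(MAX_RTP_PKT_LENGTH + 100, sizeof(char));
    //Allocate the buffer for one packet. MAX_RTP_PKT_LENGTH is 1400. A single-packet send is at most
    //1413 bytes (1400 payload + 13), and a fragmented send is at most 1414 bytes (1400 payload + 14).
    //1500 bytes are allocated here, which leaves a few dozen bytes of headroom.
    //The 1500 here corresponds to the MTU (Maximum Transmission Unit), which is introduced below.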
    if(NULL == pSendBuf)
    {
        s32Ret = -1;
        goto cleanup;
    }

    hRtp->pRtpFixedHdr = (StRtpFixedHdr *)pSendBuf; //point at the allocated buffer; the header fields are written into it
    hRtp->pRtpFixedHdr->u7Payload   = H264;  //96, video
    hRtp->pRtpFixedHdr->u2Version   = 2;  //version 2
    hRtp->pRtpFixedHdr->u1Marker    = 0;  //M=0; set to 1 later for the last packet
    hRtp->pRtpFixedHdr->u32SSrc     = hRtp->u32SSrc;   //hRtp->u32SSrc was already set to the local IP address in the RtpCreate init function
    //compute the timestamp
    hRtp->pRtpFixedHdr->u32TimeStamp = htonl(hRtp->u32TimeStampCurr * (90000 / 1000));
    printf("timestamp:%llu\n",hRtp->u32TimeStampCurr);
    if(gettimeofday(&stTimeval, NULL) == -1)
    {
        printf("Failed to get os time\n");
        s32Ret = -1;
        goto cleanup;
    }

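    //save the first byte of the NALU
    u8NaluBytes = *(pNalBuf+4);  //take the 5th byte of the raw H.264 data, i.e. the NALU header (F, NRI, type)
    //point at the NALU data that has not been sent yet
    pNaluCurr = pNalBuf + 5;  //points at the payload
    //number of NALU bytes remaining
    s32NaluRemain = s32NalBufSize - 5;
	if ((u8NaluBytes&0x1f)==0x7&&0)  //this branch is disabled (&&0) and never runs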
	{
		printf("(u8NaluBytes&0x1f)==0x7&&0\r\n");
		pNaluPayload = (pSendBuf + 12);
		if(sps_len>0)
		{	        
	        memcpy(pNaluPayload, sps_tmp, sps_len);
	        if(sendto(hRtp->s32Sock, pSendBuf, sps_len+12, 0, (struct sockaddr *)&hRtp->stServAddr, sizeof(hRtp->stServAddr)) < 0)
	        {
	            s32Ret = -1;
	            goto cleanup;
	        }
		}
		if(pps_len>0)
		{	        
	        memcpy(pNaluPayload, pps_tmp, pps_len);
	        if(sendto(hRtp->s32Sock, pSendBuf, pps_len+12, 0, (struct sockaddr *)&hRtp->stServAddr, sizeof(hRtp->stServAddr)) < 0)
	        {
	            s32Ret = -1;
	            goto cleanup;
	        }
		}
	}
    //NALU is no longer than the maximum packet length: send it directly
    if(s32NaluRemain <= MAX_RTP_PKT_LENGTH)
    {
        //12-byte RTP header + 1-byte NALU header (F, NRI, type) + payload
        
        hRtp->pRtpFixedHdr->u1Marker    = 1;  //single packet, so nothing follows it: M=1
        hRtp->pRtpFixedHdr->u16SeqNum   = htons(hRtp->u16SeqNum ++);  //sequence number increases by 1
        hRtp->pNaluHdr                  = (StNaluHdr *)(pSendBuf + 12);  //address of the 13th byte of the RTP packet
        hRtp->pNaluHdr->u1F             = (u8NaluBytes & 0x80) >> 7;  //F
        hRtp->pNaluHdr->u2Nri           = (u8NaluBytes & 0x60) >> 5;  //NRI
        hRtp->pNaluHdr->u5Type          = u8NaluBytes & 0x1f;   //type

        pNaluPayload = (pSendBuf + 13);
        memcpy(pNaluPayload, pNaluCurr, s32NaluRemain);  //copy the payload that follows the NALU header
        s32Bytes = s32NaluRemain + 13;  //12-byte RTP header + 1-byte NALU header (F, NRI, type) + payload
        //send the RTP packet in pSendBuf over UDP
        if(sendto(hRtp->s32Sock, pSendBuf, s32Bytes, 0, (struct sockaddr *)&hRtp->stServAddr, sizeof(hRtp->stServAddr)) < 0)
        {
            s32Ret = -1;
            goto cleanup;
        }
#ifdef SAVE_NALU
        fwrite(pSendBuf, s32Bytes, 1, hRtp->pNaluFile);
#endif
    }
    //NALU is longer than the maximum packet length: send it in fragments
    else
    {
        //12-byte RTP header + 1-byte FU indicator (F, NRI, type) + 1-byte FU header + payload

        //locate the FU indicator
        hRtp->pFuInd            = (StFuIndicator *)(pSendBuf + 12); //address of the 13th byte of the RTP packet: the FU indicator
        hRtp->pFuInd->u1F       = (u8NaluBytes & 0x80) >> 7;//F
        hRtp->pFuInd->u2Nri     = (u8NaluBytes & 0x60) >> 5;//NRI
        hRtp->pFuInd->u5Type    = 28;     //FU-A      Fragmentation unit  

        //locate the FU header
        hRtp->pFuHdr            = (StFuHdr *)(pSendBuf + 13);  //address of the 14th byte of the RTP packet: the FU header
        hRtp->pFuHdr->u1R       = 0;  //R=0; S and E are set below
        hRtp->pFuHdr->u5Type    = u8NaluBytes & 0x1f;  //same as the type in the NALU header

        //locate the payload
        pNaluPayload = (pSendBuf + 14);  //payload

        //keep sending fragments while NALU data remains
        while(s32NaluRemain > 0)
        {
            /*set up the fixed header*/
            
            hRtp->pRtpFixedHdr->u16SeqNum = htons(hRtp->u16SeqNum ++);//sequence number increases by 1 for every packet
            hRtp->pRtpFixedHdr->u1Marker = (s32NaluRemain <= MAX_RTP_PKT_LENGTH) ? 1 : 0;  //if the remaining bytes fit in one packet this is the last packet: M=1, otherwise M=0

            /*set up the FU header*/
            //E is set to 1 for the last fragment
            hRtp->pFuHdr->u1E       = (s32NaluRemain <= MAX_RTP_PKT_LENGTH) ? 1 : 0;//if the remaining bytes fit in one packet this is the last fragment: E=1, otherwise E=0
			//S is set to 1 for the first fragment
            hRtp->pFuHdr->u1S       = (s32NaluRemain == (s32NalBufSize - 5)) ? 1 : 0;  //s32NaluRemain equal to (s32NalBufSize - 5) means this is the first fragment: S=1
            s32Bytes = (s32NaluRemain < MAX_RTP_PKT_LENGTH) ? s32NaluRemain : MAX_RTP_PKT_LENGTH; //bytes in this fragment: the remaining bytes if they fit in one packet (last fragment), otherwise the maximum packet length


            memcpy(pNaluPayload, pNaluCurr, s32Bytes);

            //send this fragment
            s32Bytes = s32Bytes + 14;  //payload plus the 14 bytes in front of it
            if(sendto(hRtp->s32Sock, pSendBuf, s32Bytes, 0, (struct sockaddr *)&hRtp->stServAddr, sizeof(hRtp->stServAddr)) < 0)
            {
                s32Ret = -1;
                goto cleanup;
            }
#ifdef SAVE_NALU
            fwrite(pSendBuf, s32Bytes, 1, hRtp->pNaluFile);
#endif

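            //advance to the next fragment
            pNaluCurr += MAX_RTP_PKT_LENGTH;  //one packet sent: move the pointer forward by one packet's worth of bytes
            //update the remaining NALU length
            s32NaluRemain -= MAX_RTP_PKT_LENGTH;//one packet sent: subtract one packet's worth of bytes from the remainder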
        }
    }

cleanup:
    if(pSendBuf)
    {
        free((void *)pSendBuf);
    }
	printf("\n");
	fflush(stdout);

    return s32Ret;
}
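
The fragmentation loop above can be isolated from the socket code to make its arithmetic easier to check. The following is a sketch, not the SDK's implementation: `FuFragment` and `fu_plan` are hypothetical names, and `payload_len` stands for the NALU bytes after the start code and header byte (s32NalBufSize - 5 in the listing).

```c
#include <assert.h>
#include <stddef.h>

#define MAX_RTP_PKT_LENGTH 1400u

/* One planned FU-A fragment of a NALU payload. */
typedef struct {
    size_t offset;  /* offset into the NALU payload */
    size_t len;     /* payload bytes in this fragment */
    int    start;   /* S bit */
    int    end;     /* E bit, also used for the RTP marker bit */
} FuFragment;

/* Plans FU-A fragments for payload_len bytes.
 * Returns the fragment count, 0 if the payload fits in a single packet
 * (no fragmentation needed), or -1 if frags has fewer than the needed slots. */
static int fu_plan(size_t payload_len, FuFragment *frags, size_t cap)
{
    assert(frags != NULL);
    if (payload_len <= MAX_RTP_PKT_LENGTH)
        return 0;

    size_t n = 0;
    size_t off = 0;
    while (off < payload_len) {
        size_t remain = payload_len - off;
        size_t chunk = remain < MAX_RTP_PKT_LENGTH ? remain : MAX_RTP_PKT_LENGTH;
        if (n == cap)
            return -1;
        frags[n].offset = off;
        frags[n].len    = chunk;
        frags[n].start  = (off == 0);
        frags[n].end    = (remain <= MAX_RTP_PKT_LENGTH);
        n++;
        off += chunk;
    }
    return (int)n;
}
```

For a 3000-byte payload this yields three fragments of 1400, 1400 and 200 bytes, with S set only on the first and E set only on the last.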

The fragment size above is related to the MTU, which is introduced below.
MTU (Maximum Transmission Unit) is the largest packet size, in bytes, that can pass through a given layer of a communication protocol. The MTU is usually tied to the communication interface (network interface card, serial port, etc.).
The following explanation of MTU is taken from Baidu:
Because the header and trailer of a protocol data unit have a fixed length, a larger MTU means a protocol data unit carries more payload and communication is more efficient; fewer packets are needed to transfer the same amount of user data.
A larger MTU is not always better, however: the larger the MTU, the longer it takes to transmit one packet, and the higher the probability of a bit error within the packet.
A suitable MTU therefore has to be chosen by weighing communication efficiency against transmission delay.
Take IPv4 packets over Ethernet as an example. The length given by the MTU includes the IP header. If a message handed down by a protocol above the IP layer exceeds the MTU, the sender's IP layer fragments it and the receiver's IP layer reassembles the fragments it receives.
Here is a concrete example of IP fragmentation. The Ethernet MTU is 1500 bytes. Suppose the sender's upper layer hands a 3008-byte message to the IP layer. With a 20-byte IP header added, the IP packet would be 3028 bytes long. Because 3028 > 1500, the message is fragmented.
Note: only the upper-layer data is split during fragmentation; the original IP header is not part of it. The data to be fragmented is therefore 3008 bytes, not 3028. This is an easy mistake to make.
The fragmentation process is as follows:

First calculate the maximum IP payload length per IP packet = MTU - IP header length = 1500 - 20 = 1480 bytes.
Then split the 3008 bytes into 3 fragments of at most 1480 bytes: 3008 = 1480 + 1480 + 48.
Finally, the sender adds an IP header to each of the 3 fragments to form 3 IP packets before sending. Their lengths are 1500 bytes, 1500 bytes and 68 bytes.
In this example, the IP packets formed from the first and second fragments are each exactly as long as the MTU, 1500 bytes.
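
The arithmetic of this example can be reproduced with a short function. This is a sketch for illustration only; `ip_fragment_sizes` is a hypothetical name, and the per-fragment payload is rounded down to a multiple of 8 bytes because IP fragment offsets are expressed in 8-byte units (1480 is already a multiple of 8, so the example is unaffected).

```c
#include <assert.h>
#include <stddef.h>

/* Computes the sizes of the IP packets produced when payload_len bytes of
 * upper-layer data are fragmented for the given MTU and IP header length.
 * Returns the number of packets, or -1 if sizes has fewer than that many slots. */
static int ip_fragment_sizes(size_t payload_len, size_t mtu, size_t ip_hdr_len,
                             size_t *sizes, size_t cap)
{
    assert(sizes != NULL);
    assert(mtu > ip_hdr_len);

    size_t max_payload = mtu - ip_hdr_len;
    max_payload -= max_payload % 8;      /* fragment offsets use 8-byte units */
    assert(max_payload >= 8);

    size_t n = 0;
    while (payload_len > 0) {
        size_t chunk = payload_len < max_payload ? payload_len : max_payload;
        if (n == cap)
            return -1;
        sizes[n++] = chunk + ip_hdr_len; /* each fragment gets its own IP header */
        payload_len -= chunk;
    }
    return (int)n;
}
```

Called with 3008 bytes of data, an MTU of 1500 and a 20-byte header, it reports three packets of 1500, 1500 and 68 bytes, matching the steps above.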

With that, the analysis of the protocol format and the source code for encapsulating H.264 data into RTP packets is finally complete. If you find any mistakes, please point them out.
That was exhausting, and writing all this up was not easy. I will take a break first and continue with the other parts later. Feel free to follow!

Reference:
http://www.iosxxx.com/blog/2017-08-09-Understanding H264 structure from scratch.html
https://blog.csdn.net/chen495810242/article/details/39207305
https://www.cnblogs.com/lidabo/p/4582040.html