RTP in WebRTC: streaming media protocol analysis

1. SPS and PPS packaging format (AVC sequence header):

void yang_getConfig_Meta_H264(YangSample* psps, YangSample* ppps, uint8_t *configBuf, int32_t *p_configLen){
	// Layout: FLV video tag header (1) + AVC packet type (1) + composition time (3)
	//         + AVCDecoderConfigurationRecord fixed header (5)
	//         + numOfSPS (1) + SPS length (2) + SPS + numOfPPS (1) + PPS length (2) + PPS
	int32_t spsLen=psps->nb;
	int32_t ppsLen=ppps->nb;
	uint8_t* sps=(uint8_t*)psps->bytes;
	uint8_t* pps=(uint8_t*)ppps->bytes;
	configBuf[0] = 0x17;		// keyframe (1) + codec id AVC (7)
	configBuf[1] = 0x00;		// AVC packet type: sequence header
	configBuf[2] = 0x00;		// composition time, 3 bytes
	configBuf[3] = 0x00;
	configBuf[4] = 0x00;
	configBuf[5] = 0x01;		// configurationVersion
	configBuf[6] = sps[1];		// AVCProfileIndication, e.g. 0x42 (Baseline)
	configBuf[7] = sps[2];		// profile_compatibility, e.g. 0xC0
	configBuf[8] = sps[3];		// AVCLevelIndication, e.g. 0x29
	configBuf[9] = 0xff;		// 6 reserved bits + lengthSizeMinusOne = 3 (4-byte NALU lengths)
	configBuf[10] = 0xe1;		// 3 reserved bits + numOfSequenceParameterSets = 1
	uint8_t * szTmp = configBuf + 11;

	yang_put_be16((char*) szTmp, (uint16_t) spsLen);	// SPS length, big endian
	szTmp+=2;
	memcpy(szTmp, sps, spsLen);		// SPS data
	szTmp += spsLen;
	*szTmp = 0x01;				// numOfPictureParameterSets = 1
	szTmp += 1;

	yang_put_be16((char*) szTmp, (uint16_t) ppsLen);	// PPS length, big endian
	szTmp+=2;
	memcpy(szTmp, pps, ppsLen);		// PPS data

	szTmp += ppsLen;
	*p_configLen = szTmp - configBuf;
	szTmp = NULL;
}

 For I and P frames, the four-byte start code is replaced with a 4-byte big-endian length field, whose value is the NALU size, i.e. the frame size minus 4.
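
A minimal sketch of this conversion, assuming the frame begins with a 4-byte 00 00 00 01 start code (the helper name is illustrative, not metaRTC's):

#include <stdint.h>

/* Illustrative in-place conversion: overwrite the 4-byte Annex-B start code
 * with the big-endian NALU length (frame size minus 4). */
static int32_t startcode_to_length(uint8_t *frame, int32_t nb) {
	if (nb <= 4 || frame[0] != 0x00 || frame[1] != 0x00
			|| frame[2] != 0x00 || frame[3] != 0x01) {
		return -1;                 /* not a 4-byte start code */
	}
	uint32_t naluLen = (uint32_t)(nb - 4);
	frame[0] = (uint8_t)(naluLen >> 24);
	frame[1] = (uint8_t)(naluLen >> 16);
	frame[2] = (uint8_t)(naluLen >> 8);
	frame[3] = (uint8_t)(naluLen & 0xff);
	return 0;
}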

2. RTP data encapsulation

Packetization overview:

 2.1 Structure of RTP header:

 

  •       V: RTP version number, 2 bits; the current protocol version is 2.
  •       P: padding flag, 1 bit; if P=1, the packet ends with one or more extra padding octets that are not part of the payload.
  •       X: extension flag, 1 bit; if X=1, an extension header follows the fixed RTP header.
  •       CC: CSRC counter, 4 bits, indicating the number of CSRC identifiers.
  •       M: marker, 1 bit; its interpretation is defined by the profile. Its purpose is to allow significant events to be marked in the packet stream. Different payloads give it different meanings: for video it typically marks the end of a frame, for audio the beginning of a talkspurt.
  •       Payload type (PT): 7 bits. Note: the RFC statically assigns payload types to some early formats, but H.264 has no static assignment, so a dynamic payload type (96 or above) is used. A value of 96 or higher therefore does not indicate a specific format by itself; the actual format must be negotiated via SDP or another protocol.
  •       Sequence number (SN): 16 bits, identifying the sequence number of the RTP packet sent by the sender. The sequence number increases by 1 for each packet sent, and its initial value is randomly generated. It can be used to detect packet loss and to reorder packets.
  •       Timestamp: 32 bits; for H.264 a 90 kHz clock frequency must be used.
  •       Synchronization source (SSRC) identifier: 32 bits, identifying the synchronization source. The identifier is randomly generated, and two synchronization sources participating in the same session must not have the same SSRC.
  •       Contributing source (CSRC) identifiers: each CSRC identifier occupies 32 bits, and there can be 0 to 15 of them. Each CSRC identifies a contributing source for the payload of this RTP packet. A minimal header-serialization sketch follows the list.
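
The fixed RTP header defined by RFC 3550 is 12 bytes. The sketch below serializes it by hand to make the field layout concrete; the helper name and signature are illustrative only and are not metaRTC's yang_encode_rtpHeader.

#include <stdint.h>

/* Illustrative serialization of the 12-byte fixed RTP header (RFC 3550).
 * This helper is hypothetical; metaRTC uses yang_encode_rtpHeader instead. */
static void write_rtp_fixed_header(uint8_t *p, uint8_t marker, uint8_t payloadType,
		uint16_t seq, uint32_t timestamp, uint32_t ssrc) {
	p[0] = (uint8_t)(2 << 6);              /* V=2, P=0, X=0, CC=0 */
	p[1] = (uint8_t)(((marker & 0x01) << 7) | (payloadType & 0x7f)); /* M + PT */
	p[2] = (uint8_t)(seq >> 8);            /* sequence number, big endian */
	p[3] = (uint8_t)(seq & 0xff);
	p[4] = (uint8_t)(timestamp >> 24);     /* timestamp, 90 kHz clock for H.264 */
	p[5] = (uint8_t)(timestamp >> 16);
	p[6] = (uint8_t)(timestamp >> 8);
	p[7] = (uint8_t)(timestamp & 0xff);
	p[8] = (uint8_t)(ssrc >> 24);          /* SSRC */
	p[9] = (uint8_t)(ssrc >> 16);
	p[10] = (uint8_t)(ssrc >> 8);
	p[11] = (uint8_t)(ssrc & 0xff);
}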

   2.2 The NALU header / FU indicator format, occupying one byte:

     +---------------+
      |0|1|2|3|4|5|6|7|
      +-+-+-+-+-+-+-+-+
      |F|NRI|  Type   |
      +---------------+
  • F: forbidden bit; the specification requires this bit to be 0.
  • NRI: values from 00 to 11 indicate the importance of this NALU; for example, a NALU with NRI = 00 may be discarded by the decoder without affecting image playback.
  • Type: the type of this NAL unit; common values are sketched below.
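
The full type table is not reproduced here; the values below are the ones most relevant to this article, taken from the H.264 specification and RFC 6184 (the enum names are illustrative, not metaRTC's):

/* Common H.264 NAL unit types plus the RTP payload types added by RFC 6184.
 * Enum names are illustrative; the numeric values come from the specifications. */
enum {
	NaluTypeNonIdr = 1,   /* coded slice of a non-IDR picture (P/B frame) */
	NaluTypeIdr    = 5,   /* coded slice of an IDR picture (I frame) */
	NaluTypeSei    = 6,   /* supplemental enhancement information */
	NaluTypeSps    = 7,   /* sequence parameter set */
	NaluTypePps    = 8,   /* picture parameter set */
	NaluTypeStapA  = 24,  /* RFC 6184: single-time aggregation packet A */
	NaluTypeFuA    = 28,  /* RFC 6184: fragmentation unit A */
	NaluTypeFuB    = 29   /* RFC 6184: fragmentation unit B */
};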

  2.3 Single NAL unit mode

  For packets whose NALU length is smaller than the MTU size, single NAL unit mode is generally used.
  An original H.264 NALU usually consists of three parts: [Start Code] [NALU Header] [NALU Payload]. The start code marks the beginning of the NALU and must be "00 00 00 01" or "00 00 01"; the NALU header is one byte, followed by the NALU payload.
  When packetizing, remove the "00 00 01" or "00 00 00 01" start code and place the remaining NALU data directly into the RTP packet.

       0                   1                   2                   3
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |F|NRI|  type   |                                               |
      +-+-+-+-+-+-+-+-+                                               |
      |                                                               |
      |               Bytes 2..n of a Single NAL unit                 |
      |                                                               |
      |                               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                               :...OPTIONAL RTP padding        |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+


Example:
If there is an H.264 NALU like this:

[00 00 00 01 67 42 A0 1E 23 56 0E 2F... ]

This is a sequence parameter set NAL unit. [00 00 00 01] is the start code of four bytes, 67 is the NALU header, and the data starting at 42 is the NALU content.

Encapsulated into an RTP packet, it becomes:

[ RTP Header ] [ 67 42 A0 1E 23 56 0E 2F ... ]

That is, the 4-byte start code is simply removed; the NALU header byte (67) itself serves as the payload header (F, NRI, Type).
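
As a sketch (the helper is assumed for illustration, not metaRTC's yang_push_h264_package_single_nalu2), single NAL unit packetization reduces to stripping the start code and copying the bare NALU into the RTP payload:

#include <stdint.h>
#include <string.h>

/* Illustrative single NAL unit packetization: strip the Annex-B start code
 * and copy the bare NALU into the RTP payload. Returns the payload length. */
static int32_t pack_single_nalu(const uint8_t *frame, int32_t nb,
		uint8_t *rtpPayload, int32_t cap) {
	int32_t skip = 0;
	if (nb > 4 && frame[0] == 0 && frame[1] == 0 && frame[2] == 0 && frame[3] == 1) skip = 4;
	else if (nb > 3 && frame[0] == 0 && frame[1] == 0 && frame[2] == 1) skip = 3;
	if (nb - skip > cap) return -1;              /* must fit in one RTP packet (<= MTU) */
	memcpy(rtpPayload, frame + skip, (size_t)(nb - skip)); /* first byte = NALU header (F|NRI|Type) */
	return nb - skip;
}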

2.4 Combined packet mode

  Secondly, when the NALUs are very small, several NALU units can be encapsulated in one RTP packet (STAP-A, type 24). For example, H.264 SPS and PPS are typically aggregated this way, with each NALU preceded by a 2-byte (16-bit) big-endian length field.
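
A sketch of STAP-A aggregation of SPS and PPS under these rules (the helper is illustrative, not metaRTC's yang_encode_h264_stap):

#include <stdint.h>
#include <string.h>

/* Illustrative STAP-A payload: one STAP-A header byte followed by
 * (16-bit length, NALU) pairs for SPS and PPS. Returns the payload length. */
static int32_t pack_stap_a_sps_pps(const uint8_t *sps, uint16_t spsLen,
		const uint8_t *pps, uint16_t ppsLen, uint8_t *out, int32_t cap) {
	int32_t need = 1 + 2 + spsLen + 2 + ppsLen;
	if (need > cap) return -1;
	uint8_t *p = out;
	*p++ = (uint8_t)((sps[0] & 0x60) | 24);        /* F=0, NRI from SPS header, Type=STAP-A(24) */
	*p++ = (uint8_t)(spsLen >> 8); *p++ = (uint8_t)(spsLen & 0xff); /* SPS length */
	memcpy(p, sps, spsLen); p += spsLen;
	*p++ = (uint8_t)(ppsLen >> 8); *p++ = (uint8_t)(ppsLen & 0xff); /* PPS length */
	memcpy(p, pps, ppsLen); p += ppsLen;
	return (int32_t)(p - out);
}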

2.5 Fragmentation unit:

  When the length of the NALU exceeds the MTU, the NALU must be fragmented across several RTP packets. The fragments are called Fragmentation Units (FUs).
  
      Figure 14. RTP payload format for FU-A

   The FU indicator octet has the same format as the NALU header described in section 2.2:

      +---------------+
      |0|1|2|3|4|5|6|7|
      +-+-+-+-+-+-+-+-+
      |F|NRI|  Type   |
      +---------------+

  • F: A value of 0 indicates that the NAL unit type octets and payload shall not contain bit errors or other syntax violations. A value of 1 indicates that the NAL unit type octets and payload may contain bit errors or other syntax violations.
  • NRI: unchanged from H.264 specification
  • Type: FU-A(28) or FU-B(29)

   The FU header has the following format:

      +---------------+
      |0|1|2|3|4|5|6|7|
      +-+-+-+-+-+-+-+-+
      |S|E|R|  Type   |
      +---------------+

  • S: start bit; set to 1 when the RTP payload carries the first fragment of the NAL unit, otherwise 0.
  • E: end bit; set to 1 when the RTP payload carries the last fragment of the NAL unit, otherwise 0.
  • R: reserved bit, must be 0.
  • Type: the NAL unit type of the fragmented NALU, i.e. the Type field from the original H.264 NALU header. A short worked example follows.
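
As a worked example of the bit layout above, take an IDR NALU whose header byte is 0x65 (F=0, NRI=3, Type=5); the FU indicator and FU headers then become:

#include <stdint.h>

/* Worked example: FU indicator and FU headers for an IDR NALU whose header
 * byte is 0x65 (F=0, NRI=11b, Type=5). */
static void fu_a_example(void) {
	uint8_t naluHeader    = 0x65;
	uint8_t fuIndicator   = (uint8_t)((naluHeader & 0xE0) | 28);   /* keep F+NRI, Type=FU-A(28) => 0x7C */
	uint8_t fuHeaderFirst = (uint8_t)(0x80 | (naluHeader & 0x1F)); /* S=1, E=0, R=0, Type=5 => 0x85 */
	uint8_t fuHeaderMid   = (uint8_t)(naluHeader & 0x1F);          /* S=0, E=0             => 0x05 */
	uint8_t fuHeaderLast  = (uint8_t)(0x40 | (naluHeader & 0x1F)); /* S=0, E=1             => 0x45 */
	(void)fuIndicator; (void)fuHeaderFirst; (void)fuHeaderMid; (void)fuHeaderLast;
}

Each fragment's RTP payload carries these two bytes followed by a slice of the NALU data (excluding the original NALU header byte), which is exactly what yang_encode_h264_fua2 below writes.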

 3. metaRTC packetization

      yang_push_h264_video compares the frame size with kRtpMaxPayloadSize: if the frame does not exceed it, single NAL unit mode is used; otherwise the frame is sent with FU-A fragmentation.

int32_t yang_push_h264_video(void *psession, YangPushH264Rtp *rtp,
		YangFrame *videoFrame) {
	int32_t err = Yang_Ok;
	YangRtcSession *session=(YangRtcSession*)psession;
	if (videoFrame->nb <= kRtpMaxPayloadSize) {
		if ((err = yang_push_h264_package_single_nalu2(session, rtp, videoFrame))
				!= Yang_Ok) {
			return yang_error_wrap(err, "package single nalu");
		}
		session->context.stats.sendStats.videoRtpPacketCount++;
	} else {
		if ((err = yang_push_h264_package_fu_a(session, rtp, videoFrame,
				kRtpMaxPayloadSize)) != Yang_Ok) {
			return yang_error_wrap(err, "package fu-a");
		}
	}
	session->context.stats.sendStats.frameCount++;
	return err;
}

 Next, analyze the yang_push_h264_package_fu_a function.

int32_t yang_push_h264_package_fu_a(YangRtcSession *session, YangPushH264Rtp *rtp,
		YangFrame *videoFrame, int32_t fu_payload_size) {
	int32_t err = Yang_Ok;
	int32_t plen = videoFrame->nb;
	uint8_t *pdata = videoFrame->payload;
	char *p = (char*) pdata + 1;
	int32_t nb_left = plen - 1;
	uint8_t header = pdata[0];
	uint8_t nal_type = header & kNalTypeMask;


	// Number of FU-A packets = ceil((plen - 1) / fu_payload_size); the extra
	// parentheses make the "+1 only when there is a remainder" term bind correctly.
	int32_t num_of_packet = (((plen - 1) % fu_payload_size == 0) ? 0 : 1) + (plen - 1) / fu_payload_size;
	for (int32_t i = 0; i < num_of_packet; ++i) {
		int32_t packet_size = yang_min(nb_left, fu_payload_size);

		yang_reset_rtpPacket(&rtp->videoFuaPacket);
		rtp->videoFuaPacket.header.payload_type = YangH264PayloadType;
		rtp->videoFuaPacket.header.ssrc = rtp->videoSsrc;
		rtp->videoFuaPacket.frame_type = YangFrameTypeVideo;
		rtp->videoFuaPacket.header.sequence = rtp->videoSeq++;
		rtp->videoFuaPacket.header.timestamp = videoFrame->pts;
		rtp->videoFuaPacket.header.marker = (i == num_of_packet - 1) ? 1 : 0;

		rtp->videoFuaPacket.payload_type = YangRtspPacketPayloadTypeFUA2;

		memset(&rtp->videoFua2Data, 0, sizeof(YangFua2H264Data));
		rtp->videoFua2Data.nri = (YangAvcNaluType) header;
		rtp->videoFua2Data.nalu_type = (YangAvcNaluType) nal_type;
		rtp->videoFua2Data.start = (i == 0) ? 1 : 0;
		rtp->videoFua2Data.end = (i == (num_of_packet - 1)) ? 1 : 0;

		rtp->videoFua2Data.payload = rtp->videoBuf;
		rtp->videoFua2Data.nb = packet_size;
		memcpy(rtp->videoFua2Data.payload, p, packet_size);

		p += packet_size;
		nb_left -= packet_size;
#if Yang_Using_TWCC
		if(i==0){
			rtp->rtpExtension.twcc.sn=rtp->twccSeq++ ;
			rtp->videoFuaPacket.header.extensions=&rtp->rtpExtension;
			session->context.twcc.insertLocal(&session->context.twcc.session,rtp->rtpExtension.twcc.sn);
		}
#endif
		if ((err = yang_push_h264_encodeVideo(session, rtp, &rtp->videoFuaPacket))
				!= Yang_Ok) {
			return yang_error_wrap(err, "encode packet");
		}
		rtp->videoFuaPacket.header.extensions=NULL;

	}

	return err;
}

 yang_push_h264_encodeVideo encodes the RTP header and then dispatches by payload type: yang_encode_h264_fua2 for FU-A encapsulation, yang_encode_h264_raw for single NAL unit encapsulation, and yang_encode_h264_stap for aggregation packets.

int32_t yang_push_h264_encodeVideo(YangRtcSession *session, YangPushH264Rtp *rtp,
		YangRtpPacket *pkt) {
	int err = 0;

	yang_init_buffer(&rtp->buf, yang_get_rtpBuffer(rtp->videoRtpBuffer),	kRtpPacketSize);

	if ((err = yang_encode_rtpHeader(&rtp->buf, &pkt->header)) != Yang_Ok) {
		return yang_error_wrap(err, "rtp header(%d) encode packet fail",
				pkt->payload_type);
	}

	if (pkt->payload_type == YangRtspPacketPayloadTypeRaw) {
		err = yang_encode_h264_raw(&rtp->buf, &rtp->videoRawData);
	} else if (pkt->payload_type == YangRtspPacketPayloadTypeFUA2) {
		err = yang_encode_h264_fua2(&rtp->buf, &rtp->videoFua2Data);

	} else if (pkt->payload_type == YangRtspPacketPayloadTypeSTAP) {
		err = yang_encode_h264_stap(&rtp->buf, &rtp->stapData);
		yang_reset_h2645_stap(&rtp->stapData);
	}

	if (err != Yang_Ok) {
		return yang_error_wrap(err, "rtp payload(%d) encode packet fail",
				pkt->payload_type);
	}
	if (pkt->header.padding_length > 0) {
		uint8_t padding = pkt->header.padding_length;
		if (!yang_buffer_require(&rtp->buf, padding)) {
			return yang_error_wrap(ERROR_RTC_RTP_MUXER,
					"padding requires %d bytes", padding);
		}
		memset(rtp->buf.head, padding, padding); // every padding byte holds the count, so the last octet carries the padding length as RFC 3550 requires
		yang_buffer_skip(&rtp->buf, padding);
	}
	session->context.stats.on_pub_videoRtp(&session->context.stats.sendStats,pkt,&rtp->buf);
	return yang_send_avpacket(session, pkt, &rtp->buf);
}

 yang_encode_h264_fua2 writes the FU indicator, FU header and FU payload into the RTP packet:

int32_t yang_encode_h264_fua2(YangBuffer* buf, YangFua2H264Data* pkt){
	if (!yang_buffer_require(buf, 2 + pkt->nb)) {
		return yang_error_wrap(ERROR_RTC_RTP_MUXER, "requires %d bytes", 2 + pkt->nb);
	}
	char *p = buf->head;   // write position, right after the RTP header

	// FU indicator: keep F and NRI from the original NALU header, Type = FU-A
	uint8_t fu_indicate = kFuA;
	fu_indicate |= (pkt->nri & (~kNalTypeMask));
	*p++ = fu_indicate;    // one byte: FU indicator

	// FU header, @see https://tools.ietf.org/html/rfc6184#section-5.8
	uint8_t fu_header = pkt->nalu_type;
	if (pkt->start) {
		fu_header |= kStart;
	}
	if (pkt->end) {
		fu_header |= kEnd;
	}
	*p++ = fu_header;      // one byte: FU header, marking start -> middle -> end fragments

	// FU payload, @see https://tools.ietf.org/html/rfc6184#section-5.8
	memcpy(p, pkt->payload, pkt->nb);    // fragment data

	// Consume bytes.
	yang_buffer_skip(buf, 2 + pkt->nb);

	return Yang_Ok;
}

  

Origin blog.csdn.net/u012794472/article/details/126830332