RTMP handshake protocol and lal RTMP handshake implementation analysis


1. RTMP handshake analysis

  1. To keep the number of round trips low while still letting each side verify the other, the RTMP handshake is generally sent in this order:
| Client |             | Server   |
| C0+C1  | ----------> |          |
|        | <---------- | S0+S1+S2 |
| C2     | ----------> |          |
  2. RTMP handshakes come in two forms: the simple handshake and the complex handshake.

1. Simple handshake

1. c0 and s0

  1. version: 1 byte, the version number, fixed to 0x03.
  2. The structure is as follows:

| version (1 byte) |

2. c1 and s1

  1. time: 4 bytes, a timestamp that serves as the epoch for all subsequent chunks sent by this endpoint.
    1. The timestamp is generally in milliseconds; it may also be 0 or any other value.
  2. zero: 4 bytes, all 0.
    1. An all-zero value in these 4 bytes tells the server that the client is using simple mode.
  3. random-bytes: 1528 bytes of arbitrary content.
    1. The data lets each endpoint distinguish the response to its own handshake from a handshake initiated by the peer.
    2. The content is random and does not need to be encrypted.
  4. When nginx-rtmp-module acts as a client, the time field in c1 is set to the millisecond part of the current Unix timestamp. When it acts as a server and judges the client to be in simple mode, it parses the timestamp in c1 but does not use it, and it returns the 1536 bytes of c1 unchanged as s1.
  5. The structure is as follows:

| time (4 bytes) | zero (4 bytes) | random-bytes (1528 bytes) |

3. c2 and s2

  1. time: 4 bytes, the timestamp sent by the peer.
    1. The time of c2 should be set to the time field of s1.
    2. The time of s2 should be set to the time field of c1.
  2. time2: 4 bytes, the time at which the peer's packet was received.
    1. The time2 of c2 should be set to the time when s1 was received.
    2. The time2 of s2 should be set to the time when c1 was received.
  3. random-bytes: 1528 bytes, the random data sent by the peer.
    1. The random-bytes of c2 should be set to the random-bytes received in s1.
    2. The random-bytes of s2 should be set to the random-bytes received in c1.
  4. In a handshake between the OBS client (which uses simple mode) and an nginx-rtmp-module server, the entire 1536 bytes of c1, c2, s1, and s2 are identical.
    1. This shows that nginx-rtmp-module does not fill in time and time2 exactly as the specification describes.
  5. The structure is as follows:

| time (4 bytes) | time2 (4 bytes) | random-bytes (1528 bytes) |
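The simple-mode layouts above can be sketched from the client side as follows. This is only an illustration under the field layout described in this section; the function names buildC0C1 and buildC2 are hypothetical and do not come from lal or nginx-rtmp-module.

```go
package main

import (
	"crypto/rand"
	"encoding/binary"
	"fmt"
)

const packetLen = 1536 // size of c1, s1, c2 and s2

// buildC0C1 returns c0 (1 byte) followed by a simple-mode c1 (1536 bytes).
// Hypothetical helper for illustration.
func buildC0C1(timestampMs uint32) ([]byte, error) {
	buf := make([]byte, 1+packetLen)
	buf[0] = 0x03                                     // c0: version
	binary.BigEndian.PutUint32(buf[1:5], timestampMs) // c1: time
	// buf[5:9] stays zero: the "zero" field that marks simple mode.
	if _, err := rand.Read(buf[9:]); err != nil { // c1: random-bytes
		return nil, err
	}
	return buf, nil
}

// buildC2 echoes s1 and stores the time at which s1 was read in time2.
// Hypothetical helper for illustration.
func buildC2(s1 []byte, readAtMs uint32) []byte {
	c2 := make([]byte, packetLen)
	copy(c2, s1)                                  // time and random-bytes come from s1
	binary.BigEndian.PutUint32(c2[4:8], readAtMs) // time2: when s1 was read
	return c2
}

func main() {
	c0c1, err := buildC0C1(1000)
	if err != nil {
		panic(err)
	}
	fmt.Println(len(c0c1), c0c1[0]) // 1537 3

	s1 := make([]byte, packetLen) // stand-in for an s1 received from a server
	c2 := buildC2(s1, 1005)
	fmt.Println(len(c2)) // 1536
}
```

As noted above, real implementations such as nginx-rtmp-module may simply echo the peer's packet without touching time2.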

2. Complex handshake

1. hmac-sha256

  1. Complex mode uses the hmac-sha256 algorithm for signature calculation and verification.
  2. The algorithm takes a key and an input string, both of arbitrary length, and produces a 32-byte signature.
  3. For a fixed key and input, the hmac-sha256 result is always the same.
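A minimal sketch of these three properties using Go's standard crypto/hmac and crypto/sha256 packages. The helper name sign and the key and input values are made up for the example.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"fmt"
)

// sign returns the HMAC-SHA256 signature of input under key.
// Hypothetical helper for illustration.
func sign(key, input []byte) []byte {
	mac := hmac.New(sha256.New, key)
	mac.Write(input)
	return mac.Sum(nil)
}

func main() {
	key := []byte("example key") // arbitrary length
	a := sign(key, []byte("hello"))
	b := sign(key, []byte("hello"))
	fmt.Println(len(a))           // 32
	fmt.Println(hmac.Equal(a, b)) // true
}
```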

2. c0 and s0

  1. version: version number, fixed to 0x03

3. c1 and s1

  1. The complex handshake divides c1 and s1 into four parts. There are two layouts, which differ in the order of the key and digest blocks.
  2. The first layout:
#schema0
time: 4bytes
version: 4bytes
key: 764bytes
digest: 764bytes
  3. The second layout:
#schema1
time: 4bytes
version: 4bytes
digest: 764bytes
key: 764bytes
  4. The client decides which schema to use. The server first tries to parse according to schema0 and falls back to schema1 if that fails.
  5. The fields are as follows:
  6. time: 4 bytes, same as simple mode. ffmpeg uses [0, 0, 0, 0].
    1. The timestamp is generally in milliseconds; it may also be 0 or any other value.
  7. version: 4 bytes, the version number. nginx-rtmp-module uses [0x0C, 0x00, 0x0D, 0x0E]; ffmpeg uses [9, 0, 124, 2].
  8. key: 764 bytes, structured as follows:
random-data: (offset) bytes
key-data: 128 bytes
random-data: (764 - offset - 128 - 4) bytes
offset: 4 bytes
  9. digest: 764 bytes, structured as follows:
offset: 4 bytes
random-data: (offset) bytes
digest-data: 32 bytes
random-data: (764 - 4 - offset - 32) bytes

4. c2 and s2

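  1. c2 and s2 are mainly used to verify s1 and c1 respectively; each is also 1536 bytes long.
  2. The structure is as follows:
random-data: 1504bytes
digest-data: 32bytes
  3. random-data is filled with random bytes, and digest-data is a signature over that random-data, computed with a key derived from the digest-data of the packet being acknowledged (c1 for s2, s1 for c2).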
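One way to read the s2 layout is the two-step signing that the lal code analyzed later in this article appears to follow: first derive a temporary key from c1's digest-data, then sign s2's random-data with it. The sketch below assumes that reading; buildS2 and verifyS2 are hypothetical names, and the key is a placeholder, not the real server key.

```go
package main

import (
	"crypto/hmac"
	"crypto/rand"
	"crypto/sha256"
	"fmt"
)

const (
	packetLen = 1536
	digestLen = 32
)

func hmacSHA256(key, input []byte) []byte {
	mac := hmac.New(sha256.New, key)
	mac.Write(input)
	return mac.Sum(nil)
}

// buildS2 fills s2 with random data and signs it with a key derived from
// c1's digest-data. Hypothetical helper for illustration.
func buildS2(c1Digest, serverKey []byte) ([]byte, error) {
	s2 := make([]byte, packetLen)
	randomPart := s2[:packetLen-digestLen] // 1504 bytes
	if _, err := rand.Read(randomPart); err != nil {
		return nil, err
	}
	tmpKey := hmacSHA256(serverKey, c1Digest) // step 1: derive a key from c1's digest
	copy(s2[packetLen-digestLen:], hmacSHA256(tmpKey, randomPart)) // step 2: sign random-data
	return s2, nil
}

// verifyS2 recomputes the digest the same way and compares it.
func verifyS2(s2, c1Digest, serverKey []byte) bool {
	tmpKey := hmacSHA256(serverKey, c1Digest)
	want := hmacSHA256(tmpKey, s2[:packetLen-digestLen])
	return hmac.Equal(want, s2[packetLen-digestLen:])
}

func main() {
	serverKey := []byte("placeholder server key") // not the real key
	c1Digest := make([]byte, digestLen)           // stand-in for c1's digest-data
	s2, err := buildS2(c1Digest, serverKey)
	if err != nil {
		panic(err)
	}
	fmt.Println(len(s2), verifyS2(s2, c1Digest, serverKey)) // 1536 true
}
```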

5. Digest related

1. digest location
  1. The c1 and s1 structures come in two formats.
  2. The first layout:
#schema0
time: 4bytes
version: 4bytes
key: 764bytes
digest: 764bytes
  3. The second layout:
#schema1
time: 4bytes
version: 4bytes
digest: 764bytes
key: 764bytes
  4. The digest can be in the first half or in the second half.
  5. When the digest is in the first half, its position information (offset) is stored at the start of the first half.
  6. The c1 format expands as follows:
| 4-byte time | 4-byte version | 4-byte offset | left[...] | 32-byte digest | right[...] | second half, 764 bytes |
// Modulo 728: the 764 bytes of the first half minus the 4-byte offset field and the 32-byte digest.
// 12: skip the 4-byte time + 4-byte version + 4-byte offset.
offset = (c1[8] + c1[9] + c1[10] + c1[11]) % 728 + 12
  7. The value range of offset is [12,740).
  8. When the digest is in the second half, the offset is stored at the start of the second half.
  9. The c1 format expands as follows:
| 4-byte time | 4-byte version | first half, 764 bytes | 4-byte offset | left[...] | 32-byte digest | right[...] |
// Modulo 728: the 764 bytes of the second half minus the 4-byte offset field and the 32-byte digest.
// +8+764+4: skip the 4-byte time + 4-byte version + 764-byte first half + 4-byte offset.
offset = (c1[8+764] + c1[8+764+1] + c1[8+764+2] + c1[8+764+3]) % 728 + 8 + 764 + 4
  10. The value range of offset is [776,1504).
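The two offset formulas differ only in where the 4-byte offset field sits, so they can be sketched as one function. The name digestOffset and the sample byte values are made up for the example.

```go
package main

import "fmt"

// digestOffset returns the index of digest-data inside c1 or s1.
// base is the index of the 4-byte offset field: 8 when the digest block
// is in the first half, 8+764 when it is in the second half.
// Hypothetical helper for illustration.
func digestOffset(c1 []byte, base int) int {
	sum := int(c1[base]) + int(c1[base+1]) + int(c1[base+2]) + int(c1[base+3])
	return sum%728 + base + 4
}

func main() {
	c1 := make([]byte, 1536)
	copy(c1[8:12], []byte{255, 255, 255, 255}) // sum = 1020, 1020 % 728 = 292
	copy(c1[772:776], []byte{1, 2, 3, 4})      // sum = 10

	fmt.Println(digestOffset(c1, 8))     // 304
	fmt.Println(digestOffset(c1, 8+764)) // 786
}
```

Because the four bytes are summed rather than read as one integer, the sum is at most 1020, and the modulo keeps the 32-byte digest inside its 764-byte block.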
2. Digest generation
  1. The 1528-byte payload that follows time and version in c1 and s1 is generated as follows:
    1. Fill the 1528 bytes with random data.
    2. Write the 32-byte digest signature into the random data at the digest offset.
  2. See the code analysis below for details.
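The two generation steps, together with the matching verification, can be sketched as below. This assumes the digest block comes first (offset field at index 8) and that the signature covers the whole packet except the 32 digest bytes, as in the lal code analyzed later. The names makeC1 and verifyC1 are hypothetical, and the key is a placeholder rather than the real client key.

```go
package main

import (
	"crypto/hmac"
	"crypto/rand"
	"crypto/sha256"
	"fmt"
)

const digestLen = 32

// digestWithoutCenter signs b with the 32 bytes at offs left out.
func digestWithoutCenter(b []byte, offs int, key []byte) []byte {
	mac := hmac.New(sha256.New, key)
	mac.Write(b[:offs])
	mac.Write(b[offs+digestLen:])
	return mac.Sum(nil)
}

// offsetAt computes the digest position from the 4-byte offset field at base.
func offsetAt(b []byte, base int) int {
	sum := int(b[base]) + int(b[base+1]) + int(b[base+2]) + int(b[base+3])
	return sum%728 + base + 4
}

// makeC1 builds a complex-mode c1 with the digest block in the first half.
// Hypothetical helper for illustration; the time field is left at 0.
func makeC1(key []byte) ([]byte, error) {
	c1 := make([]byte, 1536)
	copy(c1[4:8], []byte{9, 0, 124, 2}) // non-zero version (the value the article lists for ffmpeg)
	if _, err := rand.Read(c1[8:]); err != nil { // step 1: randomize 1528 bytes
		return nil, err
	}
	offs := offsetAt(c1, 8)
	copy(c1[offs:], digestWithoutCenter(c1, offs, key)) // step 2: write the digest
	return c1, nil
}

// verifyC1 recomputes the digest and compares it with the stored one.
func verifyC1(c1 []byte, base int, key []byte) bool {
	offs := offsetAt(c1, base)
	return hmac.Equal(digestWithoutCenter(c1, offs, key), c1[offs:offs+digestLen])
}

func main() {
	key := []byte("placeholder client key") // not the real key
	c1, err := makeC1(key)
	if err != nil {
		panic(err)
	}
	fmt.Println(verifyC1(c1, 8, key)) // true
}
```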

2. Implementation of RTMP handshake in lal

  1. lalserver is a streaming media (live audio and video network transmission) server developed purely in Golang. Currently, RTMP, RTSP(RTP/RTCP), HLS, HTTP[S]/WebSocket[S]-FLV/TS, GB28181 protocols are supported. It also supports secondary development and expansion in the form of plug-ins.

  2. lal github address: https://github.com/q191201771/lal

  3. Lal official document address: lal official document

  4. Each time an RTMP connection arrives, lal starts a goroutine to handle it, and the (s *ServerSession) handshake() function performs the RTMP handshake.

  5. For related principles, refer to the above introduction to RTMP handshake analysis.

func (s *ServerSession) handshake() error {
	if err := s.hs.ReadC0C1(s.conn); err != nil {
		return err
	}
	Log.Infof("[%s] < R Handshake C0+C1.", s.UniqueKey())

	Log.Infof("[%s] > W Handshake S0+S1+S2.", s.UniqueKey())
	if err := s.hs.WriteS0S1S2(s.conn); err != nil {
		return err
	}

	if err := s.hs.ReadC2(s.conn); err != nil {
		return err
	}
	Log.Infof("[%s] < R Handshake C2.", s.UniqueKey())
	return nil
}
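The same read, write, read sequence can be tried without a network connection. The sketch below is a simplified simple-mode stand-in, not lal's implementation: simpleServerHandshake and replyTo are hypothetical names, s1's time field is left at 0, and the client bytes come from an in-memory reader.

```go
package main

import (
	"bytes"
	"crypto/rand"
	"fmt"
	"io"
)

const packetLen = 1536

// simpleServerHandshake reads C0+C1, replies with S0+S1+S2, then reads C2.
// Hypothetical simple-mode sketch for illustration.
func simpleServerHandshake(r io.Reader, w io.Writer) error {
	c0c1 := make([]byte, 1+packetLen)
	if _, err := io.ReadFull(r, c0c1); err != nil {
		return err
	}
	s0s1s2 := make([]byte, 1+2*packetLen)
	s0s1s2[0] = 0x03 // s0: version
	// s1: time and zero stay 0 in this sketch; only random-bytes are filled.
	if _, err := rand.Read(s0s1s2[9 : 1+packetLen]); err != nil {
		return err
	}
	copy(s0s1s2[1+packetLen:], c0c1[1:]) // s2 echoes c1
	if _, err := w.Write(s0s1s2); err != nil {
		return err
	}
	c2 := make([]byte, packetLen)
	_, err := io.ReadFull(r, c2) // c2 is read but not validated
	return err
}

// replyTo feeds clientBytes (C0+C1+C2) to the server and returns what it wrote.
func replyTo(clientBytes []byte) ([]byte, error) {
	var out bytes.Buffer
	err := simpleServerHandshake(bytes.NewReader(clientBytes), &out)
	return out.Bytes(), err
}

func main() {
	client := make([]byte, 1+2*packetLen) // C0 + C1 + C2, zero except the version
	client[0] = 0x03
	reply, err := replyTo(client)
	fmt.Println(len(reply), err) // 3073 <nil>
}
```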

1. The server receives the c0 and c1 chunks sent by the client

1. ReadC0C1(reader io.Reader) function analysis

  1. ReadC0C1(reader io.Reader) parses the c0c1 chunk and, via parseChallenge(), derives a new digest-data from it. If the length of that digest-data is not 0, the handshake is complex; otherwise it is simple. The function also constructs the s0, s1, and s2 chunks to be sent later.
  2. Once the handshake type is known, it constructs s0, s1, and s2.
  3. s0 occupies 1 byte, the version number, with value 3.
  4. s1 occupies 1536 bytes. The first 4 bytes are time; the rest depends on the mode:
    1. For a simple handshake:
      1. The next 4 bytes are zero, all 0.
      2. The last 1528 bytes are random-bytes, filled with random data.
    2. For a complex handshake:
      1. The next 4 bytes are version, with content 0x0D, 0x0E, 0x0A, 0x0D.
      2. The last 1528 bytes are first filled with random data. The digest offset is then computed according to the digest key structure, and a new digest-data, generated with the 36-byte fixed key, replaces the 32 bytes at that offset.
  5. s2 occupies 1536 bytes and differs between simple and complex mode:
    1. For simple mode:
      1. s2 copies c1 from the c0c1 chunk.
    2. For complex mode:
      1. s2 is first filled with 1528 random bytes. The digest-data returned by parseChallenge() is then used as the key to generate a new digest-data, which is written to the last 32 bytes of s2.
  6. The code is as follows:
func (s *HandshakeServer) ReadC0C1(reader io.Reader) (err error) {
	c0c1 := make([]byte, c0c1Len)
	// Read c0c1Len (1+1536) bytes.
	if _, err = io.ReadAtLeast(reader, c0c1, c0c1Len); err != nil {
		return err
	}

	s.s0s1s2 = make([]byte, s0s1s2Len) // buffer for s0, s1 and s2

	// Parse c0c1 (the same function can also parse s0s1).
	s2key := parseChallenge(c0c1, clientKey[:clientPartKeyLen], serverKey[:serverFullKeyLen])
	s.isSampleMode = len(s2key) == 0 // a non-empty s2key means complex handshake

	s.s0s1s2[0] = version

	s1 := s.s0s1s2[1:]
	s2 := s.s0s1s2[s0s1Len:]

	bele.BePutUint32(s1, uint32(time.Now().UnixNano())) // write the timestamp into s1
	random1528(s1[8:])                                  // fill 1528 bytes with random data

	if s.isSampleMode {
		// s1
		bele.BePutUint32(s1[4:], 0) // simple handshake: the 4-byte zero field is all 0

		copy(s2, c0c1[1:]) // s2 copies c1
	} else {
		// s1
		copy(s1[4:], serverVersion) // server version: 0x0D, 0x0E, 0x0A, 0x0D

		offs := int(s1[8]) + int(s1[9]) + int(s1[10]) + int(s1[11]) // s1 uses the digest key structure; read the offset
		offs = (offs % 728) + 12                                    // 12 = 4-byte time + 4-byte version + 4-byte offset
		// Sign s1 with the 36-byte fixed key and write the new 32-byte digest-data into s1.
		makeDigestWithoutCenterPart(s.s0s1s2[1:s0s1Len], offs, serverKey[:serverPartKeyLen], s.s0s1s2[1+offs:])

		// s2
		// make digest to s2 suffix position
		random1528(s2)

		replyOffs := s2Len - keyLen
		makeDigestWithoutCenterPart(s2, replyOffs, s2key, s2[replyOffs:]) // write digest-data into the last 32 bytes of s2
	}
	return nil
}

2. parseChallenge(b []byte, peerKey []byte, key []byte) function analysis

  1. The parseChallenge() function parses c0c1 (it can also parse s0s1).
  2. It first reads the 4 bytes at indexes 5 to 8 of the c0c1 chunk (right after c0 and the c1 time field) as a big-endian version. If the version is 0, the handshake is in simple mode; otherwise it is in complex mode.
  3. For complex mode, findDigest() first searches for the start index offs of digest-data according to the schema0 format (the key digest structure). If that fails, it searches again according to the schema1 format (the digest key structure).
  4. After finding the start index of c1's digest-data, it uses the full 68-byte serverKey (36+32 bytes, as passed in by ReadC0C1) as the key and that digest-data as the input to generate a new digest-data, which it returns.
  5. The content of serverKey is:
// 36+32
var serverKey = []byte{
	'G', 'e', 'n', 'u', 'i', 'n', 'e', ' ', 'A', 'd', 'o', 'b', 'e', ' ',
	'F', 'l', 'a', 's', 'h', ' ', 'M', 'e', 'd', 'i', 'a', ' ',
	'S', 'e', 'r', 'v', 'e', 'r', ' ',
	'0', '0', '1',

	0xF0, 0xEE, 0xC2, 0x4A, 0x80, 0x68, 0xBE, 0xE8, 0x2E, 0x00, 0xD0, 0xD1,
	0x02, 0x9E, 0x7E, 0x57, 0x6E, 0xEC, 0x5D, 0x2D, 0x29, 0x80, 0x6F, 0xAB,
	0x93, 0xB8, 0xE6, 0x36, 0xCF, 0xEB, 0x31, 0xAE,
}
  6. The code is as follows:
// c0c1 clientPartKey serverFullKey
// s0s1 serverPartKey clientFullKey
func parseChallenge(b []byte, peerKey []byte, key []byte) []byte {
	//if b[0] != version {
	//	return nil, ErrRtmp
	//}

	// Big-endian: read 4 bytes starting at index 5 (1-byte c0 + 4-byte c1 time) as ver.
	ver := bele.BeUint32(b[5:])
	if ver == 0 {
		// ver == 0 means simple mode; in complex mode ver is non-zero.
		Log.Debug("handshake simple mode.")
		return nil
	}

	// Look for digest-data assuming the key digest structure: time + version + key = 4 + 4 + 764.
	offs := findDigest(b[1:], 764+8, peerKey)
	if offs == -1 {
		offs = findDigest(b[1:], 8, peerKey) // retry assuming the digest key structure
	}
	if offs == -1 {
		Log.Warn("get digest offs failed. roll back to try simple handshake.")
		return nil
	}
	Log.Debug("handshake complex mode.")

	// use c0c1 digest to make a new digest
	digest := makeDigest(b[1+offs:1+offs+keyLen], key) // sign the peer's digest-data with the local key

	return digest
}
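The version check at the top of parseChallenge() can be isolated into a few lines. This sketch uses the standard encoding/binary package in place of lal's bele helper; isSimpleMode is a hypothetical name. Note that parseChallenge() also falls back to simple mode when no valid digest is found, which this check alone does not cover.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// isSimpleMode reports whether the 4 bytes after c1's time field are all zero.
// c0c1 is c0 (1 byte) followed by c1 (1536 bytes).
// Hypothetical helper for illustration.
func isSimpleMode(c0c1 []byte) bool {
	return binary.BigEndian.Uint32(c0c1[5:9]) == 0
}

func main() {
	c0c1 := make([]byte, 1+1536)
	c0c1[0] = 0x03
	fmt.Println(isSimpleMode(c0c1)) // true

	copy(c0c1[5:9], []byte{9, 0, 124, 2}) // the version the article lists for ffmpeg
	fmt.Println(isSimpleMode(c0c1))       // false
}
```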

3. findDigest(b []byte, base int, key []byte) function analysis

  1. The findDigest() function finds the start position offs of the digest-data in c1 or s1.
  2. The parameter base is the index of the 4-byte offset field, which depends on whether the key digest or the digest key structure is assumed; see the schema0 and schema1 structures.
    1. In the key digest structure, base is 764+8 (4 bytes time + 4 bytes version + 764 bytes key).
      1. digest offset = (c1[8+764] + c1[8+764+1] + c1[8+764+2] + c1[8+764+3]) % 728 + base + 4
    2. In the digest key structure, base is 8 (4 bytes time + 4 bytes version).
      1. digest offset = (c1[8] + c1[9] + c1[10] + c1[11]) % 728 + base + 4
  3. With offs known, the first 30 bytes of clientKey are used as the key. The bytes to the left of offs, concatenated with the bytes to the right of offs+32 (that is, everything except the 32-byte digest), are used as the hmac-sha256 input to generate a 32-byte digest-data.
  4. The content of clientKey is:
// 30+32
var clientKey = []byte{
	'G', 'e', 'n', 'u', 'i', 'n', 'e', ' ', 'A', 'd', 'o', 'b', 'e', ' ',
	'F', 'l', 'a', 's', 'h', ' ', 'P', 'l', 'a', 'y', 'e', 'r', ' ',
	'0', '0', '1',

	0xF0, 0xEE, 0xC2, 0x4A, 0x80, 0x68, 0xBE, 0xE8, 0x2E, 0x00, 0xD0, 0xD1,
	0x02, 0x9E, 0x7E, 0x57, 0x6E, 0xEC, 0x5D, 0x2D, 0x29, 0x80, 0x6F, 0xAB,
	0x93, 0xB8, 0xE6, 0x36, 0xCF, 0xEB, 0x31, 0xAE,
}
  5. The generated digest-data is compared with the original digest-data; if they are the same, the handshake is in complex mode.
  6. The code is as follows:
func findDigest(b []byte, base int, key []byte) int {
	// calc offs
	offs := int(b[base]) + int(b[base+1]) + int(b[base+2]) + int(b[base+3]) // digest offset
	offs = (offs % 728) + base + 4                                          // offset + base + len(offset): move offs to digest-data
	// calc digest
	digest := make([]byte, keyLen)                    // digest-data is keyLen (32) bytes
	makeDigestWithoutCenterPart(b, offs, key, digest) // generate digest-data
	// compare origin digest in buffer with calced digest
	// If they match, the handshake is in complex mode.
	if bytes.Compare(digest, b[offs:offs+keyLen]) == 0 {
		return offs
	}
	return -1
}

4. makeDigestWithoutCenterPart(b []byte, offs int, key []byte, out []byte) function analysis

  1. The makeDigestWithoutCenterPart() function concatenates the bytes to the left and to the right of digest-data as the hmac-sha256 input and generates a new digest with the given key.
func makeDigestWithoutCenterPart(b []byte, offs int, key []byte, out []byte) {
	mac := hmac.New(sha256.New, key) // the key passed in by the caller is the hmac-sha256 key
	// The part of b left of the digest, followed by the part right of it (if any),
	// is the hmac-sha256 input (1536-32 bytes in total).
	// left
	if offs != 0 {
		mac.Write(b[:offs])
	}
	// right
	// This is the random data after the digest.
	if len(b)-offs-keyLen > 0 {
		mac.Write(b[offs+keyLen:])
	}
	// calc
	copy(out, mac.Sum(nil)) // write the 32-byte hmac-sha256 result into out (the digest field)
}
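Writing the left and right parts to the HMAC in two calls gives the same result as signing their concatenation in one call, which is why the function does not need to build a temporary buffer. A small check of that property, with hypothetical helper names and a made-up key:

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"fmt"
)

const digestLen = 32

// digestSkipping signs b with the digestLen bytes starting at offs left out.
func digestSkipping(b []byte, offs int, key []byte) []byte {
	mac := hmac.New(sha256.New, key)
	mac.Write(b[:offs])           // left part
	mac.Write(b[offs+digestLen:]) // right part
	return mac.Sum(nil)
}

// digestOf signs data in a single Write.
func digestOf(data, key []byte) []byte {
	mac := hmac.New(sha256.New, key)
	mac.Write(data)
	return mac.Sum(nil)
}

// sameAsConcatenation checks that skipping the center equals signing left+right.
func sameAsConcatenation(b []byte, offs int, key []byte) bool {
	joined := append(append([]byte{}, b[:offs]...), b[offs+digestLen:]...)
	return hmac.Equal(digestSkipping(b, offs, key), digestOf(joined, key))
}

func main() {
	b := make([]byte, 1536)
	for i := range b {
		b[i] = byte(i)
	}
	fmt.Println(sameAsConcatenation(b, 100, []byte("example key"))) // true
}
```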

2. The server sends s0, s1, s2 chunks to the client

1. WriteS0S1S2(write io.Writer) function analysis

  1. The WriteS0S1S2() function sends the s0, s1, and s2 chunks to the client in a single write.
func (s *HandshakeServer) WriteS0S1S2(write io.Writer) error {
	_, err := write.Write(s.s0s1s2)
	return err
}

3. The server receives the c2 chunk sent by the client

1. ReadC2(conn io.Reader) function analysis

  1. The ReadC2() function reads c2Len bytes. It succeeds as soon as all bytes have been read without error; the content of c2 is not validated.
func (s *HandshakeServer) ReadC2(conn io.Reader) error {
	c2 := make([]byte, c2Len)
	if _, err := io.ReadAtLeast(conn, c2, c2Len); err != nil {
		return err
	}
	return nil
}
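Calling io.ReadAtLeast with a minimum equal to the buffer length behaves like io.ReadFull, so a short c2 surfaces as an error rather than a partial read. A small sketch of that behavior, assuming c2Len is 1536; readC2 and readC2FromBytes are hypothetical stand-ins, not lal functions.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
)

const c2Len = 1536 // assumed value, matching the 1536-byte c2 described above

// readC2 reads exactly c2Len bytes and discards them.
func readC2(r io.Reader) error {
	c2 := make([]byte, c2Len)
	_, err := io.ReadFull(r, c2)
	return err
}

// readC2FromBytes runs readC2 against an in-memory reader.
func readC2FromBytes(data []byte) error {
	return readC2(bytes.NewReader(data))
}

func main() {
	fmt.Println(readC2FromBytes(make([]byte, c2Len))) // <nil>
	fmt.Println(readC2FromBytes(make([]byte, 100)))   // unexpected EOF
}
```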

Reference:
lal official document - simple mode and complex mode of the rtmp handshake

Origin blog.csdn.net/weixin_41910694/article/details/127406703