WebRTC data security

When A establishes a connection with B, A presents its username and password. B can then judge whether A is a legitimate user by checking that the username and password A presents match the username and password in the SDP.

The fingerprint is also a key step in verifying legitimacy. The SDP stores the fingerprint (message digest) of the public key certificate. Besides verifying the user through ice-ufrag and ice-pwd, the receiver also checks the certificate the peer sends against this fingerprint, to see whether the certificate has been tampered with during transmission.
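The fingerprint check can be sketched in a few lines. This is a minimal illustration, assuming a SHA-256 fingerprint written as colon-separated uppercase hex (the usual form of the a=fingerprint value in SDP) and a certificate already available as DER bytes; the function names are hypothetical.

```python
import hashlib


def sdp_fingerprint(cert_der: bytes) -> str:
    """Return the SHA-256 digest of a DER-encoded certificate in SDP style."""
    digest = hashlib.sha256(cert_der).hexdigest().upper()
    # Split the hex string into byte pairs and join them with colons.
    return ":".join(digest[i:i + 2] for i in range(0, len(digest), 2))


def fingerprint_matches(cert_der: bytes, sdp_value: str) -> bool:
    """Compare the certificate received over DTLS with the value from the SDP."""
    return sdp_fingerprint(cert_der) == sdp_value.strip().upper()
```

If the certificate is altered in transit, its digest no longer matches the value exchanged through signaling, and the check fails.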

As you can see from this picture, A and B need to go through the following steps before transferring data.

First, SDP information is exchanged through the signaling server; that is, media negotiation is performed.

The SDP records the user's username, password, and fingerprint, and this information can be used to confirm the user's identity.

Next, A authenticates itself through the STUN protocol (which runs over UDP).

If the username and password in the STUN message match the username and password in the exchanged SDP, then A is a legitimate user.

After A is confirmed as a legitimate user, a DTLS negotiation is required to exchange public key certificates and negotiate cipher-related information.

At the same time, the certificate must be verified against the fingerprint to confirm that it has not been tampered with during transmission.

Finally, the negotiated cipher information and keys are used to encrypt the data, and the transmission of audio and video data begins.
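The first steps can be made concrete with a short sketch. It pulls the ice-ufrag, ice-pwd and fingerprint attributes out of an SDP, then checks a STUN username and integrity tag against them. In ICE, the STUN USERNAME is the receiver's ufrag, a colon, and the sender's ufrag, and the password is not sent in the clear; it is used as the key of an HMAC-SHA1 over the message. The function names are hypothetical, and the integrity check is simplified: real STUN computes MESSAGE-INTEGRITY over a specific portion of the message with an adjusted length field.

```python
import hashlib
import hmac


def parse_ice_credentials(sdp: str) -> dict:
    """Collect the a=ice-ufrag, a=ice-pwd and a=fingerprint attributes."""
    wanted = ("ice-ufrag", "ice-pwd", "fingerprint")
    found = {}
    for line in sdp.splitlines():
        line = line.strip()
        if not line.startswith("a="):
            continue
        # "a=name:value" -> name, value (only the first colon separates them)
        name, _, value = line[2:].partition(":")
        if name in wanted:
            found[name] = value
    return found


def username_matches(stun_username: str, receiver_ufrag: str, sender_ufrag: str) -> bool:
    """The STUN USERNAME is '<receiver ufrag>:<sender ufrag>'."""
    return stun_username == f"{receiver_ufrag}:{sender_ufrag}"


def integrity_ok(message: bytes, tag: bytes, ice_pwd: str) -> bool:
    """Simplified check: HMAC-SHA1 over the raw message, keyed with the
    ice-pwd that the receiver advertised in its SDP."""
    expected = hmac.new(ice_pwd.encode(), message, hashlib.sha1).digest()
    return hmac.compare_digest(expected, tag)
```

A message whose username or tag does not match the exchanged SDP is simply dropped, so a party that never took part in signaling cannot get past this step.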

We said earlier that WebRTC achieves data security by combining several protocols, such as DTLS and SRTP. Next, let's take a look at how these protocols work.

The TLS protocol consists of the TLS record protocol and the TLS handshake protocol:

The TLS record protocol is used for data encryption, data integrity checking, and so on.

The TLS handshake protocol is mainly used for key exchange and identity confirmation.

Since TLS runs on top of TCP, while WebRTC transmits audio and video data mainly over UDP, WebRTC cannot use TLS directly to protect its data. But TLS is very mature in terms of data security, so people wondered whether it could be ported to UDP. That is how DTLS came into being.

So you can think of DTLS as a version of TLS adapted to run on top of UDP; the security mechanism it uses is almost exactly the same as in TLS.
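One visible difference from TLS is the record header. UDP can drop or reorder datagrams, so each DTLS record carries an explicit epoch and sequence number. The sketch below parses the 13-byte DTLS 1.2 record header; the class and function names are hypothetical.

```python
import struct
from typing import NamedTuple, Tuple


class DtlsRecordHeader(NamedTuple):
    content_type: int          # e.g. 22 = handshake
    version: Tuple[int, int]   # (254, 253) for DTLS 1.2
    epoch: int                 # incremented on every cipher state change
    sequence_number: int       # 48-bit per-epoch record counter
    length: int                # length of the record body that follows


def parse_dtls_record_header(data: bytes) -> DtlsRecordHeader:
    """Parse the fixed 13-byte header at the start of a DTLS 1.2 record."""
    if len(data) < 13:
        raise ValueError("a DTLS record header is 13 bytes long")
    ctype, major, minor, epoch, seq, length = struct.unpack("!BBBH6sH", data[:13])
    return DtlsRecordHeader(ctype, (major, minor), epoch,
                            int.from_bytes(seq, "big"), length)
```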

In order to protect audio and video data more effectively, WebRTC uses the DTLS protocol to exchange public key certificates and agree on the cryptographic algorithm to use. In DTLS, this process is called the handshake protocol.

The handshake process of DTLS is as follows:

First, DTLS communicates in client/server (C/S) mode: the end that initiates the request is the client, and the end that receives the request is the server.

The client sends a ClientHello message to the server. After receiving the request, the server returns a ServerHello message, sends its own certificate to the client, and requests the client certificate at the same time.

After the client receives the certificate, it sends its own certificate to the server and asks the server to confirm the encryption algorithm.

After the server confirms the encryption algorithm, it sends a Finished message, and the handshake ends.
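The exchange described above can be modeled as an ordered list of messages. The sketch below follows the simplified description in this article rather than the full DTLS specification, which has additional messages (for example key exchange messages, and a Finished message from each side). Mapping "asks the server to confirm the encryption algorithm" to ChangeCipherSpec is an assumption, and HandshakeTracker is a hypothetical name.

```python
# Simplified flights, in the order described in the text.
FLIGHTS = [
    ("client", ["ClientHello"]),
    ("server", ["ServerHello", "Certificate", "CertificateRequest"]),
    ("client", ["Certificate", "ChangeCipherSpec"]),  # assumption, see lead-in
    ("server", ["Finished"]),
]


class HandshakeTracker:
    """Accepts handshake messages only in the expected order."""

    def __init__(self):
        self._expected = [(sender, msg) for sender, msgs in FLIGHTS for msg in msgs]
        self._position = 0

    @property
    def done(self) -> bool:
        return self._position == len(self._expected)

    def receive(self, sender: str, message: str) -> None:
        if self.done:
            raise ValueError("handshake already finished")
        if (sender, message) != self._expected[self._position]:
            raise ValueError(f"unexpected {message} from {sender}")
        self._position += 1
```

Feeding the tracker the messages in order ends with done set to True; any message that arrives out of order raises ValueError.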

After the DTLS handshake is completed, the communicating parties can begin to send audio and video data to each other.

RTP/RTCP does not encrypt the data it carries, so anyone who captures the packets can play back the audio and video. To prevent this, WebRTC does not use the RTP/RTCP protocol directly but uses SRTP/SRTCP, that is, the secure RTP/RTCP protocol.
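One detail worth making concrete is where the SRTP keys come from. In DTLS-SRTP (RFC 5764), both sides export a block of keying material from the finished DTLS handshake and cut it into a master key and a master salt for each direction. The sketch below only shows the cutting step, assuming the AES-128 counter mode profile with a 16-byte key and a 14-byte salt; the function name is hypothetical and the export itself is not shown.

```python
KEY_LEN = 16   # 128-bit SRTP master key
SALT_LEN = 14  # 112-bit SRTP master salt


def split_keying_material(material: bytes) -> dict:
    """Split exported keying material in the RFC 5764 order:
    client key, server key, client salt, server salt."""
    expected = 2 * (KEY_LEN + SALT_LEN)
    if len(material) != expected:
        raise ValueError(f"expected {expected} bytes, got {len(material)}")
    offset = 0
    parts = {}
    for name, size in (("client_key", KEY_LEN), ("server_key", KEY_LEN),
                       ("client_salt", SALT_LEN), ("server_salt", SALT_LEN)):
        parts[name] = material[offset:offset + size]
        offset += size
    return parts
```

The DTLS client encrypts its outgoing packets with the client key and salt, and the DTLS server uses the server key and salt, so each direction has its own keys.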

WebRTC uses the well-known libsrtp library to convert RTP/RTCP data into SRTP/SRTCP data. Using libsrtp is straightforward, and the steps can be summarized as follows.

The first step is to initialize libsrtp.

The second step is to create a session. This is a little more complicated: you need to specify the policy, such as which algorithm to use for content integrity checking, which key to use for encryption and decryption, and so on.

The third step is to encrypt the RTP packet.

The fourth step is to decrypt the SRTP packet.

The fifth and final step is to release resources.
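libsrtp is a C library, so the Python sketch below does not call it. It only mirrors the create, protect, unprotect flow with a toy session, to show what happens to a packet: the 12-byte RTP header stays in the clear, the payload is encrypted, and an 80-bit authentication tag is appended. Everything here is an assumption made for illustration: ToySrtpSession is a hypothetical name, the keystream is a SHA-256 construction rather than the AES counter mode SRTP uses, and session key derivation and the rollover counter are left out. It must not be used to protect real traffic.

```python
import hashlib
import hmac

RTP_HEADER_LEN = 12  # fixed RTP header without CSRCs or extensions
TAG_LEN = 10         # 80-bit authentication tag


class ToySrtpSession:
    """Step 2: a 'session' holds the policy, here just two keys."""

    def __init__(self, enc_key: bytes, auth_key: bytes):
        self._enc_key = enc_key
        self._auth_key = auth_key

    def _keystream(self, header: bytes, length: int) -> bytes:
        # Toy keystream seeded with the key and the header (which holds the
        # sequence number and SSRC), so each packet gets a different stream.
        out = b""
        counter = 0
        while len(out) < length:
            block = self._enc_key + header + counter.to_bytes(4, "big")
            out += hashlib.sha256(block).digest()
            counter += 1
        return out[:length]

    def _tag(self, data: bytes) -> bytes:
        return hmac.new(self._auth_key, data, hashlib.sha1).digest()[:TAG_LEN]

    def _xor_payload(self, header: bytes, payload: bytes) -> bytes:
        stream = self._keystream(header, len(payload))
        return bytes(p ^ k for p, k in zip(payload, stream))

    def protect(self, rtp: bytes) -> bytes:
        """Step 3: encrypt the payload and append the tag."""
        header, payload = rtp[:RTP_HEADER_LEN], rtp[RTP_HEADER_LEN:]
        body = header + self._xor_payload(header, payload)
        return body + self._tag(body)

    def unprotect(self, srtp: bytes) -> bytes:
        """Step 4: verify the tag, then decrypt the payload."""
        if len(srtp) < RTP_HEADER_LEN + TAG_LEN:
            raise ValueError("packet too short")
        body, tag = srtp[:-TAG_LEN], srtp[-TAG_LEN:]
        if not hmac.compare_digest(tag, self._tag(body)):
            raise ValueError("authentication failed")
        header, encrypted = body[:RTP_HEADER_LEN], body[RTP_HEADER_LEN:]
        return header + self._xor_payload(header, encrypted)
```

In libsrtp 2.x the five steps correspond roughly to srtp_init, srtp_create, srtp_protect, srtp_unprotect, and srtp_dealloc followed by srtp_shutdown. Python needs no explicit init or release step, which is why the sketch only covers steps two to four.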

Origin blog.csdn.net/Doubao93/article/details/123444530