Understanding WebRTC Connectivity

WebRTC connectivity is a rather complex domain, and Netscan tries to reduce the whole process to a few attributes. First, let’s have a quick look at how WebRTC connectivity is established.

Let’s break down the terminology real quick:

  • STUN Server: A STUN server performs a very simple job: it tells the client what its public IP address is, which reveals whether the client is behind a Network Address Translation (NAT) service. It’s the first building block in enabling a client to establish a WebRTC connection. Using a STUN server we can infer whether the client accepts inbound connections from other clients on the Internet.
  • TURN Server: A TURN server provides a fallback for clients that cannot establish a peer-to-peer connection by relaying media between the two peers. Naturally this consumes a lot of resources, so TURN servers are expensive to run. Using a TURN server we can infer whether the client can perform outbound connections to random ports. Netscan employs its own TURN server to perform WebRTC connectivity tests, but you can use your own for more accurate results.
  • ICE Candidate: ICE Candidates are produced by the client during the ICE negotiation phase. Their purpose is to indicate what type of connectivity has been discovered by running connectivity tests against the STUN and TURN servers.
  • Signal Channel: The Signal Channel carries connectivity information between the two peers. The phase of a WebRTC call handshake in which this happens is called signaling. The Signal Channel is a custom implementation that you have to build yourself, typically using WebSockets. All ICE Candidates are shared during the signaling phase, as well as an SDP offer which describes the client’s capabilities for encoding and decoding audio and video streams.

Figure 1. WebRTC connectivity, complete diagram. Image courtesy of Mozilla Developer Network.

WebRTC Connectivity Discovery

So what happened here? Let’s take it from the start. For two peers to connect, they need a server in the middle to act as the Signal Channel during the signaling phase. Here’s how a connectivity handshake goes from Peer A’s perspective:

  1. Starts ICE connectivity tests with the STUN server and figures out its IP and NAT status.
  2. Communicates audio and video codec capabilities.
  3. Receives audio and video codec capabilities from Peer B.
  4. Starts ICE connectivity tests with the TURN server.
  5. As ICE Candidates are produced from the connectivity tests, they are communicated to Peer B.
  6. Receives ICE Candidates produced by Peer B.
  7. Figures out the best solution to establish a connection with Peer B and attempts it.
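
The message flow in steps 2, 3, 5 and 6 can be sketched with an in-memory stand-in for the Signal Channel. This is only an illustration: the `SignalChannel` class, the message kinds, the dictionary used in place of a real SDP offer, and Peer B's candidate are all assumptions made for the example, and a real channel would run over WebSockets or a similar transport.

```python
from collections import deque


class SignalChannel:
    """Hypothetical in-memory signaling server: one inbox per peer."""

    def __init__(self):
        self.inboxes = {}

    def join(self, peer_id):
        self.inboxes[peer_id] = deque()

    def send(self, sender, recipient, kind, payload):
        self.inboxes[recipient].append(
            {"from": sender, "kind": kind, "payload": payload}
        )

    def receive_all(self, peer_id):
        """Return and clear every message waiting for peer_id."""
        inbox = self.inboxes[peer_id]
        messages = list(inbox)
        inbox.clear()
        return messages


channel = SignalChannel()
channel.join("A")
channel.join("B")

# Step 2: Peer A communicates its codec capabilities (stands in for the SDP offer).
channel.send("A", "B", "offer", {"audio": ["opus"], "video": ["VP8"]})

# Step 3: Peer B reads the offer and answers with its own capabilities.
for message in channel.receive_all("B"):
    if message["kind"] == "offer":
        channel.send("B", "A", "answer", {"audio": ["opus"], "video": ["VP8"]})

# Steps 5 and 6: each side relays candidates as soon as it produces them.
channel.send(
    "A", "B", "candidate",
    "candidate:2795255774 1 udp 2122260223 192.168.1.7 53196 typ host generation 0",
)
channel.send(
    "B", "A", "candidate",
    "candidate:1 1 udp 2122260223 192.168.1.8 40000 typ host generation 0",  # made up
)

kinds_seen_by_a = [m["kind"] for m in channel.receive_all("A")]
kinds_seen_by_b = [m["kind"] for m in channel.receive_all("B")]
print(kinds_seen_by_a, kinds_seen_by_b)
```

The channel only forwards messages; it never inspects the offer or the candidates, which is why any transport that can deliver messages between the two peers can play this role.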

ICE Candidates

Out of this handshake ICE Candidates are generated. These candidates contain critical connectivity information that we leverage to infer the client’s connectivity status. Candidates come as objects containing the medium concerned (audio or video) and the actual candidate content, like so:

"candidate": "candidate:2795255774 1 udp 2122260223 192.168.1.7 53196 typ host generation 0"

"medium": "video"

This is a single candidate; many are produced during the ICE negotiation, so expect 8 to 15. The important parts of the candidate are the transport (udp), the IP address (192.168.1.7) and the type (typ host). From these we can infer that this is a UDP candidate using the IP 192.168.1.7, which belongs to a private address range used only in local networks, so we can safely assume that the client is behind a NAT.
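
As a rough illustration of how these fields can be read out of the candidate string, here is a minimal Python sketch. The `IceCandidate` class and the helper names are made up for this example and are not part of Netscan; the parser only handles the leading fields shown in the candidate above and ignores trailing attributes such as `generation`.

```python
import ipaddress
from dataclasses import dataclass


@dataclass
class IceCandidate:
    foundation: str
    component: int
    transport: str  # "udp" or "tcp"
    priority: int
    ip: str
    port: int
    typ: str        # "host", "srflx" or "relay"


def parse_candidate(line: str) -> IceCandidate:
    """Split a 'candidate:...' string into its leading fields."""
    fields = line.split()
    if (
        len(fields) < 8
        or not fields[0].startswith("candidate:")
        or fields[6] != "typ"
    ):
        raise ValueError(f"not an ICE candidate line: {line!r}")
    return IceCandidate(
        foundation=fields[0].split(":", 1)[1],
        component=int(fields[1]),
        transport=fields[2].lower(),
        priority=int(fields[3]),
        ip=fields[4],
        port=int(fields[5]),
        typ=fields[7],
    )


def is_private_address(ip: str) -> bool:
    """True for addresses in private ranges; raises ValueError if ip is not a literal IP."""
    return ipaddress.ip_address(ip).is_private


cand = parse_candidate(
    "candidate:2795255774 1 udp 2122260223 192.168.1.7 53196 typ host generation 0"
)
print(cand.transport, cand.ip, cand.port, cand.typ, is_private_address(cand.ip))
```

A host candidate with a private address, as here, is the hint that the client sits behind a NAT.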

The typ host part indicates the candidate type. There are three types: typ host, which represents the network interfaces of the client; typ srflx, which is produced through the STUN server and represents discovered inbound connectivity; and typ relay, which is produced by the TURN server and represents outbound connectivity.

By studying the generated candidates and accounting for the ones that are missing we can reliably infer the inbound and outbound TCP and UDP connectivity of the client.
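
One way to account for missing candidates is to compare the transport and type combinations that were seen against the full set that could have appeared. The sketch below assumes every combination of udp/tcp with host/srflx/relay is worth checking, which is a simplification for illustration and not necessarily what Netscan does; the srflx candidate in the sample is made up and uses a documentation-only IP address.

```python
from itertools import product

# Every (transport, type) pair we would like to see, by assumption.
EXPECTED = set(product(("udp", "tcp"), ("host", "srflx", "relay")))


def seen_combinations(candidate_lines):
    """Collect the (transport, type) pair of each candidate string."""
    seen = set()
    for line in candidate_lines:
        fields = line.split()
        # fields[2] is the transport; fields[7] follows the literal "typ".
        seen.add((fields[2].lower(), fields[7]))
    return seen


def missing_combinations(candidate_lines):
    """Return the expected pairs that no candidate covered, sorted."""
    return sorted(EXPECTED - seen_combinations(candidate_lines))


sample = [
    "candidate:2795255774 1 udp 2122260223 192.168.1.7 53196 typ host generation 0",
    # Made-up server reflexive candidate for illustration only.
    "candidate:842163049 1 udp 1677729535 203.0.113.10 53196 typ srflx "
    "raddr 192.168.1.7 rport 53196 generation 0",
]
missing = missing_combinations(sample)
print(missing)
```

In this sample no relay candidate was produced, which by the reasoning above would point to a problem with outbound connectivity to the TURN server.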

What Ports does WebRTC Use?

Well, here’s the thing: WebRTC doesn’t use any particular ports. The ports are randomly selected by the peers during the ICE negotiation phase, so they can be anything from 1024 up to 65535.

So there really isn’t any port to target, and this is why clients behind restrictive firewalls have such a hard time establishing a WebRTC connection.

How Netscan Detects WebRTC Connectivity

By employing STUN and TURN servers we initiate a WebRTC connection on the client and collect all the generated Candidates. Each candidate is analyzed and turns on one of the following switches:

  • Inbound TCP: The client accepts inbound TCP connections. This can be inferred from a STUN candidate on the Internet IP of the client. For a client to accept incoming TCP connections it has to have an actual Internet address and not be behind a NAT router.
  • Inbound UDP: The client accepts inbound UDP connections. Again, this can be inferred from a STUN candidate. UDP inbound connectivity can happen even if the client is behind a NAT router, due to the connectionless nature of the UDP protocol. This is the single fact that enables NAT traversal and allows home-based clients to perform a WebRTC connection.
  • Outbound TCP: The client can perform outbound TCP connections on ports other than those of the web and secure web (80 and 443). This can be inferred from TURN candidates and by performing XHR connections on high ports.
  • Outbound UDP: The client can perform outbound UDP connections. We infer this from TURN candidates.
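
The mapping from candidates to switches can be sketched as below. This is a simplified assumption for illustration, not Netscan's actual implementation: it treats any srflx candidate as evidence of inbound connectivity and any relay candidate as evidence of outbound connectivity on the same transport, and it leaves out the XHR high-port check mentioned for outbound TCP. The function name and the dictionary keys are hypothetical.

```python
def switches_from_candidates(pairs):
    """Turn (transport, type) pairs into the four connectivity switches."""
    switches = {
        "inbound_tcp": False,
        "inbound_udp": False,
        "outbound_tcp": False,
        "outbound_udp": False,
    }
    for transport, typ in pairs:
        if transport not in ("tcp", "udp"):
            continue  # ignore transports this sketch does not know about
        if typ == "srflx":      # discovered through the STUN server
            switches[f"inbound_{transport}"] = True
        elif typ == "relay":    # allocated by the TURN server
            switches[f"outbound_{transport}"] = True
        # host candidates describe local interfaces and flip no switch
    return switches


result = switches_from_candidates(
    [("udp", "host"), ("udp", "srflx"), ("udp", "relay")]
)
print(result)
```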

With these four switches we can then proceed to infer the overall WebRTC connectivity of the client. If all switches are false then there can be no connection. If any of the outbound protocols is true then a WebRTC connection can be performed. And if any of the inbound protocols is true then a Peer to Peer WebRTC connection can be performed.
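
The rule in the paragraph above is small enough to write down directly. The `Switches` class and the three result labels are names chosen for this sketch, not Netscan identifiers.

```python
from dataclasses import dataclass


@dataclass
class Switches:
    inbound_tcp: bool = False
    inbound_udp: bool = False
    outbound_tcp: bool = False
    outbound_udp: bool = False


def overall_connectivity(s: Switches) -> str:
    """Summarize the four switches into a single verdict."""
    inbound = s.inbound_tcp or s.inbound_udp
    outbound = s.outbound_tcp or s.outbound_udp
    if inbound:
        return "peer-to-peer"  # a Peer to Peer WebRTC connection can be performed
    if outbound:
        return "connection"    # a WebRTC connection can be performed
    return "none"              # all switches are false


print(overall_connectivity(Switches()))
print(overall_connectivity(Switches(outbound_udp=True)))
print(overall_connectivity(Switches(inbound_udp=True, outbound_udp=True)))
```

The inbound check comes first because it is the stronger result: a client that accepts inbound connections can connect directly to its peer.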

How Reliable Are the Results

The WebRTC connectivity results are “best guesses”: they emulate what actually goes on during the connectivity negotiation between the peers. However, reality can be, and always is, surprising. There is roughly a 10% chance that the client behaves contrary to what the connectivity tests suggested.

For example, even though the connectivity tests pass, the client may still be unable to establish a WebRTC connection because of content-based firewalls, the other peer’s IP being blocked, or a number of other unforeseen and undetectable issues. The opposite is also possible: when no connectivity is detected, it may still be possible to establish a connection through a secure TURN connection (TURN over TLS / DTLS).

So the ultimate test is to actually try and perform the WebRTC call.

How to Handle the Results

The optimal way to use Netscan is for troubleshooting and network monitoring. For example, attach a network scan to every ticket generated from your support portal; that way your Customer Service team will have a head start on troubleshooting, since they’ll have critical information about the client.

If you are using WebRTC you should use Netscan when a failed call occurs to study why the failure happened and what you can do about it.

During runtime you should advise the user of their detected connectivity, meaning you should show a warning if poor or no connectivity was detected. You should not forbid the user from attempting a connection, since it may still be possible to establish one.

A “poor connectivity” condition can also be inferred by examining the other network environment information available through Netscan’s Result Data Object (RDO), such as the XHR and WebSocket ping latency tests, which should not be higher than 100 ms, or whether the client is behind an HTTP proxy.
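
A check along these lines could look like the following sketch. The parameter names are hypothetical and do not reflect the actual field names of the RDO; only the 100 ms threshold and the HTTP proxy condition come from the text above.

```python
from typing import Optional

LATENCY_LIMIT_MS = 100  # threshold taken from the text above


def is_poor_connectivity(
    xhr_latency_ms: Optional[float],
    websocket_latency_ms: Optional[float],
    behind_http_proxy: bool,
) -> bool:
    """Flag poor connectivity from ping latencies and proxy status.

    A latency of None means that test produced no measurement and is skipped.
    """
    measured = [
        value
        for value in (xhr_latency_ms, websocket_latency_ms)
        if value is not None
    ]
    too_slow = any(value > LATENCY_LIMIT_MS for value in measured)
    return too_slow or behind_http_proxy


print(is_poor_connectivity(40, 55, False))
print(is_poor_connectivity(40, 180, False))
print(is_poor_connectivity(40, 55, True))
```

In line with the advice above, a positive result is best used to show a warning rather than to block the call attempt.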

The applications and integrations are boundless, however, and we’d love to hear how you use Netscan in your application!

Reposted from blog.csdn.net/dotphoenix/article/details/80469725