Improving WebRTC performance: The network level

28 March 2023
Imagine you’re on an important call while commuting to work and suddenly the line goes choppy, before you’re cut off completely for a few seconds. At last you see sunlight again as you drive out of a tunnel, your connection restored but valuable information lost along the way. This may not happen as much in 2023 as it did in 2010, but in today’s world there is a newer variant of the same issue. The symptoms range from choppy audio, frozen video, and delayed PowerPoint slides to outright disconnections, and the list goes on, but we probably don’t have to tell you that.

The drops are not caused by you driving through an actual tunnel, obviously, even though it may very well feel that way. So, what does cause these issues then? Well… as with anything, the answer is that “it depends”. In this blog post we’ll dive deeper into underlying topics that may drive instability in networks and how you can take control to solve this.

Assess the quality of your providers’ network for WebRTC servers

In our earlier blog, we explained the basics of things to look out for when selecting a compute & connectivity provider for your Real-Time Communication (RTC) application.

In this article, we explain how you can save on costs when selecting infrastructure providers from a bandwidth-management perspective, along with some points to look out for:

1. Test network quality

Don’t just use looking glasses or testing mechanisms for a single moment in time. Get your hands dirty and do live tests to assess network quality over a prolonged period.

2. Check physical and network presence

Location and regional connectivity matter much more than you might realize. Getting that edge location is great, but don’t forget to investigate peering connections to your users’ ISPs and make sure they are local.

3. Monitor network performance

Don’t treat testing as a one-off exercise either. Keep measuring latency, jitter, and packet loss on a provider’s network over time, so you spot degradation before your users do.

RTC applications run best on a consistent and stable network connection, ensuring a smooth and uninterrupted experience for your users. No two networks are designed the exact same way, which makes some more stable than others. While there are software tools out there that smooth over unstable networks or resolve connectivity issues, it is always better to avoid that extra overhead by selecting a network that is as stable as possible.
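Testing and monitoring network quality over a prolonged period (points 1 and 3 above) doesn’t have to be complicated. Below is a minimal sketch in Python that uses a plain TCP handshake as the latency probe; that’s an assumption for illustration, and a real test should exercise your actual media path and servers:

```python
import socket
import statistics
import time

def tcp_connect_rtt_ms(host, port=443, timeout=3.0):
    """Time a full TCP handshake to the host: a rough latency probe."""
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=timeout):
        pass
    return (time.perf_counter() - start) * 1000

def summarize(samples):
    """Condense a run of RTT samples (None = failed probe) into stats."""
    ok = [s for s in samples if s is not None]
    return {
        "failure_rate": (len(samples) - len(ok)) / len(samples),
        "median_ms": statistics.median(ok) if ok else None,
    }

def probe(host, count=60, interval_s=1.0):
    """Probe repeatedly: a one-off snapshot hides exactly the variation
    (rush hour, transient congestion) that hurts real-time traffic."""
    samples = []
    for _ in range(count):
        try:
            samples.append(tcp_connect_rtt_ms(host))
        except OSError:
            samples.append(None)  # record failures too: they matter for RTC
        time.sleep(interval_s)
    return summarize(samples)
```

Run `probe` against each candidate provider on a schedule for days or weeks, not minutes, and keep the failed probes in your stats: intermittent failures are precisely what separates providers that look identical on paper.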

Let’s explore a few underlying topics that may drive instability in networks.

Network factors that affect the performance of your WebRTC application

A deeper dive into common terms: Jitter, latency, and packet loss

If you notice jitter, latency, packet loss or any of the symptoms highlighted in the introduction of this blog, you might need to take a closer look at your provider. Congestion, faulty hardware, or routing through longer-than-necessary routes could all be reasons for a degraded service for your RTC application. But what does this truly entail within the context of RTC? Let’s take a closer look at what affects the performance of your RTC application from a network performance perspective.

Understanding network congestion in RTC

Network congestion occurs when the volume of data being transmitted over a network exceeds the network’s capacity to handle it, resulting in delays, packet loss, and reduced network performance. The effects of congestion can be particularly detrimental to RTC applications such as video conferencing: poor-quality video and audio, delays in communication, and even dropped calls, things we don’t want to see when we’re video calling our loved ones or sitting in our daily work meetings.

Network congestion can be caused by a variety of factors, including:

1. High network utilization

When a network is utilized at its full capacity, it can become congested, especially during peak hours.

2. Large file transfers

Transferring large files such as videos or images can use a significant amount of network bandwidth and cause congestion.

3. Network hardware issues

Network hardware such as routers and switches may fail or not have sufficient capacity on the redundant path, resulting in congestion.

4. Malware and security attacks

Malware or security attacks that generate a large amount of network traffic can also cause congestion.

5. Inefficient routing

Inefficient routing can cause network congestion by sending data packets on scenic routes (from Dubai to Dubai via London for example), increasing latency and causing delays.
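To see why high utilization (cause 1 above) degrades performance so sharply, a classic queueing model helps. The sketch below is an assumption-laden simplification (Python, and an M/M/1 queue standing in for real router behavior), but it captures the key point: delay doesn’t grow linearly with load, it explodes as traffic approaches link capacity.

```python
def mm1_delay_ms(service_rate_pps, arrival_rate_pps):
    """Average time a packet spends in an M/M/1 queue (waiting + service).

    A toy model, but it shows why congestion is non-linear: delay grows
    slowly at low utilization and explodes near capacity.
    """
    if arrival_rate_pps >= service_rate_pps:
        return float("inf")  # over capacity: the queue grows without bound
    return 1000.0 / (service_rate_pps - arrival_rate_pps)

# A link that can forward 10,000 packets/s:
low = mm1_delay_ms(10_000, 5_000)    # 50% load -> 0.2 ms
high = mm1_delay_ms(10_000, 9_900)   # 99% load -> 10 ms: 50x worse for 2x the load
```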

Understanding latency in RTC

Latency is a measure of the delay between the time a packet is sent from one point on a network and the time it is received at another point on the same network. Put simply: the time between me saying ‘good morning’ on a video call and you hearing me say it.

As we have seen, network congestion can increase latency as data packets wait to be transmitted through the network. On a professionally designed network, however, this should be the exception rather than the rule. There are some other factors that add latency:

1. Physical distance

The actual distance between two points on a network causes latency, as the data must be transmitted over physical fiber with the speed of light (roughly two-thirds of its vacuum speed, in fiber) as its hard limit. So you can expect latency to be higher when FaceTiming friends on the other side of the planet than when FaceTiming your loved ones in the next town over.

2. Network equipment

Routers, switches, and firewalls: every hop the data packets take adds latency, within a network just as between networks. Think of a postal delivery driver: if they could deliver your package in a straight line it would be much quicker, but they have to follow the road system and use the designated on- and off-ramps. In the same way, a packet is handled at every hop along its route before it reaches its destination, and each of those hops takes time.

3. Software

Any type of software can add latency, for example through protocol overhead or buffering. Buffering is perfectly normal in networking; it takes place whenever there is a speed difference between the input and output port of a device, i.e. when switching between ports with different transmission speeds. Between ports with the same transmission speed, that buffering is not needed. This is one reason the 25GbE standard was launched in 2015, five years after the 40GbE/100GbE standards of 2010: it uses a single lane of the same 25Gb/s signaling that 100GbE runs as four lanes, rather than the slower lane speed of the widely deployed 10GbE standard.
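Of the three factors above, physical distance sets a hard floor you can compute directly. A quick sketch (in Python; 200 km per millisecond is the approximate speed of light in fiber, and the Amsterdam to New York distance is an illustrative figure):

```python
# Light in optical fiber travels at roughly 2/3 of its vacuum speed.
FIBER_SPEED_KM_PER_MS = 200.0  # ~200 km per millisecond

def min_rtt_ms(distance_km):
    """Lower bound on round-trip time imposed by physics alone.

    Real routes add distance (fiber rarely runs in a straight line) and
    per-hop delays, so measured RTTs are always higher than this floor.
    """
    return 2 * distance_km / FIBER_SPEED_KM_PER_MS

# Amsterdam to New York is ~5,900 km as the crow flies:
# min_rtt_ms(5_900) -> 59.0 ms before a single router is involved
```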

Understanding packet loss (uplink and downlink) in RTC

Packet loss occurs when data packets being transmitted over a network do not reach their intended destination. Packet loss can occur in both uplink and downlink transmissions, depending on whether data is being sent from the user’s device to the network or from the network to the user’s device. Packet loss can be caused by a variety of factors, one of them being network congestion as we’ve already discussed, but it can also be network equipment or hardware, software issues, security measures such as firewalls, or network routing issues such as loops or misconfigured routes.

For real-time communication applications, packet loss is one of the worst issues you can encounter as it has a significant impact on the user experience. Not only can it cause distortion and artifacts, but it can lead to a complete loss of audio or video data, making your presentation completely inaudible.
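As an illustration of how a receiver detects this loss in the first place, here is a sketch in Python that estimates the loss fraction from gaps in RTP-style sequence numbers. It assumes monotonically increasing numbers for simplicity; real RTP uses 16-bit sequence numbers and has to handle wraparound.

```python
def packet_loss_fraction(seq_numbers):
    """Estimate loss from the gaps in received RTP-style sequence numbers.

    Assumes monotonically increasing sequence numbers without wraparound;
    a production implementation must handle 16-bit rollover.
    """
    if len(seq_numbers) < 2:
        return 0.0
    expected = seq_numbers[-1] - seq_numbers[0] + 1
    return (expected - len(seq_numbers)) / expected

# Received 1, 2, 4, 5 but 3 never arrived: 1 of 5 packets lost
# packet_loss_fraction([1, 2, 4, 5]) -> 0.2
```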

Understanding jitter in RTC

Jitter is a variation in the delay between the arrival of data packets over a network. In other words, jitter is the difference in the time it takes for different packets to arrive at their destination. Does it sound like latency? Yes, but no. Latency refers to the time delay between sending a data packet and receiving a response, while jitter refers to the variation in the delay between the arrival of data packets.

Queuing or buffering can be a cause of jitter, which is a result of variable latency. Jitter can have a significant impact on your users’ experience in RTC applications as it causes audio or video data to be out of sync or distorted.
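RTP receivers estimate jitter with a smoothed running calculation defined in RFC 3550 (the RTP specification). A sketch in Python:

```python
def interarrival_jitter(transit_times_ms):
    """Running jitter estimate as defined in RFC 3550.

    `transit_times_ms` are per-packet transit delays (arrival timestamp
    minus send timestamp). The estimator smooths absolute delay
    differences with a gain of 1/16, so one late packet nudges the
    estimate rather than spiking it.
    """
    jitter = 0.0
    for prev, cur in zip(transit_times_ms, transit_times_ms[1:]):
        d = abs(cur - prev)
        jitter += (d - jitter) / 16
    return jitter

# Perfectly paced packets have zero jitter, however high the latency:
# interarrival_jitter([50, 50, 50, 50]) -> 0.0
```

Note the distinction this makes concrete: a path with a constant 200 ms delay has high latency but zero jitter, while a path averaging 50 ms whose packets arrive erratically can sound far worse.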

Now that we’ve covered the fundamental elements that can make your video call feel like you’re driving in a tunnel, let’s talk about how you can get started with infrastructure providers to ensure as stable a network as possible.

Evaluating network providers for best WebRTC performance

We’re not going to get into the nitty gritty about networks here but rather keep the discussion relevant to RTC applications as that is the part often overlooked. You may have found an infrastructure provider with the best global network on paper — how does it stack up in the real world?

Private backbone

Does your connectivity and/or compute provider own its own backbone? If so, this is great for when you have users on your platform communicating with each other from different countries and even continents. A private backbone gives your infrastructure provider additional control over how the data packets for your RTC application are routed, without having to hand them over to a third-party network.

This control in turn typically results in the following:

1. Reduced latency

A private network backbone can provide a more direct and faster connection between servers and clients, resulting in reduced latency and improved real-time communication performance for WebRTC applications.

2. Improved security

With a private network backbone, traffic between servers and clients can be kept within the private network, providing an additional layer of security against potential threats and attacks.

3. Better quality of service

A private network backbone can provide more control over network traffic and bandwidth, allowing for better prioritization and management of traffic to ensure consistent quality of service for WebRTC applications.

4. Increased reliability

Private network backbones are often designed with redundancy and failover mechanisms to ensure high availability and minimize downtime for WebRTC applications.

5. Customization

With a private network backbone, businesses have more control and flexibility to customize their network to meet the specific needs of their WebRTC application and users.

Redundancy

A follow-up question is then: how redundant is their network? If there’s only one connection between the US and Europe and that connection is faulty, you’d still be in trouble, left with either an unstable network or one that has to send data packets the long way around the world on a scenic route. Two lines across the ocean? Better. Three? Great.

Link sizes

Regardless of whether we are talking about a backbone or connections to other providers (especially eyeball networks), you should know how much data your application will send to a specific group of users during rush hour. For example, if your provider has a 1G connection to Internet Service Provider X but at peak hours your application demands 10G, you are going to have problems very often. Network congestion will kick off a cascade of issues like jitter, packet loss, and added latency. If you’re then hosted with a provider that won’t, or simply can’t, upgrade to a larger uplink, you’ll regret not doing your due diligence.
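A back-of-the-envelope sizing check can flag this mismatch early. A sketch in Python; the per-stream bitrate and the 30% headroom figure are illustrative assumptions, not standards, so substitute your own traffic measurements:

```python
def peak_link_demand_gbps(concurrent_users, avg_stream_mbps, headroom=1.3):
    """Back-of-the-envelope peak demand for a single uplink.

    `headroom` pads for bursts and retransmissions; 30% is an assumed
    starting point, not a standard, so tune it to measured traffic.
    """
    return concurrent_users * avg_stream_mbps * headroom / 1000

# 3,000 concurrent users pulling ~2.5 Mbit/s each:
# peak_link_demand_gbps(3_000, 2.5) -> 9.75 Gbit/s: a 10G uplink is already tight
```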

Peering

It’s not only the capacity of the uplinks to internet service providers and eyeball networks that is crucial for your most important target users; it’s also how many of them are directly connected to your network provider of choice. The fewer hops, the quicker your data packets arrive at their destination, so you’d want to make sure your network of choice peers directly with the ISPs and eyeball networks that matter most to your customers.

Geographic disparity

This should be a no-brainer: the closer your compute resources are to your users, the better. We’ve put it lower on our list, though, because proximity is sometimes pursued without rationale. Being physically closer does not always mean a better experience; a nearby location with poor peering can perform worse than a slightly more distant, well-connected one.

Those five themes cover the most important questions to ask yourself when selecting a network provider for real-time communication applications. There are, however, two other areas you should be looking into.

Compute resources

As all your data packets have to pass through the network interface card (NIC) in the compute resources you’re using, it will pay off to investigate which NICs your provider offers as standard. It is also worth looking into the CPU-offloading capabilities of those NICs, so you can fit more users onto the same box without sacrificing quality: lower costs and less power used, which is also better for the environment.

Software

Aside from the points above about infrastructure providers, don’t forget to take a good look at the software you’re using as well. What if, for instance, you’re using a third-party Software Development Kit (SDK) for your RTC application? Run tests with different SDK providers while you still have the opportunity, as the software layer can have a significant impact on your users’ experience when the network is unstable.

Some infrastructure providers use software overlays on the public internet to optimize performance for last-mile connections and device-specific conditions. Be mindful of these, especially when combining them with third-party WebRTC SDKs, as you won’t know in advance how compatible the two are.

Don’t just trust the numbers: Real-life network testing is critical for RTC applications

Regardless of whether you are scaling up your application, looking for alternatives to save costs, or have just started building a real-time communication application, you must move beyond testing on paper and get out there to test real-life conditions. Remember the all-important rule: every network is unique. Each and every one of them.

What does that mean? Well, it means that each infrastructure provider has different networks, designed differently, using different network equipment and with different connections. But it also means that the network from infrastructure provider A is not always functioning the same way, every single minute of every single day. What happens to your application when provider A has network congestion? How does it affect your users? That outcome you’ll only truly see when it happens and the differences between network providers may be stark.

So, the easy answer here is: do your homework and select the best provider on paper. The difficult answer is: do your homework, select the best providers on paper, use their services for a prolonged period, and monitor how your application performs on their infrastructure and their network. Then confirm your own research with your users: what mean opinion scores do you get with each provider? If that matches the network at the top of your list from your own monitoring, you have a winner. Simple? No… but will you have a battle-tested infrastructure strategy for your RTC application when you scale up? Yes.

That about sums it up.

Main Take-Aways

Network performance is crucial for the stability of your RTC application. All we want to leave you with is the reminder to not jump in too quickly with one provider — do your research, do real-world tests, and validate your findings with your end users.

Get in touch with our experts and discover what our network can do for your game.