Platform integrations & their lock-in risks for game developers

22 April 2021

Interest in online gaming continues to grow, with the medium becoming more popular than ever during our socially-distanced times. Arguably, the pandemic will leave a lasting mark on the make-up of games, with more multiplayer and online titles in the pipeline.


With this in mind, game developers should take a closer look at how their design and game hosting decisions impact the game (and the bottom line!) in the long run. In my opinion, five elements weigh heavily in those calculations: decisions you make during game development, influenced by your intended global reach and latency constraints.

In this part of the written version of our webinar, we discuss integration options & lock-in in more detail. You can also watch the recording of our webinar above and click on the integration options & lock-in chapter. Please remember to accept all cookies, otherwise the video might not show up for you.

“Ultimately what I would recommend is go for the total freedom of choice.”

Stefan Ideler, CTO i3D.net

Integration options & lock-in

Today, barely anyone puts all of their load on a single cloud provider, nor does anyone buy 100% of the hardware projected to be needed for the game. Those who do usually do so because of legacy, platform lock-in, or because they simply haven’t yet had a negative experience with a single provider: servers sold out, large required commits, problematic latency, or unexpectedly large bandwidth bills.

However, the cloud providers are not stupid. It’s in their best interest to lock you into their platform, preferably theirs alone. Generally, such a platform is very easy to set up yet very difficult to move away from once you’ve implemented its unique features.

We’ve had many conversations with (potential) customers who were using specific features of a cloud platform, which prevented them from using other hardware resources, even in regions where that cloud platform was not available.

But no-one will deny that, because of the ease of development, the hyperscalers are extremely well positioned. It’s just that once your game does become successful, the costs will start to hurt if you did not prepare your game to work with multiple options.

The core point is to work with multiple cloud providers, which gives you more chips to bargain with. You then have an alternative in case an instance type is sold out, has problems, or a provider simply refuses your quota requests. There are also large parts of the world that are underserved by the big cloud providers: any single one of them has significant coverage gaps in regions like Russia, the Middle East, Africa, and Latin America.

If you pick through the marketing, dealing with clouds at large volumes is almost the same as dealing with large bare metal suppliers. It’s all about the numbers: the spend you’re willing to commit per month (or year on year), the term of the individual server commits, and how much you want to pay up front to have that capacity guaranteed when you need it during a burst. Of course, at the individual VM level you’re flexible. At large numbers, however (which is what any somewhat successful game quickly becomes), you’ll need to make arrangements with your cloud providers. If you go in unprepared, you risk getting stuck, being hit with costs you did not foresee, and ending up with a problem that could seriously affect your game’s bottom line.

Ultimately, what I would recommend is to go for total freedom of choice. Do not lock yourself into a technology which only allows you to use compute of a certain flavour. Instead, build your own platform, or use one with an SDK that is easy to integrate and works with any form of cloud or compute. That way you can combine static bare metal capacity for your base game needs with multiple cloud providers per region for your flexible load.

Common forms of game servers

It is also interesting to cover the three common forms of game servers we see, as the technology side keeps evolving as well.

The original is what I like to call Fire & Forget: the game server starts up and keeps running 24/7. It reports to a server list or matchmaker, and players scroll through these lists, which sometimes contain thousands of servers, to pick the one they’d like to join. In some cases they want to play with a friend or fellow clan member, for example, and this model makes it easy for them to pick and choose, even if the game is already in progress.
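As a minimal sketch of that flow (all class names, fields, and data structures here are illustrative assumptions, not any real master-server API), Fire & Forget boils down to servers heartbeating into a list that players browse:

```python
import time
from dataclasses import dataclass


@dataclass
class GameServer:
    name: str
    address: str
    players: int = 0
    max_players: int = 16
    in_progress: bool = False
    last_heartbeat: float = 0.0


class ServerList:
    """In-memory model of a server browser backend: servers heartbeat
    in, players scroll through the resulting list."""

    def __init__(self, timeout_s: float = 60.0) -> None:
        self.servers: dict[str, GameServer] = {}
        self.timeout_s = timeout_s

    def heartbeat(self, server: GameServer) -> None:
        # Each always-on server re-registers periodically to stay listed.
        server.last_heartbeat = time.time()
        self.servers[server.address] = server

    def browse(self, include_in_progress: bool = True) -> list[GameServer]:
        # Drop servers that stopped heartbeating; the rest stay listed
        # 24/7, whether anyone is playing on them or not.
        now = time.time()
        live = [s for s in self.servers.values()
                if now - s.last_heartbeat < self.timeout_s]
        if not include_in_progress:
            live = [s for s in live if not s.in_progress]
        return sorted(live, key=lambda s: s.players, reverse=True)
```

In a real game the heartbeat would go over the network to a master server, but the picking-and-choosing logic is essentially this.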

Fire & Forget servers generally don’t have dependencies on platforms or technology stacks and as such are easy to work with. As an indie developer, for example, this is by far the easiest way to set up a live game. However, they are harder to optimize in terms of costs since you’ll have a lot of underutilized capacity running which is essentially burning resources.

To solve this, a new type of game server appeared which makes use of an Allocation model. In this model, players can usually only join a fresh match and have no option to select a server. When players, usually of the same rank or skill level, want to start a match, the matchmaker sends an allocation request to a platform service. The platform then provides a game server which is ready to accept these players, and the match is made. It’s the platform’s task to keep just enough game servers in a ready, online state to serve the incoming match requests from the matchmaker.
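The allocation handshake can be sketched roughly like this; the class and method names are assumptions for illustration, not any real platform’s SDK:

```python
from collections import deque
from typing import Optional


class AllocationPlatform:
    """Keeps a pool of ready game servers and hands one out per match."""

    def __init__(self) -> None:
        self.ready: deque[str] = deque()   # addresses of warm, empty servers
        self.allocated: set[str] = set()

    def register_ready(self, address: str) -> None:
        # A freshly booted game server reports that it can accept players.
        self.ready.append(address)

    def allocate(self) -> Optional[str]:
        # The matchmaker calls this once it has a full group of matched
        # players; they all connect to the returned address.
        if not self.ready:
            return None  # pool exhausted: the platform must scale up first
        address = self.ready.popleft()
        self.allocated.add(address)
        return address

    def release(self, address: str) -> None:
        # Match over: the server resets and rejoins the ready pool.
        self.allocated.discard(address)
        self.ready.append(address)
```

The matchmaker never picks hardware; it only asks for “a ready server,” which is exactly what makes the compute behind the pool swappable, or lockable.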

In practice this means that a warm buffer pool is generally used for this purpose, with just enough servers to account for the time it would take to start up a new bare metal machine or virtual machine. The pool also ensures players can always play. On a downtrend in player numbers, for example after the evening peak, it’s the platform’s task to terminate VM instances and flexible bare metal capacity, which greatly reduces the resources needed to host the game.
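To make the buffer sizing concrete, here is one rule of thumb (the formula and the 1.5x safety factor are assumptions for this sketch, not a universal rule): keep enough warm servers to absorb the allocations that arrive while a replacement instance is still booting.

```python
import math


def warm_buffer_size(allocations_per_min: float,
                     startup_time_min: float,
                     safety_factor: float = 1.5) -> int:
    """Ready-but-idle servers to keep so allocations never wait on a boot.

    Demand arriving while a new bare metal machine or VM is still
    starting up must be absorbed by servers that are already warm.
    """
    return math.ceil(allocations_per_min * startup_time_min * safety_factor)


# E.g. 20 matches/min with a 3-minute instance boot time and a 1.5x
# safety margin suggests keeping 90 servers warm; after the evening
# peak the same formula shrinks the pool as the request rate drops.
```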

With an allocation-based model you are therefore locking yourself into a platform of choice, ideally one that leaves you free to choose which compute resources to use. The flipside is that it is a lot more cost effective than the Fire & Forget model. Another benefit is that players don’t have to scroll for ages through a huge server list to find a suitable match.

The third variant is a Container-based model for game servers, which is a more recent development. Up until recently, most game servers have been large monolithic processes. The only way to scale them is vertically, with ever more GHz per core, just to squeeze out that last bit of performance. In our experience, multithreading support in game server processes is rare, so the focus was always on those GHz per core.

What we now see across the industry are efforts to engineer games around this problem: using many small processes, otherwise known as microservices, that can individually scale up and down based on the specific demands of the game world. This is sometimes referred to as ‘cloud native’ game servers. It is not necessarily a cost-saving measure; we believe it can greatly improve the game world and create even more immersive experiences by making more things possible for developers.
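As an illustration of that per-service scaling (the service names, demand metrics, and thresholds below are invented for the sketch), each microservice gets its own replica count from its own demand signal, instead of one monolithic process chasing GHz per core:

```python
import math


def desired_replicas(current_load: float, target_load_per_replica: float) -> int:
    # Classic horizontal-scaling rule: enough replicas that each one
    # stays at or below its target load, never fewer than one.
    return max(1, math.ceil(current_load / target_load_per_replica))


# Each small world service scales independently on its own signal.
world_services = {
    "physics-shard": (1800.0, 250.0),    # entities simulated
    "ai-director":   (90.0, 40.0),       # active AI agents
    "chat":          (12000.0, 5000.0),  # connected clients
}

scaling_plan = {name: desired_replicas(load, target)
                for name, (load, target) in world_services.items()}
```

A busy physics shard can scale out without touching the chat service, which is exactly what a monolithic game server process cannot do.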

When you are no longer limited by a single physical box (a VM is, in the end, also limited by its physical components), the possibilities are endless. Personally, I’m eagerly looking forward to what these initiatives can bring to gamers. Perhaps they will fail, but maybe they will deliver such a revolutionary game experience that it becomes the new triple-A standard.

Let’s look at a real-life example of the vendor lock-in I described earlier in this section. Some time ago we met with the developer of a promising battle royale game, which later turned out to be one of the games that defined the genre. At the time, the game was a sleeper hit, but from the development perspective the costs of their cloud solution were also quickly going up.

Unfortunately, they were technically locked in. With the success of the game, which required a lot of attention, they simply could not free up enough resources to break out of their lock-in at that time.

The game turned out to be a monster hit, but with a significant infrastructure cost. After the big rush passed, the game developer did end up making the choice to break free of the technical lock-in and started using other platforms which allowed more freedom.

For this reason, I’d really recommend looking at your options during game development. Once your game is out the door and becomes successful, you’re going to want to focus on your players’ demands. If you think about the hidden costs of breaking out of a technical lock-in while still in development, you can greatly reduce costs over the entire lifecycle of your game.

Main Take-Aways

While developing your game, you should seriously consider what type of server authority you want. For most live games, I would not recommend a simple peer-to-peer model, since it can heavily affect the player experience and the potential income of your game.