Maximize game server infrastructure efficiency: True cost per CCU

21 February 2023

Regardless of your infrastructure strategy, it is always crucial to optimize your game's true cost per concurrent user. There are many methodologies for measuring the success of a game. Some do this through Monthly Active Users (MAU) or Daily Active Users (DAU), metrics you are used to seeing on Steam charts and other data sources; these statistical milestones measure how many people play the game over time. However, there are two metrics we deem more important for an efficient infrastructure footprint: Concurrent Users (CCU) and Peak Concurrent Users (PCU).


PCU vs. CCU

CCU refers to the total number of players online at the same time, whereas PCU is the highest CCU your game experiences over a certain amount of time. In the chart below, for example, you'll see the CCU per hour: at 1PM the game had 650 CCU, and at 5AM 500 CCU. The PCU is the highest CCU over a specific period; if we look at the graph again, the PCU during this timeframe occurred at 12AM with 1800 CCU.

Figure: CCU per hour
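To make the distinction concrete, below is a minimal Python sketch that derives the PCU from hourly CCU samples. The hourly values are illustrative, loosely matching the graph above.

```python
# A minimal sketch: deriving PCU from hourly CCU samples.
# The sample values are illustrative, loosely matching the graph above.
hourly_ccu = {
    "12AM": 1800,
    "5AM": 500,
    "1PM": 650,
}

# PCU is simply the highest CCU observed in the window.
pcu = max(hourly_ccu.values())
peak_hour = max(hourly_ccu, key=hourly_ccu.get)
print(f"PCU: {pcu} at {peak_hour}")  # PCU: 1800 at 12AM
```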

The simple reason CCU and PCU are important metrics for setting up and managing an efficient infrastructure strategy is that you'll need to match your CCU (and potentially your PCU) with your infrastructure resource needs over time.


The true cost of over-committing

Let's take the graph above as an example again.

In this specific example, if you run compute resources that can support over 1800 CCU at all times, you are over-committing, incurring the cost of wasted resources. However, even if you run compute resources for, say, 1600-1800 CCU, you may still be over-committing, depending on the price offerings from different vendors (in some cases, simply committing to your PCU would still be economically more efficient).

The true cost of under-committing

Under-committing means running fewer compute resources than your players demand. Worst case, your community would be unable to play your game, since your CCU would exceed the capacity of your compute resources, or players would need to wait to access servers. The upside for the studio is obviously a lower infrastructure bill, but at what cost? Losing players and getting flak for it in the community?

So, both over- and under-committing have their downsides, and committing the exact amount needed is very difficult: player forecasting (especially prior to launch) is always based on assumptions.

Finding the perfect middle ground between bare metal and cloud

This was already an issue with multiplayer games prior to the cloud. Today, everyone is aware that you can commit cloud resources per hour, and sometimes even per minute, allowing you to neither over- nor under-commit, but simply leverage your provider's flexibility to scale resources up and down as needed. This, however, also has its downsides: in this model your vendor not only charges for compute resources, but also prices in the risk of having machines sit idle. In other words, compute resources are generally more expensive in the cloud. Calculating the middle ground between committing dedicated machines such as bare metal and the cost of scaling up in the cloud is therefore an important exercise for running an efficient infrastructure footprint.
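As a rough illustration of that exercise, here is a minimal Python sketch that sweeps the size of a committed bare-metal base layer against on-demand cloud bursting. Every price and capacity figure below is a hypothetical assumption, not a vendor quote.

```python
# A rough sketch of the bare-metal vs. cloud "middle ground" exercise.
# All prices and capacities below are hypothetical assumptions.
BARE_METAL_MONTHLY = 250.0   # assumed $/machine/month, billed whether used or not
CLOUD_HOURLY = 0.90          # assumed $/machine/hour, billed only while running
SESSIONS_PER_MACHINE = 20    # assumed game sessions one machine can host

def monthly_cost(base_machines: int, hourly_sessions: list[int]) -> float:
    """Commit `base_machines` on bare metal; burst the remainder into cloud."""
    cost = base_machines * BARE_METAL_MONTHLY
    for sessions in hourly_sessions:
        machines_needed = -(-sessions // SESSIONS_PER_MACHINE)  # ceiling division
        burst = max(0, machines_needed - base_machines)
        cost += burst * CLOUD_HOURLY
    return cost

# Hypothetical month: 600 off-peak hours at 120 sessions, 130 peak hours at 180.
hourly_sessions = [120] * 600 + [180] * 130
best = min(range(10), key=lambda n: monthly_cost(n, hourly_sessions))
print(f"cheapest base layer: {best} machines, "
      f"${monthly_cost(best, hourly_sessions):,.2f}/month")
```

Fed with your own CCU forecast and real quotes, a sweep like this shows where committing more dedicated machines stops paying off.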

Calculating the cost of your compute resources based on CCU

Calculating the amount of compute resources you need, as well as the associated costs, is usually no easy feat. Several data points are relevant in this decision-making process:

If you know your estimated CCU and the number of players per game session, you can calculate an estimated number of game sessions running simultaneously over time.

If you know the amount of RAM and the number of cores you need per game session, you can calculate how much compute is necessary to accommodate the CCU.

And if you know how much bandwidth egress each player uses during a game session, you can calculate how much bandwidth you would utilize during a specific timeframe.

Put together, these three data points give you the basic metrics needed to calculate your "base layer": the point at which committing resources becomes economically less efficient for your game.

Let's use a very simple example based on the graph above, with the following specifications: a PCU of 1800, 10 players per game session, 2GB of RAM and 1 core per game session, and 0.05GB of bandwidth egress per player per session.

Based on these assumptions, at our PCU we would need to accommodate 180 game sessions, and therefore (excluding any overhead or inefficiencies) 360GB of RAM and 180 cores, a 2:1 ratio, with 90GB (1800 × 0.05) of bandwidth used across these 180 sessions.
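The same arithmetic as a short Python sketch, with the per-session specs from the example spelled out:

```python
# The worked example above as code. The per-session specs match the
# example's assumptions: 10 players, 2GB RAM and 1 core per session,
# 0.05GB of egress per player.
pcu = 1800
players_per_session = 10
ram_per_session_gb = 2
cores_per_session = 1
egress_per_player_gb = 0.05

sessions = pcu // players_per_session        # 180 concurrent game sessions
ram_gb = sessions * ram_per_session_gb       # 360GB of RAM
cores = sessions * cores_per_session         # 180 cores (2:1 RAM-to-core ratio)
bandwidth_gb = pcu * egress_per_player_gb    # 90GB of egress

print(sessions, ram_gb, cores, bandwidth_gb)  # 180 360 180 90.0
```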

If you do this for multiple timeframes over a longer period, you can calculate the amount of compute resources needed to accommodate your CCU. With these numbers, you can then input prices from various providers, from bare metal to the major cloud companies, and calculate your optimal efficiency point cost-wise. Do not forget, however, to include the cost of bandwidth, as it can add up quickly and shift your optimal efficiency point. Many game developers ignore this cost, only for it to skyrocket later and add a substantial amount to operating costs in the long run. You must also consider the cost of scaling up and down through a third-party orchestrator (usually billed on API calls or other usage-based metrics) or, if you build the game server backend yourself, the cost of your own team.
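To illustrate how bandwidth can shift the comparison, here is a hedged sketch that totals compute and egress costs per provider. Every price is a hypothetical placeholder, and the monthly egress figure is a deliberately crude extrapolation from the peak-hour number above.

```python
# A rough sketch: comparing providers on compute plus bandwidth.
# All prices are hypothetical placeholders, not actual vendor quotes.
providers = {
    "bare_metal_vendor": {"core_hour": 0.010, "gb_egress": 0.01},
    "cloud_vendor":      {"core_hour": 0.045, "gb_egress": 0.09},
}

cores_needed = 180           # from the worked example above
hours = 24 * 30              # one month of peak-sized capacity (worst case)
egress_gb = 90 * 24 * 30     # 90GB per peak hour, crudely extrapolated to a month

for name, price in providers.items():
    compute = cores_needed * hours * price["core_hour"]
    bandwidth = egress_gb * price["gb_egress"]
    print(f"{name}: compute ${compute:,.0f} + bandwidth ${bandwidth:,.0f}"
          f" = ${compute + bandwidth:,.0f}")
```

Even in this toy comparison, egress can dominate the total on one provider while being a rounding error on another, which is exactly why leaving bandwidth out of the calculation skews your efficiency point.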

Forecast and player prediction

We have treated player prediction very lightly in this article to keep the idea understandable. We are, however, very much aware that it may well be the most difficult aspect of calculating the optimal point for your compute resources prior to launch, as consumer behavior and interest are simply hard to predict. Sure, factors such as marketing, competition, and past performance (in the case of a running IP) may greatly improve your forecasting ability, but it is still impossible to perfectly predict the success of a game, especially in the ever-evolving market we operate in. Therefore, no matter how you calculate your most efficient compute resource strategy, it is always good to have some sort of scaling mechanism ready to go should your CCU go through the roof at launch, as well as a way to scale down on resources to ensure you're not stuck with the costs of infrastructure without the players.

Main Takeaways

Maximizing game server efficiency is crucial to reducing the cost per concurrent user. There are multiple approaches to calculating the cost of your compute resources based on CCU, such as compute costs based on estimated CCU over time, the RAM-to-core ratio, players per game session, and more.