Scaling up versus scaling out: Increasing compute resources effectively

14 November 2023

In a nutshell: This article looks at the critical distinction between two primary scaling strategies in infrastructure management: scaling up (vertical scaling) and scaling out (horizontal scaling). We take a closer look at factors such as architecture, resource requirements and expected traffic patterns that need to be considered when you must expand your infrastructure footprint.


Scaling is a fundamental concept in the world of application deployment and infrastructure management. In this discussion, we delve deeper into high-performance use cases such as real-time applications and, specifically, gaming.

But before we get there, it is important to understand the difference between scaling up and scaling out, and to establish the relative importance of individual servers: some use cases depend heavily on one crucial server, while others rely on the entire fleet equally.

Scaling up versus scaling out

Scaling up is adding memory or computing power to an existing server.

Scaling out is adding additional servers for more capacity.

The choice between scaling up (vertical scaling) and scaling out (horizontal scaling) depends on the specific requirements and characteristics of the application. 

Here’s a breakdown of when to use each method and some examples.

When to scale up?

Resource-intensive applications

Resource-intensive applications that require a significant amount of CPU, RAM, or other resources may benefit from vertical scaling. For instance, a database server handling big data may need more memory and CPU power to improve query performance. Legacy monolithic applications that cannot be easily broken down into microservices may also require vertical scaling to handle increased loads.

Planning for redundancies

Additionally, high-performance use cases depend on constant, reliable uptime. This is only possible if the network or service provider has backups and failsafes in place to prevent outages and other issues; there should always be a contingency plan that provides adequate redundancy to mitigate unplanned failures. In situations where redundancy is challenging or expensive to implement, vertical scaling can help by increasing the capacity of a single server.

When is scaling out the best option?

Applications designed as microservices

Many modern web applications can benefit from horizontal scaling, especially when there’s a need to handle increased traffic or user demand. You can add more servers to distribute the load. Applications designed as microservices can easily scale horizontally by deploying multiple instances of each service, allowing for greater flexibility and agility.
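As a rough illustration, horizontally scaled deployments are often sized with a load-proportional rule: grow the replica count by the ratio of observed load to target load (the same shape of formula Kubernetes' Horizontal Pod Autoscaler documents). A minimal sketch, with purely hypothetical numbers:

```python
import math

def desired_replicas(current_replicas: int, current_metric: float, target_metric: float) -> int:
    """Scale the instance count by the ratio of observed load to target load.
    Same shape as the Kubernetes HPA formula:
    desired = ceil(current * currentMetric / targetMetric)."""
    return max(1, math.ceil(current_replicas * current_metric / target_metric))

# Example: 4 instances averaging 90% CPU against a 60% target.
print(desired_replicas(4, 90, 60))  # -> 6
```

In practice, real autoscalers add cooldown windows and tolerance bands around this ratio so the fleet does not oscillate on every metric fluctuation.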

Stateless applications

Applications that don’t rely on the maintenance of session state information can be easily scaled out, as requests can be distributed to any available instance without concern for preserving user-specific data. Additionally, containerized applications managed by orchestration platforms like Kubernetes can scale out by adding more containers or pods, making it easy to manage and distribute workloads. Video games that deploy servers based on matchmaking fit into this category.
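Because no instance holds user-specific state, a load balancer can route each request to whichever server comes next. A minimal round-robin sketch (server names are hypothetical):

```python
import itertools

class RoundRobinBalancer:
    """Minimal sketch: with stateless instances, any request can go to any
    server, so simple rotation suffices and no session affinity is needed."""

    def __init__(self, servers):
        self._cycle = itertools.cycle(servers)

    def route(self, request):
        # Pick the next server in rotation; the request carries no state
        # that ties it to a particular instance.
        return next(self._cycle), request

balancer = RoundRobinBalancer(["game-1", "game-2", "game-3"])
assignments = [balancer.route(f"match-{i}")[0] for i in range(6)]
print(assignments)  # each server receives every third request
```

Adding capacity then means appending servers to the pool; nothing about existing requests has to move.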

Best practices of scaling up versus scaling out

Finding the right hardware and software mix

In practice, many applications may require a combination of both vertical and horizontal scaling depending on various factors, including the application’s architecture, resource requirements and anticipated traffic patterns. The choice between scaling up or scaling out should be made based on the specific needs of the application and the underlying infrastructure.

For example, on a dual-socket machine, each socket has its own memory attached to it. If code running on one CPU needs to access memory attached to the other CPU, it takes a latency hit, and you need to structure your code to account for that extra latency. The interaction between the code and the hardware is therefore also an important factor to consider.
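One common mitigation is to pin each worker to cores on a single socket so that the worker and the memory it allocates stay on the same NUMA node. A minimal sketch, assuming contiguous core numbering (which varies by platform; check `lscpu` or libnuma on real hardware):

```python
def socket_affinity(num_sockets: int, cores_per_socket: int):
    """Partition logical CPU ids by socket, assuming socket 0 owns cores
    0..cores_per_socket-1, socket 1 the next block, and so on. This numbering
    is an assumption -- real topologies should be read from the OS."""
    return {
        s: set(range(s * cores_per_socket, (s + 1) * cores_per_socket))
        for s in range(num_sockets)
    }

# On Linux, a worker could then pin itself to socket 0's cores with:
#   os.sched_setaffinity(0, socket_affinity(2, 8)[0])
print(socket_affinity(2, 4))  # -> {0: {0, 1, 2, 3}, 1: {4, 5, 6, 7}}
```

Keeping a worker's threads and its data on one socket avoids the cross-socket memory hop described above.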

Bandwidth considerations

Another important factor is the bandwidth of the server. When we talk about bandwidth, we often think only of network bandwidth, but the bandwidth capacity inside the machine matters just as much. Developers, network engineers and game-hosting experts must carefully tune the hardware design to minimize latency.
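To make in-machine bandwidth concrete, here is a rough sketch that times a large buffer copy to estimate effective memory copy throughput. The number is only indicative: caches, the allocator and the OS all influence it.

```python
import time

def memory_copy_bandwidth(size_mb: int = 64, repeats: int = 5) -> float:
    """Rough estimate of in-machine memory bandwidth, in GB/s, by timing
    repeated copies of a large buffer. Indicative only, not a real benchmark."""
    buf = bytearray(size_mb * 1024 * 1024)
    start = time.perf_counter()
    for _ in range(repeats):
        _ = bytes(buf)  # forces a full read + write of the buffer
    elapsed = time.perf_counter() - start
    # Each copy moves the buffer twice (read source, write destination).
    return (2 * size_mb * repeats) / 1024 / elapsed

print(f"~{memory_copy_bandwidth():.1f} GB/s effective copy bandwidth")
```

On a dual-socket machine, running this with the process pinned to different sockets (and the buffer allocated on the other) would show the cross-socket penalty directly.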

In summary

Scaling is a fundamental concept in application deployment and infrastructure management. The use case determines whether vertical scaling (adding resources to a single server) or horizontal scaling (adding more servers) is the better fit. Either way, it is important to take factors such as architecture, resource requirements and traffic patterns into consideration when deciding to increase resources. Server bandwidth, and the relationship between code and hardware, should also be weighed when optimizing for latency.

Main Take-Aways

Increasing compute resources becomes necessary whenever a product or service sees growth in the market. The best way to do this varies depending on several factors. This article looks at the types of applications that call for each kind of scaling, alongside other considerations such as architecture, traffic patterns and more.