Product Strategy Lead
Sep 7, 2023 | 3 min read
Containerization has rapidly become a fundamental tenet of cloud-native computing, transforming how applications are developed, deployed, scaled, and managed.
The containerization ecosystem also needs software that orchestrates, deploys, scales, and maintains the containers that make up an application, as well as hardware infrastructure capable of running and scaling those containerized workloads with ease.
As far back as June 25, 2020, Gartner published a press release forecasting “strong revenue growth for global container management software and services through 2024.” In the same release, Gartner projected that global revenue from container management software would reach $944 million in 2024, up from $465.8 million in 2020.
Clearly, container orchestration software is here to stay. Several orchestration platforms, however, compete for a place in the cloud-native architecture or stack.
Therefore, the question is: which container orchestration platform is best of breed?
The brief answer to this question is Kubernetes.
Kubernetes was initially developed by Google, open-sourced, and donated to the Cloud Native Computing Foundation (CNCF) in 2015. Fast forward to the CNCF Cloud Native Survey 2020, which reported that 83% of the organizations surveyed run Kubernetes in production. Consequently, it is considered the de facto container orchestration platform.
While Kubernetes is a complex and sophisticated system with many moving parts, it is specifically designed to automate the deployment, scaling, and management of containerized applications. In lay terms, Kubernetes is very good at its core function: managing cloud-native, containerized applications.
A research paper published in the Journal of Cloud Computing describes Kubernetes as follows:
“Overall, Kubernetes offers a powerful and flexible solution for managing containerized applications in production environments.”
As highlighted several times in this text, one of Kubernetes' core functions is to orchestrate and automate the scaling of the containers running inside Kubernetes pods.
Kubernetes offers three types of scaling: horizontal, vertical, and cluster scaling. Vertical and cluster scaling are beyond the scope of this article, although they play equally important roles in keeping a cloud-native application highly available. This discussion focuses on horizontal scaling and its benefits and challenges.
Horizontal scaling is what makes an application elastic. The research paper titled “A Fine-Grained Horizontal Scaling Method for Container-Based Clouds” notes that horizontal (or elastic) scaling gives an application the ability to adjust dynamically to the workload at any given moment by automatically changing the “number of pods in a replication controller, deployment, replica set, or stateful set based on observed CPU utilization.”
How?
The official Kubernetes documentation provides the answer as follows:
Kubernetes contains an API resource and controller known as the HorizontalPodAutoscaler (HPA). Based on the application’s workload, the HPA automatically deploys additional pods when demand for resources rises and removes pods when demand falls.
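Under the hood, the HPA controller works from a simple rule documented in the Kubernetes reference: desiredReplicas = ceil(currentReplicas × currentMetricValue / desiredMetricValue). As a worked example, if four pods are averaging 90% CPU utilization against a 60% target, the controller scales the workload out to ceil(4 × 90 / 60) = 6 pods; when utilization falls back, the same calculation scales it in again.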
The best way to describe how the HPA horizontally scales a containerized application in and out is to consider the following use case:
Imagine a tech startup offering a SaaS chatbot that provides medical and psychological support, along the lines described in the research paper titled “Using Chatbots to Support Medical and Psychological Treatment Procedures.”
This bot is designed to provide support before and after medical procedures such as colonoscopies and hip replacements, coach patients diagnosed with chronic conditions like diabetes and Crohn’s disease, and provide essential psychological support for people diagnosed with autism and severe depression.
The chatbot is powered by a Large Language Model (LLM) paired with a vector database. The company has added Retrieval-Augmented Generation (RAG) to reduce the LLM’s hallucinations, improve its accuracy, and allow it to answer questions in very close to real time.
Additionally, the chatbot is built on a containerized microservices architecture, with the containers deployed, scaled, and maintained by Kubernetes. Each container runs in a Kubernetes pod, and the set of pods is managed by a Kubernetes Deployment. The vector database stores the supporting medical knowledge as vector embeddings that, combined with RAG, allow the chatbot to answer patient questions quickly and efficiently.
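To make the setup concrete, a Deployment for the chatbot microservice might look like the sketch below. All names, the image, and the resource figures are hypothetical; the detail that matters for what follows is the CPU request, because the HPA calculates CPU utilization as a percentage of each pod’s requested CPU.

```yaml
# Hypothetical Deployment for the chatbot microservice (illustrative only).
apiVersion: apps/v1
kind: Deployment
metadata:
  name: chatbot-service
spec:
  replicas: 2                 # starting count; the HPA adjusts this at runtime
  selector:
    matchLabels:
      app: chatbot-service
  template:
    metadata:
      labels:
        app: chatbot-service
    spec:
      containers:
        - name: chatbot
          image: registry.example.com/chatbot:1.0   # placeholder image
          ports:
            - containerPort: 8080
          resources:
            requests:
              cpu: 500m       # the baseline the HPA measures utilization against
              memory: 512Mi
            limits:
              cpu: "1"
              memory: 1Gi
```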
Note: The company expects variable workloads throughout a 24-hour period with peak traffic in the evenings and over weekends.
Figure 1: The Kubernetes Workflow
To manage these variable workloads, the Kubernetes administrator has configured the HPA to automatically scale the number of chatbot-service pods out and in based on observed versus target CPU utilization. The goal is to always run enough chatbot pods to handle the current workload (incoming queries) while avoiding over-provisioning.
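Here is a minimal sketch of that HPA configuration, using the stable autoscaling/v2 API and targeting the hypothetical chatbot-service Deployment above; the 70% CPU target and the 2–10 replica range are illustrative values, not recommendations.

```yaml
# Hypothetical HPA: hold average CPU utilization near 70% by adding or
# removing chatbot pods within a 2-10 replica range.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: chatbot-service-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: chatbot-service
  minReplicas: 2              # floor for quiet overnight periods
  maxReplicas: 10             # ceiling for evening and weekend peaks
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # scale out above this, in below it
```

For a quick experiment, kubectl can create an equivalent autoscaler in one line: kubectl autoscale deployment chatbot-service --cpu-percent=70 --min=2 --max=10.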
Efficient resource utilization is the most significant benefit of using Kubernetes’ HPA to automate scaling the containerized microservices out and in. By dynamically adjusting the number of running pods, the HPA keeps the application highly available irrespective of the workload, which in turn delivers cost savings, scalability, and an optimal user experience.
Circling back to our opening statement: cloud-native technologies, including containerization, lead the way in building scalable, highly available applications that can target 99.99% uptime, helping ensure an excellent user experience at every touchpoint.
The quality of the user experience is ultimately the end goal. A good user experience drives adoption, which in turn fuels sustainable organizational growth over time and, in our scenario, allows the startup to evolve into a mature organization.