Optimizing GKE Workloads with Container-Native Load Balancing and Reliability Controls

Timeline: December 2025
Role: Cloud Engineer / Site Reliability Engineer
Skills: Google Kubernetes Engine (GKE), Kubernetes Ingress, Network Endpoint Groups (NEGs), Container-Native Load Balancing, Locust, Liveness Probes, Readiness Probes, Pod Disruption Budgets, Availability Engineering

Project Summary

This project focused on improving the efficiency, resilience, and availability of workloads running on Google Kubernetes Engine (GKE). The implementation set up container-native load balancing through Kubernetes Ingress and Network Endpoint Groups (NEGs), load-tested the application to observe its behavior under simulated traffic, configured liveness and readiness probes for self-healing and safer traffic handling, and applied Pod Disruption Budgets to reduce downtime during voluntary disruptions.

The project showed that GKE workload optimization is not only about resource sizing, but also about traffic flow, health-aware routing, and disruption tolerance.

Objectives

Create a container-native load balancer through Ingress
Load test an application to observe runtime capacity
Configure liveness probes for container self-healing
Configure readiness probes for safe traffic admission
Apply Pod Disruption Budgets to improve availability during voluntary disruptions

Architecture Overview

The architecture consisted of:

A GKE cluster with VPC-native / alias IP networking enabled
A frontend application (gb-frontend) initially deployed as a single pod
A ClusterIP service annotated to enable Network Endpoint Groups (NEGs)
A Kubernetes Ingress that created a Google Cloud HTTP(S) Load Balancer
Container-native load balancing targeting pods directly through NEGs
A Locust-based load testing setup with one main service and multiple workers
A liveness probe demo pod for restart behavior validation
A readiness probe demo pod and LoadBalancer service for traffic gating validation
A replicated gb-frontend deployment protected by a Pod Disruption Budget (PDB)

Architecture Diagram

Implementation & Highlights

1. Provisioning the GKE Environment

Created a three-node GKE cluster with --enable-ip-alias
Deployed the gb-frontend web application as an initial single pod
Prepared the environment for container-native load balancing and workload optimization testing

2. Container-Native Load Balancing Through Ingress

Created a ClusterIP service annotated with cloud.google.com/neg
Created an Ingress resource to expose the application
Triggered provisioning of:
- a Google Cloud HTTP(S) Load Balancer
- zonal Network Endpoint Groups (NEGs)
Verified backend health using the Compute Engine backend service health check
Enabled the load balancer to target pod IPs directly, avoiding the extra hop through a node and kube-proxy that instance-group-based routing involves
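
The Service and Ingress pair described in this step can be sketched as follows. This is an illustrative reconstruction, not the manifests actually used in the project: the resource names, the app label, and port 80 are assumptions, since the write-up does not state them. The manifests are built as Python dictionaries and printed as JSON, which kubectl accepts in place of YAML.

```python
import json

# Assumed values: the write-up does not give the real names, label, or port.
SERVICE_NAME = "gb-frontend-svc"
INGRESS_NAME = "gb-frontend-ingress"
APP_LABEL = {"app": "gb-frontend"}
PORT = 80


def neg_service() -> dict:
    """ClusterIP Service whose annotation asks GKE to create NEGs for Ingress."""
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {
            "name": SERVICE_NAME,
            # The annotation value is itself a JSON string: '{"ingress": true}'.
            "annotations": {"cloud.google.com/neg": json.dumps({"ingress": True})},
        },
        "spec": {
            "type": "ClusterIP",
            "selector": APP_LABEL,
            "ports": [{"port": PORT, "targetPort": PORT, "protocol": "TCP"}],
        },
    }


def frontend_ingress() -> dict:
    """Ingress that sends all traffic to the NEG-enabled Service."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "Ingress",
        "metadata": {"name": INGRESS_NAME},
        "spec": {
            "defaultBackend": {
                "service": {"name": SERVICE_NAME, "port": {"number": PORT}}
            }
        },
    }


if __name__ == "__main__":
    # Wrap both objects in a v1 List so they can be applied in one step.
    bundle = {"apiVersion": "v1", "kind": "List",
              "items": [neg_service(), frontend_ingress()]}
    print(json.dumps(bundle, indent=2))
```

Assuming the script were saved under a hypothetical name such as neg_manifests.py, its output could be piped to `kubectl apply -f -`.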

3. Load Testing the Application

Built and stored a custom Locust image in the project container registry
Deployed Locust main and worker pods to generate traffic
Simulated moderate and burst traffic scenarios against the frontend service
Observed CPU and memory behavior of the gb-frontend pod under load
Established an observed baseline for future decisions around resource requests, limits, and autoscaling strategy
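
One way such a baseline might feed a sizing decision is sketched below. The approach (a nearest-rank percentile of observed CPU samples plus a fixed headroom) is an assumption chosen for illustration, not the method used in the project, and the function names and default values are hypothetical. No measurements from the actual load test are reproduced here.

```python
import math


def percentile(samples: list[int], pct: int) -> int:
    """Nearest-rank percentile of a non-empty list, with pct in 1..100."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def suggest_cpu(samples_millicores: list[int], pct: int = 95,
                headroom_pct: int = 20) -> dict:
    """Suggest a CPU request and limit from observed usage in millicores.

    request = chosen percentile of the samples plus headroom
    limit   = observed peak plus headroom
    """
    base = percentile(samples_millicores, pct)
    peak = max(samples_millicores)
    request = math.ceil(base * (100 + headroom_pct) / 100)
    limit = math.ceil(peak * (100 + headroom_pct) / 100)
    return {"request": f"{request}m", "limit": f"{limit}m"}
```

The samples would come from whatever was observed during the Locust runs, for example periodic `kubectl top pod` readings. The percentile and headroom are tuning choices rather than fixed rules.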

4. Liveness Probe Configuration

Created a demo pod with an exec-based liveness probe
Used the probe to monitor the existence of a file in the container filesystem
Simulated failure by removing the file
Observed Kubernetes detect the unhealthy state and restart the container automatically
Demonstrated self-healing behavior for failed but still-running containers
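
A pod of the kind described above might look like the following sketch. The pod name, the busybox image, the file path /tmp/alive, and the probe timings are all assumptions, since the write-up does not specify them.

```python
import json

# Assumed values: the write-up does not name the pod, image, or probed file.
POD_NAME = "liveness-demo"
PROBE_FILE = "/tmp/alive"


def liveness_demo_pod() -> dict:
    """Pod whose container is restarted when PROBE_FILE disappears."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": POD_NAME, "labels": {"demo": "liveness"}},
        "spec": {
            "containers": [
                {
                    "name": POD_NAME,
                    "image": "busybox",
                    # Create the file on start, then idle so the probe can run.
                    "command": ["sh", "-c", f"touch {PROBE_FILE} && sleep 3600"],
                    "livenessProbe": {
                        # Exit code 0 means healthy; a missing file makes cat fail.
                        "exec": {"command": ["cat", PROBE_FILE]},
                        "initialDelaySeconds": 5,
                        "periodSeconds": 10,
                        "failureThreshold": 3,
                    },
                }
            ]
        },
    }


if __name__ == "__main__":
    print(json.dumps(liveness_demo_pod(), indent=2))
```

With this configuration, deleting the file (for example with `kubectl exec liveness-demo -- rm /tmp/alive`) causes the probe to fail. After three consecutive failures the kubelet restarts the container, and the start command recreates the file.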

5. Readiness Probe Configuration

Created a demo pod with an exec-based readiness probe and exposed it through a LoadBalancer service
Verified that the service did not route traffic while the readiness condition was failing
Created the required file inside the container to satisfy the readiness probe
Confirmed that the pod became Ready and traffic began flowing successfully
Demonstrated health-aware traffic admission and safer service exposure
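
The readiness setup can be sketched in the same way. The names, the nginx image, the label, and the file path /tmp/healthz below are assumptions for illustration only. The important detail is that the Service selector matches the pod label, so the pod becomes a Service endpoint only once the probe succeeds.

```python
import json

# Assumed values: the write-up does not name the pod, image, label, or file.
POD_NAME = "readiness-demo"
POD_LABELS = {"demo": "readiness"}
READY_FILE = "/tmp/healthz"


def readiness_demo_pod() -> dict:
    """Pod that reports Ready only once READY_FILE exists in the container."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": POD_NAME, "labels": POD_LABELS},
        "spec": {
            "containers": [
                {
                    "name": POD_NAME,
                    "image": "nginx",
                    "ports": [{"containerPort": 80}],
                    "readinessProbe": {
                        # Fails until the file is created by hand.
                        "exec": {"command": ["cat", READY_FILE]},
                        "initialDelaySeconds": 5,
                        "periodSeconds": 5,
                    },
                }
            ]
        },
    }


def readiness_demo_service() -> dict:
    """LoadBalancer Service selecting the demo pod by label."""
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": f"{POD_NAME}-svc"},
        "spec": {
            "type": "LoadBalancer",
            "selector": POD_LABELS,
            "ports": [{"port": 80, "targetPort": 80, "protocol": "TCP"}],
        },
    }


if __name__ == "__main__":
    bundle = {"apiVersion": "v1", "kind": "List",
              "items": [readiness_demo_pod(), readiness_demo_service()]}
    print(json.dumps(bundle, indent=2))
```

Under these assumptions, running `kubectl exec readiness-demo -- touch /tmp/healthz` would satisfy the probe. Unlike a failing liveness probe, a failing readiness probe does not restart the container; it only keeps the pod out of the Service endpoints.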

6. Pod Disruption Budget for Application Availability

Replaced the single frontend pod with a 5-replica deployment
Drained cluster nodes to observe the impact of voluntary disruption
Verified that without a PDB, the deployment could temporarily lose all available replicas
Created a Pod Disruption Budget requiring at least 4 replicas to remain available
Repeated the drain operation and confirmed that Kubernetes blocked further eviction when doing so would violate the budget
Demonstrated controlled availability protection during operational maintenance and node disruption events
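
The budget and the arithmetic behind the blocked eviction can be sketched as follows. The PDB name and the app label are assumptions; the replica count of 5 and the minimum of 4 come from the steps above. The helper simplifies how a minAvailable budget is evaluated: the number of evictions permitted at any moment is the count of healthy pods minus the minimum, and never less than zero.

```python
import json


def frontend_pdb(min_available: int = 4) -> dict:
    """PodDisruptionBudget keeping at least min_available frontend pods up."""
    return {
        "apiVersion": "policy/v1",
        "kind": "PodDisruptionBudget",
        # Assumed name and label; the write-up does not state them.
        "metadata": {"name": "gb-frontend-pdb"},
        "spec": {
            "minAvailable": min_available,
            "selector": {"matchLabels": {"app": "gb-frontend"}},
        },
    }


def disruptions_allowed(healthy_pods: int, min_available: int) -> int:
    """Voluntary evictions permitted right now under a minAvailable budget."""
    return max(0, healthy_pods - min_available)


if __name__ == "__main__":
    print(json.dumps(frontend_pdb(), indent=2))
    # 5 healthy replicas, minimum 4: one eviction may proceed.
    print(disruptions_allowed(5, 4))
    # Once a pod is evicted and not yet replaced, further evictions are refused.
    print(disruptions_allowed(4, 4))
```

This matches the drain behavior described above: with five replicas and a minimum of four, a drain can evict one pod at a time and must wait for a replacement to become Ready before evicting the next.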

Design Decisions

Used container-native load balancing to improve routing efficiency by letting the load balancer target pods directly
Used load testing before optimization decisions so changes would be informed by observed workload behavior rather than guesswork
Configured liveness probes to enable automatic restart of unhealthy containers
Configured readiness probes to ensure traffic would only reach pods that were truly ready to serve
Used a Pod Disruption Budget to preserve service availability during voluntary cluster operations such as drain and rescheduling
Treated optimization as a balance between:
- routing efficiency
- application responsiveness
- health-aware operations
- resilience during disruption

Results & Impact

Successfully implemented container-native load balancing for a GKE workload
Measured application behavior under different traffic levels using Locust
Demonstrated automated container recovery using liveness probes
Demonstrated controlled service exposure using readiness probes
Protected a replicated application from excessive disruption during node drain through a Pod Disruption Budget
Built practical experience in improving both efficiency and availability of Kubernetes workloads

Tools & Technologies Used

Google Kubernetes Engine (GKE) – Cluster platform
Kubernetes Ingress – External traffic entry
Network Endpoint Groups (NEGs) – Pod-level load balancing targets
Google Cloud HTTP(S) Load Balancer – External application access
Locust – Load testing framework
Liveness Probes – Self-healing controls
Readiness Probes – Safe traffic routing controls
Pod Disruption Budgets (PDBs) – Voluntary disruption protection

Outcome

This project demonstrates the ability to optimize GKE workload efficiency and reliability by combining better traffic routing, load testing, health-aware pod management, and disruption controls. It highlights practical skills in Kubernetes networking, application health management, and availability engineering, which are highly relevant to cloud engineering, platform engineering, and site reliability roles.


Felix Otieno Arogo