Cost Optimization for GKE Nodes and Regional Workloads on Google Cloud

Timeline: December 2025
Role: Cloud Engineer / Site Reliability Engineer
Skills: Google Kubernetes Engine (GKE), Node Pools, Machine Type Optimization, Regional Clusters, Pod Scheduling, VPC Flow Logs, BigQuery, Cost Optimization, Kubernetes Affinity


Project Summary

This project focused on optimizing the infrastructure cost of Google Kubernetes Engine (GKE) workloads by improving node pool sizing, workload placement, and inter-zonal traffic efficiency. The work involved analyzing the resource profile of a sample application, scaling the workload, migrating it to a more cost-efficient node pool with a better-shaped machine type, and then exploring the network cost implications of running chatty pods across zones in a regional cluster.

The implementation showed that GKE cost optimization depends not only on choosing cheaper machine types but also on bin-packing efficiency, regional architecture decisions, and avoiding unnecessary cross-zonal traffic.


Objectives

  • Examine node-level resource utilization of a GKE workload
  • Scale an application and observe infrastructure impact
  • Migrate workloads to an optimized machine type in a new node pool
  • Explore regional cluster design tradeoffs
  • Enable and inspect VPC flow logs for pod communication
  • Reduce inter-zonal traffic costs by changing workload placement

Architecture Overview

The architecture consisted of:

  • A zonal GKE cluster hosting a Hello application deployment
  • An initial node pool using smaller e2-medium machines
  • A second node pool using a larger e2-standard-2 machine type for improved packing efficiency
  • A regional GKE cluster spanning multiple zones
  • Two application pods deployed across different nodes and zones
  • VPC Flow Logs enabled for subnet-level traffic visibility
  • A BigQuery dataset used to inspect flow log data by source and destination zone
  • Pod scheduling changes using pod affinity / anti-affinity to influence zonal placement


Implementation & Highlights

1. Understanding the Initial Cluster Shape

  • Examined the Hello demo cluster running on two e2-medium nodes
  • Reviewed node-level CPU and memory requests for the Hello application and GKE system components
  • Observed that CPU requests were exhausted before memory requests, indicating that the machine shape was a poor fit for the workload
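
One way to make the CPU-versus-memory observation concrete is to compare summed pod requests against each node's allocatable capacity. The sketch below is illustrative only: the NodeUsage type and every figure in it are hypothetical placeholders, not values measured in this project. Real numbers would come from the "Allocated resources" section of kubectl describe node.

```python
from dataclasses import dataclass


@dataclass
class NodeUsage:
    """Requested vs. allocatable resources for one node (hypothetical type)."""
    cpu_requested_m: int      # millicores requested by all pods on the node
    cpu_allocatable_m: int    # millicores the node can hand out to pods
    mem_requested_mi: int     # MiB requested by all pods on the node
    mem_allocatable_mi: int   # MiB the node can hand out to pods


def request_ratios(node: NodeUsage) -> dict[str, float]:
    """Return the fraction of allocatable CPU and memory already requested."""
    return {
        "cpu": node.cpu_requested_m / node.cpu_allocatable_m,
        "memory": node.mem_requested_mi / node.mem_allocatable_mi,
    }


def binding_resource(node: NodeUsage) -> str:
    """Name the resource with the highest request ratio, which blocks scheduling first."""
    ratios = request_ratios(node)
    return max(ratios, key=ratios.get)


# Placeholder figures for illustration only.
example = NodeUsage(cpu_requested_m=800, cpu_allocatable_m=940,
                    mem_requested_mi=700, mem_allocatable_mi=2800)
```

With these placeholder figures, binding_resource(example) names CPU as the limiting resource while three quarters of the memory remains unrequested, which is the pattern described above.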

2. Scaling the Hello Application

  • Increased the Hello application replicas from 1 to 2
  • Observed scheduling pressure and insufficient CPU conditions
  • Resized the existing node pool to three nodes to accommodate the added workload
  • Confirmed that scaling out on the original machine type left much of each node's memory unused, so node efficiency stayed low
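
The scheduling pressure described above can be sketched with a toy placement function. This is a simplification under stated assumptions: the real kube-scheduler filters and scores nodes rather than taking the first fit, and the free-CPU figures and the 400m request are hypothetical.

```python
from typing import Optional


def first_fit(free_cpu_m: list[int], request_m: int) -> Optional[int]:
    """Place a pod on the first node with enough free CPU.

    Returns the node index, or None when no node fits (the pod would stay
    Pending with an insufficient-CPU scheduling event).
    """
    for index, free in enumerate(free_cpu_m):
        if free >= request_m:
            free_cpu_m[index] = free - request_m  # reserve the request
            return index
    return None


# Hypothetical free CPU (millicores) on two small nodes after system pods.
nodes = [450, 300]
first_replica = first_fit(nodes, 400)    # fits on node 0
second_replica = first_fit(nodes, 400)   # no node has 400m left

# Adding a third, empty node (hypothetical 940m free) unblocks the pod.
nodes.append(940)
retry = first_fit(nodes, 400)
```

In this sketch the second replica only schedules after the pool grows, even though the first two nodes together still hold 350m of CPU that no 400m pod can use.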

3. Optimizing Node Pool Machine Type

  • Created a new node pool using e2-standard-2
  • Cordoned and drained the original node pool
  • Migrated workloads to the new pool
  • Deleted the old node pool after workload migration
  • Demonstrated that the workload that had required three e2-medium nodes could run on a single e2-standard-2 node
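
The migration order above (create, cordon, drain, delete) can be written down as a command plan. The following is a minimal sketch that only builds the gcloud and kubectl argument lists and does not execute them. The function name and the cluster, pool, and node names are hypothetical, and it assumes a default compute zone is already configured for gcloud.

```python
def migration_plan(cluster: str, old_pool: str, new_pool: str,
                   machine_type: str, old_nodes: list[str]) -> list[list[str]]:
    """Build the ordered CLI commands for a node pool migration (not executed)."""
    plan = [[
        "gcloud", "container", "node-pools", "create", new_pool,
        f"--cluster={cluster}", f"--machine-type={machine_type}",
        "--num-nodes=1",
    ]]
    # Cordon every old node first so evicted pods cannot land back on them.
    plan += [["kubectl", "cordon", node] for node in old_nodes]
    # Then drain each old node, evicting its pods onto the new pool.
    plan += [[
        "kubectl", "drain", node,
        "--ignore-daemonsets", "--delete-emptydir-data",
    ] for node in old_nodes]
    # Remove the old pool only after its workloads have moved.
    plan.append([
        "gcloud", "container", "node-pools", "delete", old_pool,
        f"--cluster={cluster}", "--quiet",
    ])
    return plan


# Hypothetical names for illustration only.
plan = migration_plan("hello-demo-cluster", "default-pool", "larger-pool",
                      "e2-standard-2", ["node-a", "node-b"])
```

Cordoning all old nodes before draining any of them matters: otherwise pods evicted from the first node could be rescheduled onto another old node and be evicted a second time.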

4. Cost and Bin-Packing Analysis

  • Compared the cost and packing behavior of smaller shared-core nodes versus a larger standard node
  • Identified that better workload packing reduced waste and slowed cost growth during scale-out
  • Reinforced the importance of aligning machine type selection to application resource shape rather than assuming smaller nodes are always cheaper
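
The packing argument can be checked with simple arithmetic. The sketch below is an assumption-laden illustration, not the project's actual analysis: pod requests, usable node capacity, and prices are all placeholders, and prices are expressed in relative units rather than real list prices.

```python
import math


def nodes_needed(replicas: int, pod_cpu_m: int, pod_mem_mi: int,
                 node_cpu_m: int, node_mem_mi: int) -> int:
    """Nodes required when each node hosts as many replicas as fit."""
    # A node fits as many pods as its scarcer resource allows.
    per_node = min(node_cpu_m // pod_cpu_m, node_mem_mi // pod_mem_mi)
    if per_node == 0:
        raise ValueError("a single pod does not fit on this node shape")
    return math.ceil(replicas / per_node)


def pool_cost(nodes: int, price_per_node: float) -> float:
    """Total pool price for a given node count, in the same units as the node price."""
    return nodes * price_per_node


# Placeholder pod shape: 400m CPU, 200 MiB memory.
POD_CPU_M, POD_MEM_MI = 400, 200

# Placeholder usable capacity per node after system pods.
small_nodes = nodes_needed(3, POD_CPU_M, POD_MEM_MI, node_cpu_m=500, node_mem_mi=2000)
large_nodes = nodes_needed(3, POD_CPU_M, POD_MEM_MI, node_cpu_m=1500, node_mem_mi=6000)

# Relative prices: the larger node is assumed to cost twice the smaller one.
small_cost = pool_cost(small_nodes, 1.0)
large_cost = pool_cost(large_nodes, 2.0)
```

Under these assumptions three small nodes cost 3.0 units while one large node costs 2.0, even though the large node has the higher unit price. The result depends entirely on the pod shape, so the same calculation should be rerun with measured requests before choosing a machine type.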

5. Exploring Regional Cluster Tradeoffs

  • Reviewed the tradeoffs between zonal, multi-zonal, and regional GKE clusters
  • Considered availability, performance, and cost implications across regions and zones
  • Positioned regional clusters as a strong availability option, but one that requires more careful traffic placement to avoid unnecessary network cost

6. Creating and Testing a Regional Cluster

  • Created a new regional cluster
  • Deployed two pods designed to run on separate nodes using pod anti-affinity
  • Generated traffic between pods using ping
  • Verified that the pods were initially running in different zones and communicating across zone boundaries
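
A pod anti-affinity rule of the kind used here can be sketched as a manifest built in Python. The helper name, pod names, labels, and image are hypothetical. The affinity fields themselves are standard Kubernetes Pod spec fields.

```python
import json


def pod_with_anti_affinity(name: str, app_label: str, avoid_label: str,
                           image: str) -> dict:
    """Pod manifest that refuses to share a node with pods labelled avoid_label."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name, "labels": {"app": app_label}},
        "spec": {
            "affinity": {
                "podAntiAffinity": {
                    "requiredDuringSchedulingIgnoredDuringExecution": [{
                        "labelSelector": {"matchLabels": {"app": avoid_label}},
                        # hostname topology means "not on the same node"
                        "topologyKey": "kubernetes.io/hostname",
                    }]
                }
            },
            "containers": [{"name": name, "image": image}],
        },
    }


# Hypothetical pod names and image for illustration only.
manifest = pod_with_anti_affinity("pod-2", "pod-2", "pod-1", "example.com/app:1.0")
print(json.dumps(manifest, indent=2))  # kubectl apply -f accepts JSON as well as YAML
```

With the kubernetes.io/hostname topology key, the rule guarantees separate nodes but not separate zones, which is why the zone placement still had to be verified. Using topology.kubernetes.io/zone as the key would force the pods into different zones.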

7. Enabling Flow Logs and Exporting to BigQuery

  • Enabled VPC Flow Logs for the subnet used by the regional cluster
  • Exported the flow logs through a Cloud Logging sink into a BigQuery dataset
  • Queried source and destination zones to identify cross-zonal communications between cluster nodes
  • Used log analysis to make cost-aware placement decisions based on actual network behavior
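
The zone-pair analysis can be sketched as a query builder plus a small summary function. This is an assumed shape, not the exact query used in the project: the table name is a placeholder, and the field paths follow the VPC Flow Logs record format as exported to BigQuery. Running the query (for example in the BigQuery console) is left out so the block stays dependency-free.

```python
def zone_pair_query(table: str) -> str:
    """SQL that counts flow records per (source zone, destination zone)."""
    return f"""
SELECT
  jsonPayload.src_instance.zone AS src_zone,
  jsonPayload.dest_instance.zone AS dest_zone,
  COUNT(*) AS flow_records
FROM `{table}`
WHERE jsonPayload.src_instance.zone IS NOT NULL
  AND jsonPayload.dest_instance.zone IS NOT NULL
GROUP BY src_zone, dest_zone
ORDER BY flow_records DESC
""".strip()


def cross_zone_share(rows: list[dict]) -> float:
    """Fraction of flow records whose source and destination zones differ."""
    total = sum(row["flow_records"] for row in rows)
    cross = sum(row["flow_records"] for row in rows
                if row["src_zone"] != row["dest_zone"])
    return cross / total if total else 0.0


# Placeholder table name and made-up result rows for illustration only.
sql = zone_pair_query("my-project.my_dataset.my_flow_log_table")
sample_rows = [
    {"src_zone": "zone-a", "dest_zone": "zone-b", "flow_records": 30},
    {"src_zone": "zone-a", "dest_zone": "zone-a", "flow_records": 10},
]
share = cross_zone_share(sample_rows)
```

A high cross-zone share for a pair of chatty pods is the signal that co-locating them is worth considering.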

8. Minimizing Cross-Zonal Traffic Costs

  • Changed pod scheduling behavior from anti-affinity to affinity
  • Recreated the second pod so it would land on the same node as the first
  • Verified lower latency and reduced inter-zonal communication
  • Linked the change to lower inter-zone VM-to-VM data transfer charges for chatty workloads within a regional cluster
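
The switch from anti-affinity to affinity amounts to moving the same rule under a different key. The sketch below assumes a manifest held as a Python dictionary; the helper name, pod name, and labels are hypothetical.

```python
import copy


def to_pod_affinity(manifest: dict) -> dict:
    """Return a copy of manifest with its podAntiAffinity rules moved to podAffinity."""
    updated = copy.deepcopy(manifest)  # leave the caller's manifest untouched
    affinity = updated["spec"]["affinity"]
    affinity["podAffinity"] = affinity.pop("podAntiAffinity")
    return updated


# Hypothetical manifest fragment for illustration only.
original = {
    "kind": "Pod",
    "metadata": {"name": "pod-2"},
    "spec": {
        "affinity": {
            "podAntiAffinity": {
                "requiredDuringSchedulingIgnoredDuringExecution": [{
                    "labelSelector": {"matchLabels": {"app": "pod-1"}},
                    "topologyKey": "kubernetes.io/hostname",
                }]
            }
        }
    },
}
colocated = to_pod_affinity(original)
```

Because a running pod's affinity rules cannot be edited in place, the changed manifest only takes effect when the pod is deleted and created again, which matches the recreation step above.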

Design Decisions

  • Used node pool migration instead of in-place reshaping to move workloads safely to a more efficient machine type
  • Used a regional cluster to study high-availability tradeoffs beyond a single-zone design
  • Enabled VPC Flow Logs and exported them to BigQuery to validate traffic patterns with data rather than assumptions
  • Used pod affinity and anti-affinity to control placement and reduce unnecessary cross-zonal communication
  • Treated cost optimization as a combination of:
    • workload fitting
    • node efficiency
    • cluster topology
    • network behavior

Results & Impact

  • Optimized a GKE workload by moving it from a poorly fitting node pool to a better-shaped machine type
  • Demonstrated practical use of:
    • node pool migration
    • workload scheduling controls
    • regional cluster cost analysis
    • flow log inspection
    • BigQuery-based traffic analysis
  • Improved understanding of how GKE costs are influenced not only by node pricing, but also by bin-packing efficiency and network traffic patterns
  • Built a strong case study for cost-aware Kubernetes platform operations

Tools & Technologies Used

  • Google Kubernetes Engine (GKE) – Container orchestration platform
  • Compute Engine machine types – Node pool optimization
  • Kubernetes Deployments – Workload scaling
  • Node Pools – Infrastructure segmentation by machine type
  • Pod Affinity / Anti-Affinity – Workload placement control
  • VPC Flow Logs – Network traffic visibility
  • BigQuery – Flow log analysis
  • Cloud Logging – Log sink export of flow logs to BigQuery

Outcome

This project demonstrates the ability to optimize GKE infrastructure cost and efficiency by right-sizing node pools, analyzing scaling behavior, and minimizing unnecessary cross-zonal traffic in a regional cluster. It highlights practical skills in Kubernetes workload placement, infrastructure efficiency, observability-driven optimization, and cloud cost engineering, which are highly relevant to cloud engineering, platform engineering, and site reliability roles.

