How to Control Kubernetes Costs: Resource Limits, Autoscaling, and Spot Nodes

Reduce Kubernetes infrastructure costs by 40-65% with proper resource requests, cluster autoscaling, spot node pools, and namespace quotas.

Kubernetes makes it trivially easy to waste money. Every pod without resource limits is an open checkbook. Every idle node is a bill you’re paying for nothing. This guide shows you exactly how to bring costs under control.


Step 1: Set Resource Requests and Limits on Every Pod

Resource requests and limits are the single most important cost-control mechanism in Kubernetes. Without requests, the scheduler can't pack pods onto nodes efficiently; without limits, a single pod can consume an entire node.

1.1 Determine Actual Usage

# Get CPU/memory usage for all pods in a namespace
kubectl top pods -n production --sort-by=cpu

# Get node-level utilization
kubectl top nodes

# For historical data, use Prometheus queries
# container_cpu_usage_seconds_total
# container_memory_working_set_bytes

1.2 Apply Resource Specs

# Deployment with proper resource management
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-service
spec:
  replicas: 3
  selector:              # Required by apps/v1; must match the pod template labels
    matchLabels:
      app: api-service
  template:
    metadata:
      labels:
        app: api-service
    spec:
      containers:
        - name: api
          image: api:v2.1
          resources:
            requests:           # Used for scheduling
              cpu: "250m"       # 0.25 CPU cores
              memory: "512Mi"   # 512 MB
            limits:             # Hard ceiling
              cpu: "1000m"      # 1 CPU core
              memory: "1Gi"     # 1 GB

:::tip[Sizing Strategy] Set requests to the P50 (median) usage and limits to the P99 (peak) usage. This ensures efficient packing while preventing OOMKilled situations. :::
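The sizing tip above can be sketched as a small script. A minimal example, assuming you have exported per-container CPU samples (millicores) from `kubectl top` or Prometheus; the sample numbers are illustrative, not real measurements:

```python
# Sketch: derive a request/limit pair from observed usage samples,
# following the "request = P50, limit = P99" rule of thumb.
import math

def suggest_resources(cpu_millicores: list) -> dict:
    """Request = median (P50) of observed usage, limit = P99 peak."""
    s = sorted(cpu_millicores)

    def percentile(p: float) -> float:
        # Nearest-rank percentile on the sorted samples
        k = max(0, math.ceil(p / 100 * len(s)) - 1)
        return s[k]

    return {"request_m": percentile(50), "limit_m": percentile(99)}

# Hypothetical day of CPU samples (millicores) for one container
samples = [120, 140, 150, 160, 180, 210, 230, 250, 400, 900]
print(suggest_resources(samples))  # → {'request_m': 180, 'limit_m': 900}
```

In this hypothetical case you would set `requests.cpu: "180m"` and `limits.cpu: "900m"`; in practice, collect at least a week of samples so the P99 captures real peaks.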


Step 2: Implement Namespace Resource Quotas

Prevent any single team from consuming the entire cluster.

apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-alpha-quota
  namespace: team-alpha
spec:
  hard:
    requests.cpu: "16"
    requests.memory: "32Gi"
    limits.cpu: "32"
    limits.memory: "64Gi"
    pods: "50"
    persistentvolumeclaims: "20"
---
# LimitRange sets defaults for pods that don't specify resources
apiVersion: v1
kind: LimitRange
metadata:
  name: default-limits
  namespace: team-alpha
spec:
  limits:
    - default:          # Default limits
        cpu: "500m"
        memory: "512Mi"
      defaultRequest:   # Default requests
        cpu: "100m"
        memory: "128Mi"
      type: Container
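The quota math itself is simple enough to sanity-check by hand. A minimal sketch of how the quota admission check behaves, using the `team-alpha` hard limits above; the current-usage numbers are hypothetical:

```python
# Sketch: would a new pod's requests fit under the namespace quota?
# Mirrors the admission check: every tracked resource must stay within its hard cap.

def fits_quota(quota: dict, used: dict, pod_request: dict) -> bool:
    """True if adding pod_request keeps every resource within quota."""
    return all(used[k] + pod_request.get(k, 0) <= quota[k] for k in quota)

quota = {"cpu_m": 16000, "memory_mi": 32768}  # requests.cpu: "16", requests.memory: "32Gi"
used  = {"cpu_m": 15500, "memory_mi": 20000}  # hypothetical current namespace usage
pod   = {"cpu_m": 250,   "memory_mi": 512}    # one api-service replica's requests

print(fits_quota(quota, used, pod))  # → True (15750m CPU, 20512Mi both under cap)
```

Note that quota accounting uses *requests*, not live usage: this is why the LimitRange defaults matter, since a pod with no requests would otherwise be rejected by the quota.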

Step 3: Enable Cluster Autoscaler

The Cluster Autoscaler adds nodes when pods are stuck pending and removes nodes that stay underutilized, so you pay only for capacity the scheduler actually needs.

3.1 Azure AKS

az aks update \
  --resource-group myRG \
  --name myCluster \
  --enable-cluster-autoscaler \
  --min-count 2 \
  --max-count 20

3.2 AWS EKS

# Cluster Autoscaler deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cluster-autoscaler
  namespace: kube-system
spec:
  selector:              # Required by apps/v1; must match the pod template labels
    matchLabels:
      app: cluster-autoscaler
  template:
    metadata:
      labels:
        app: cluster-autoscaler
    spec:
      containers:
        - name: cluster-autoscaler
          image: registry.k8s.io/autoscaling/cluster-autoscaler:v1.29.0
          command:
            - ./cluster-autoscaler
            - --cloud-provider=aws
            - --nodes=2:20:eks-nodegroup   # min:max:<your node group name>
            - --scale-down-delay-after-add=5m
            - --scale-down-unneeded-time=5m
            - --skip-nodes-with-local-storage=false

3.3 Horizontal Pod Autoscaler

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: api-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api-service
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 65
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 75
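The HPA's scaling decision follows a documented formula: `desiredReplicas = ceil(currentReplicas × currentMetric / targetMetric)`, clamped to the min/max bounds. A quick sketch using the `api-hpa` values above:

```python
# Sketch of the HPA scaling rule:
# desired = ceil(current_replicas * current_utilization / target_utilization),
# clamped to [minReplicas, maxReplicas].
import math

def desired_replicas(current: int, utilization: float, target: float,
                     min_r: int = 2, max_r: int = 10) -> int:
    desired = math.ceil(current * utilization / target)
    return min(max_r, max(min_r, desired))

# 3 replicas averaging 90% CPU against the 65% target from api-hpa
print(desired_replicas(3, 90, 65))  # → 5
```

With multiple metrics (CPU and memory here), the HPA computes a desired count per metric and takes the largest, so the busier resource drives the scale-out.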

Step 4: Use Spot/Preemptible Node Pools

Spot nodes are typically discounted 60-90% relative to on-demand pricing, in exchange for the risk of eviction at short notice, which makes them a fit for fault-tolerant workloads only.

4.1 Create a Spot Node Pool

# AKS Spot Pool
az aks nodepool add \
  --resource-group myRG \
  --cluster-name myCluster \
  --name spotpool \
  --priority Spot \
  --eviction-policy Delete \
  --spot-max-price -1 \
  --min-count 0 \
  --max-count 15 \
  --node-vm-size Standard_D4s_v5

# EKS Spot Instances via managed node group
eksctl create nodegroup \
  --cluster myCluster \
  --name spot-workers \
  --instance-types m5.xlarge,m5a.xlarge,m5d.xlarge \
  --spot \
  --min-size 0 \
  --max-size 15
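To estimate what a spot pool is worth before committing, a blended-cost sketch helps. The rates here are assumptions for illustration, not quoted cloud prices:

```python
# Sketch: blended hourly compute cost when a fraction of nodes run on spot.
# $0.192/hr on-demand and a 70% spot discount are illustrative assumptions.

def blended_hourly_cost(nodes: int, spot_fraction: float,
                        on_demand_rate: float = 0.192,
                        spot_discount: float = 0.70) -> float:
    spot_nodes = nodes * spot_fraction
    od_nodes = nodes - spot_nodes
    return od_nodes * on_demand_rate + spot_nodes * on_demand_rate * (1 - spot_discount)

# 10 nodes, 60% on spot, vs. all on-demand ($1.92/hr)
print(round(blended_hourly_cost(10, 0.6), 3))  # → 1.114
```

Moving 60% of this hypothetical fleet to spot cuts the hourly bill by roughly 42%; the wider the instance-type list you allow (as in the `eksctl` example above), the lower the eviction risk.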

4.2 Schedule Tolerant Workloads on Spot

spec:
  tolerations:
    - key: "kubernetes.azure.com/scalesetpriority"
      operator: "Equal"
      value: "spot"
      effect: "NoSchedule"
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: "kubernetes.azure.com/scalesetpriority"
                operator: In
                values: ["spot"]
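The toleration and affinity above target the AKS spot taint and label. On EKS, managed spot node groups carry the `eks.amazonaws.com/capacityType=SPOT` label instead; note EKS does not taint spot nodes by default, so add a taint to the node group yourself if you want non-tolerant pods kept off. A minimal equivalent selector:

```yaml
# EKS equivalent: pin tolerant workloads to managed spot node groups
spec:
  nodeSelector:
    eks.amazonaws.com/capacityType: SPOT
```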

Step 5: Implement Pod Disruption Budgets

Protect critical services during node scale-down and spot evictions.

apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: api-pdb
spec:
  minAvailable: 2          # Block voluntary evictions unless 2 pods stay available
  selector:
    matchLabels:
      app: api-service
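The PDB's effect on drains and scale-downs reduces to simple arithmetic: voluntary evictions are allowed only while they leave at least `minAvailable` healthy pods. A quick sketch:

```python
# Sketch: how a PodDisruptionBudget gates voluntary evictions.
# allowed disruptions = healthy pods - minAvailable (never below zero)

def allowed_disruptions(healthy_pods: int, min_available: int) -> int:
    return max(0, healthy_pods - min_available)

# api-pdb: minAvailable 2 with 3 healthy replicas → 1 eviction permitted at a time
print(allowed_disruptions(3, 2))  # → 1
```

This is why pairing a PDB with the autoscaler is safe: scale-down drains proceed one eviction at a time and stall rather than dropping the service below its floor. Note a PDB only guards *voluntary* disruptions; a spot eviction that kills the node outright is not blocked, only the graceful drain that usually precedes it.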

Step 6: Schedule Non-Production Shutdown

Dev/staging clusters don’t need to run 24/7.

#!/bin/bash
# scale-down.sh — run from cron at 8 PM on weekdays
kubectl scale deployment --all --replicas=0 -n dev
kubectl scale deployment --all --replicas=0 -n staging

#!/bin/bash
# scale-up.sh — run from cron at 7 AM on weekdays
# Note: this restores everything to 1 replica; record original
# replica counts before scaling down if any deployment needs more.
kubectl scale deployment --all --replicas=1 -n dev
kubectl scale deployment --all --replicas=1 -n staging
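The savings from this schedule are easy to quantify. Assuming the workloads genuinely idle at zero outside the 7 AM-8 PM weekday window:

```python
# Sketch: fraction of compute hours saved by a weekday 7 AM-8 PM schedule.

WEEK_HOURS = 7 * 24       # 168 hours in a week
ON_HOURS = 5 * (20 - 7)   # 13 hours/day x 5 weekdays = 65

savings = 1 - ON_HOURS / WEEK_HOURS
print(f"{savings:.0%}")   # → 61%
```

Roughly 61% of dev/staging node-hours disappear, before any of the other optimizations in this guide are applied.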

Cost Optimization Checklist

  • Resource requests/limits on every container
  • Namespace ResourceQuotas for team budgets
  • LimitRanges for default container limits
  • Cluster Autoscaler enabled with appropriate min/max
  • HPA for variable-traffic deployments
  • Spot node pools for fault-tolerant workloads
  • PodDisruptionBudgets on critical services
  • Dev/staging scheduled shutdown
  • Regular review of kubectl top data

:::note[Source] This guide is derived from operational intelligence at Garnet Grid Consulting. For enterprise Kubernetes cost audits, visit garnetgrid.com. :::