workload-resizer

A Kubernetes controller for GKE that uses the in-place pod resize subresource (pods/resize, GA in K8s 1.35) to adjust pod resource requests when pods land on node types whose performance characteristics differ from the type the workload was originally calibrated for.


When a Deployment is sized for one machine family (say, n2d) but the scheduler places its pods on a more powerful one (for example n4, with roughly 25% more CPU performance per core), the original requests reserve more capacity than the workload needs. workload-resizer watches scheduled pods, reads the assigned node’s cloud.google.com/machine-family label, and patches the pod’s requests in place using a configurable performance-unit matrix, all without restarting the container.
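
The scaling step can be pictured as proportional arithmetic. The sketch below is an assumption about how a performance-unit matrix might be applied, not the controller’s confirmed formula: it assumes each family carries a relative per-core performance number (n2d = 100 and n4 = 125, echoing the example above) and that a request is scaled by calibrated units over actual units. The function name scale_cpu_millis and the unit values are hypothetical.

```shell
# Hypothetical helper: scale a CPU request (in millicores) from the family the
# workload was calibrated on to the family the pod actually landed on.
# Assumption: new = original * calibrated_units / actual_units, rounded up so
# the result never falls below the proportional value.
scale_cpu_millis() {
  original_m=$1          # original request, e.g. 1000 for "1000m"
  calibrated_units=$2    # performance units of the calibrated family
  actual_units=$3        # performance units of the node's family
  echo $(( (original_m * calibrated_units + actual_units - 1) / actual_units ))
}

# Illustrative values only: n2d = 100 units, n4 = 125 units.
scale_cpu_millis 1000 100 125   # prints 800
scale_cpu_millis 500 100 125    # prints 400
```

Under these assumed units, a pod calibrated at 1000m on n2d would be resized to 800m on n4. The real values come from whatever matrix is configured in config.yaml.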

No-restart resize

Adjustments land on running pods via the in-place resize subresource (GA in K8s 1.35). Containers keep their identity, sockets, and in-memory state; the kubelet adjusts cgroup limits live.


Heterogeneous-pool aware

One Deployment, many machine families. The controller normalizes effective capacity at scheduling time using per-family performance units: calibrate once, then run on any machine family your cluster has nodes for.


QoS-class preserving

When a pod is Guaranteed (requests == limits), the resize mirrors the request change into the limit. A resize cannot change a pod’s QoS class, so without the mirrored limit the API server would reject the patch. Original template requests are captured as annotations before the first resize, so controller restarts don’t compound or undo changes.
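
For illustration, a resize patch for a Guaranteed pod might look like the sketch below. The container name app, the value 800m, and the helper name build_guaranteed_cpu_patch are all hypothetical; the block only shows the patch shape accepted by the Kubernetes resize subresource, not the controller’s actual code.

```shell
# Hypothetical helper: build a resize patch that sets the CPU request and
# mirrors it into the limit, keeping requests == limits (Guaranteed QoS).
build_guaranteed_cpu_patch() {
  container=$1   # container name, e.g. "app" (assumed)
  cpu=$2         # new CPU quantity, e.g. "800m"
  printf '{"spec":{"containers":[{"name":"%s","resources":{"requests":{"cpu":"%s"},"limits":{"cpu":"%s"}}}]}}\n' \
    "$container" "$cpu" "$cpu"
}

build_guaranteed_cpu_patch app 800m

# Applying it by hand against a live cluster (not run here; needs kubectl
# v1.32+ for --subresource resize, and "my-pod" is a placeholder):
# kubectl patch pod my-pod --subresource resize \
#   --patch "$(build_guaranteed_cpu_patch app 800m)"
```

The patch touches only CPU, so any memory request and limit on the pod stay as they were.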


Install

Apply the ConfigMap first, then the controller. The reverse order also works, but the controller pod crash-loops briefly until the ConfigMap lands. For non-GKE clusters, edit config.yaml first to match your nodes’ machine-family label values.

URL=https://github.com/gke-demos/workload-resizer/releases/latest/download
kubectl apply -f "$URL/config.yaml"
kubectl apply -f "$URL/install.yaml"

The Install guide covers prerequisites, how to inventory your cluster’s machine families, picking performance units, and verifying with a sample workload.
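
As a quick way to see which machine families a cluster contains before editing config.yaml, the sketch below tallies label values. The helper name summarize_families is hypothetical, and the commented kubectl command assumes nodes carry the cloud.google.com/machine-family label mentioned above; the Install guide remains the reference for the full procedure.

```shell
# Hypothetical helper: read one machine-family value per line on stdin and
# print "<count> <family>" for each distinct value. Nodes without the label
# produce an empty line, which is reported as "<unlabeled>".
summarize_families() {
  sed 's/^$/<unlabeled>/' | sort | uniq -c | awk '{ print $1, $2 }'
}

# Sample input standing in for three nodes:
printf 'n2d\nn4\nn2d\n' | summarize_families
# prints:
# 2 n2d
# 1 n4

# Against a live cluster (not run here), emit one label value per node:
# kubectl get nodes \
#   -o jsonpath='{range .items[*]}{.metadata.labels.cloud\.google\.com/machine-family}{"\n"}{end}' \
#   | summarize_families
```

For a plain per-node listing, kubectl get nodes -L cloud.google.com/machine-family shows the same label as an extra column.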