Setting Resource Requests and Limits in Production Kubernetes

Set resource requests based on observed usage, apply limits cautiously, and validate with real metrics.

JR

2 minute read


Practical Workflow

  1. Baseline metrics: Use kubectl top for point-in-time usage and Prometheus for historical CPU/memory trends.
  2. Set requests: Allocate 20-30% above observed averages to buffer for spikes.
  3. Apply limits: Use limits only if you need strict resource isolation (e.g., batch jobs). For long-running apps, omit limits unless throttling is acceptable.
  4. Monitor post-deploy: Check for OOM kills, throttling, or performance degradation.
  5. Iterate: Adjust based on metrics, not guesses.
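Steps 1-2 can be sketched in shell. The observed averages below (400Mi memory, 120m CPU) are illustrative placeholders, not measurements from any real workload; plug in your own numbers from kubectl top or a Prometheus avg_over_time() query.

```shell
#!/bin/sh
# Hypothetical observed averages (illustrative only), e.g. from
# `kubectl top pod` or a Prometheus avg_over_time() query.
observed_mem_mi=400   # average memory usage in Mi
observed_cpu_m=120    # average CPU usage in millicores

# Apply a 25% buffer (midpoint of the 20-30% guidance), rounding up.
req_mem_mi=$(( (observed_mem_mi * 125 + 99) / 100 ))
req_cpu_m=$(( (observed_cpu_m * 125 + 99) / 100 ))

echo "memory request: ${req_mem_mi}Mi"   # -> memory request: 500Mi
echo "cpu request: ${req_cpu_m}m"        # -> cpu request: 150m
```

Round the memory result up to a convenient power-of-two boundary (here, 512Mi) before putting it in the manifest.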

Concrete Policy Example

CPU:

  • Request: 150m (reserves 15% of a core for scheduling and a proportional CPU share; not a hard runtime guarantee)
  • Limit: Omit unless strict capping is required (e.g., sidecars).

Memory:

  • Request: Match observed usage (e.g., 512Mi if app typically uses 400Mi)
  • Limit: Set equal to request unless the app actually returns memory to the OS (e.g., a JVM whose GC uncommits unused heap; many runtimes never do).
```yaml
resources:
  requests:
    cpu: "150m"
    memory: "512Mi"
  limits:
    memory: "512Mi"
```
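For context, this resources block lives under each container entry in the pod template. A minimal sketch of where it sits in a Deployment (the name and image are placeholders):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-app                # placeholder name
spec:
  selector:
    matchLabels:
      app: example-app
  template:
    metadata:
      labels:
        app: example-app
    spec:
      containers:
        - name: app
          image: registry.example.com/app:1.0   # placeholder image
          resources:
            requests:
              cpu: "150m"
              memory: "512Mi"
            limits:
              memory: "512Mi"
```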

Tooling

  • kubectl top: Real-time resource usage per pod.
  • Prometheus + Grafana: Historical trends and alerts for anomalies.
  • Vertical Pod Autoscaler (VPA): Test request/limit impact in non-prod clusters.
  • cgroup check: Verify which cgroup version is in use with cat /proc/self/cgroup (a single 0::/ line indicates the unified cgroups v2 hierarchy; multiple numbered lines indicate v1).
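The cgroup check can be made explicit. A small sketch that distinguishes v2 (unified hierarchy) from v1, runnable inside a pod or on any Linux host:

```shell
#!/bin/sh
# On cgroups v2 the unified hierarchy exposes cgroup.controllers at its
# root, and /proc/self/cgroup collapses to a single "0::/..." line.
if [ -f /sys/fs/cgroup/cgroup.controllers ]; then
    echo "cgroups v2 (unified hierarchy)"
else
    echo "cgroups v1 (legacy or hybrid hierarchy)"
fi
```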

Tradeoffs and Caveats

  • No limits: Risk resource starvation (noisy neighbors) but allow burst capacity.
  • Strict limits: Prevent overcommit but may throttle legitimate traffic (e.g., JVM GC pauses under memory limits).
  • JVM quirks: JVMs older than JDK 15 do not understand cgroups v2 and can misreport available CPU/memory; a misconfigured cgroup hierarchy causes the same symptom. Always validate with kubectl describe pod and application logs.

Troubleshooting Common Issues

  • OOM kills: Check kubectl describe pod for memory limits. Use dmesg | grep -i kill to confirm OOM events.
  • CPU throttling: Run kubectl top pod and compare usage to the CPU limit. Throttling is enforced against limits, not requests, so sustained throttling means the limit is too low (or should be removed).
  • cgroups v2 issues: If the JVM reports incorrect CPU/memory, verify the kernel version (≥5.8 recommended) and the JDK version (cgroups v2 support landed in JDK 15; prefer ≥17). Avoid the legacy -XX:+UseCGroupMemoryLimitForHeap flag, which has been removed; modern JVMs are container-aware by default, and heap can be sized with -XX:MaxRAMPercentage.
  • Unexpected behavior: Test with stress-ng to simulate load and observe resource enforcement.
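For the CPU-throttling case, the kernel's own counters are more direct than kubectl top. A sketch for cgroups v2, run from inside the container (assumes a CPU limit is set, otherwise the throttling fields stay at zero):

```shell
#!/bin/sh
# cgroups v2 publishes throttling counters in cpu.stat:
#   nr_periods     - CFS scheduling periods elapsed
#   nr_throttled   - periods in which the cgroup hit its CPU limit
#   throttled_usec - total time spent throttled
stat_file=/sys/fs/cgroup/cpu.stat
if [ -f "$stat_file" ]; then
    awk '/^(nr_periods|nr_throttled|throttled_usec) / { print }' "$stat_file"
else
    echo "cpu.stat not found (cgroups v1? try /sys/fs/cgroup/cpu/cpu.stat)"
fi
```

A rising nr_throttled relative to nr_periods confirms the limit, not the node, is the bottleneck.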

In production, I’ve seen teams burned by assuming JVM auto-tuning works flawlessly. Always pair resource settings with monitoring and test upgrades in staging first.

Source thread: How would you setup the resource requests and limits on this workload? (this is mostly about how different people approach it)
