Resource Requests and Limits: the Hidden Cost of Misconfiguration


JR


Misconfigured resource requests and limits in Kubernetes can lead to wasted resources and scheduling issues, requiring proactive governance and tooling to mitigate.

The Problem in Production

Developers often copy VM-era sizing straight into container resource requests and limits, leading to grossly oversized allocations. For example:

  • A pod requesting 4 cores and 8GB memory but idling at 0.2 cores and 500MB
  • Nodes blocked by “reserved” resources while workloads starve
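
A sketch of what such an over-provisioned spec looks like (the figures mirror the example above; the pod and image names are hypothetical):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: oversized-app            # hypothetical name
spec:
  containers:
    - name: app
      image: registry.example.com/app:latest
      resources:
        requests:
          cpu: "4"               # observed usage: ~0.2 cores
          memory: 8Gi            # observed usage: ~500MB
        limits:
          cpu: "4"
          memory: 8Gi
```

The scheduler will reserve the full 4 cores and 8Gi on a node for this pod, regardless of what it actually consumes.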

The scheduler only considers requests, never actual usage. This creates:

  • Resource fragmentation: Pods reserve node capacity without using it
  • Scheduling failures: New workloads stay Pending even though real utilization is low

Actionable Workflow to Fix This

  1. Audit current usage

    kubectl top pods --all-namespaces --containers
    kubectl get pods --all-namespaces --output=jsonpath='{range .items[*]}{.metadata.namespace}{"/"}{.metadata.name}{"\t"}{.spec.containers[*].resources.requests}{"\n"}{end}'
    

    Compare requested vs. actual usage. Flag pods with >2x request-to-usage ratio.
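
The comparison itself can be sketched as a small script. The pod data below is hypothetical (in practice you would feed in the numbers from kubectl top), and the 2x threshold matches the rule above:

```python
def flag_overprovisioned(pods, threshold=2.0):
    """Return (name, ratio) pairs for pods whose CPU request exceeds
    `threshold` times their observed usage.

    `pods` maps pod name -> (requested_cpu, actual_cpu), both in cores.
    """
    flagged = []
    for name, (requested, actual) in pods.items():
        # Guard against division by zero for fully idle pods
        ratio = requested / actual if actual > 0 else float("inf")
        if ratio > threshold:
            flagged.append((name, round(ratio, 1)))
    return flagged

# Hypothetical numbers, e.g. gathered from `kubectl top pods`
pods = {
    "api-server": (4.0, 0.2),   # 20x over-provisioned
    "worker": (0.5, 0.4),       # within tolerance
}
print(flag_overprovisioned(pods))  # -> [('api-server', 20.0)]
```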

  2. Enforce resource policies
    Use Kubernetes ResourceQuota objects or Kyverno policies to:

    • Block pods without explicit requests/limits
    • Set max request thresholds per namespace
      Example Kyverno policy:
    apiVersion: kyverno.io/v1
    kind: ClusterPolicy
    metadata:
      name: enforce-resource-limits
    spec:
      validationFailureAction: Enforce
      rules:
        - name: check-resources
          match:
            resources:
              kinds:
                - Pod
          validate:
            message: "Pods must specify resource requests and limits"
            pattern:
              spec:
                containers:
                  - resources:
                      requests:
                        memory: "?*"
                        cpu: "?*"
                      limits:
                        memory: "?*"
                        cpu: "?*"
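
For the quota side, a namespace-level ResourceQuota caps aggregate requests and limits. A minimal sketch (the namespace and figures here are illustrative, not recommendations):

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: compute-quota
  namespace: team-a          # illustrative namespace
spec:
  hard:
    requests.cpu: "20"       # aggregate CPU requests across the namespace
    requests.memory: 40Gi
    limits.cpu: "40"
    limits.memory: 80Gi
```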
    
  3. Monitor and adjust

    • Use Prometheus + Grafana to track request vs. usage ratios
    • Implement Vertical Pod Autoscaler (VPA) for non-critical workloads
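
A minimal VerticalPodAutoscaler manifest might look like the sketch below; the target Deployment name is hypothetical, and updateMode "Auto" lets VPA evict and resize pods, so reserve it for workloads that tolerate restarts:

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: worker-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: worker             # hypothetical workload
  updatePolicy:
    updateMode: "Auto"       # use "Off" for recommendation-only mode
```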

Tooling That Helps

  • kubectl describe node: Check allocatable resources vs. actual usage
  • Prometheus: Alert when kube_pod_container_resource_requests (memory) far exceeds container_memory_working_set_bytes
  • Kyverno/OPA: Enforce organizational policies at admission
  • Vertical Pod Autoscaler: Automatically adjust requests for stateless workloads

Tradeoffs and Caveats

  • Strict policies slow deployment velocity: Balance governance with developer autonomy (e.g., allow overrides for approved exceptions)
  • VPA introduces rescheduling overhead: Only use for stable, non-critical workloads
  • Legacy apps may break: Gradually tighten policies with exemptions for critical systems

Troubleshooting Common Failures

  • Pods stuck in Pending: Check node resource availability with kubectl describe node <node>
  • False quota violations: Ensure monitoring tools account for ephemeral storage and other non-CPU/memory resources
  • Policy conflicts: Test policies offline with the Kyverno CLI (kyverno apply) and inspect API server admission webhook responses to debug admission failures

Policy Example: Resource Governance

For a SaaS platform team, we enforced:

  • Default requests: 100m CPU / 256Mi memory for new pods
  • Max limits: 2 cores / 1Gi per container unless approved by platform team
  • Alerts: Prometheus alerts when request-to-usage ratio exceeds 3x for >1 hour
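
The 3x-for-1-hour alert could be sketched as a Prometheus rule like this; the metric names assume kube-state-metrics v2 and cAdvisor, and the label matching will likely need tuning for your environment:

```yaml
groups:
  - name: resource-governance
    rules:
      - alert: OverProvisionedPod
        expr: |
          kube_pod_container_resource_requests{resource="memory"}
            / on (namespace, pod, container)
          max by (namespace, pod, container) (container_memory_working_set_bytes)
          > 3
        for: 1h
        labels:
          severity: warning
        annotations:
          summary: "Memory request exceeds 3x actual usage"
```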

Final Note

This isn’t a developer vs. platform issue—it’s a collaboration gap. Pair policy enforcement with hands-on workshops that teach teams how to profile workloads (e.g., using kubectl top, k9s, or Datadog integrations). The goal isn’t to restrict but to align resource allocation with actual needs.

Source thread: What Kubernetes feature looked great on paper but hurt you in prod?
