Understanding Memory Sharing Between Pods on Kubernetes Nodes

Pods share node memory via Linux cgroups and namespaces, enforcing resource limits to prevent contention.

JR



When multiple workloads run on a shared node, memory pressure often stems from misconfigured resource requests or noisy neighbors. Here’s how to diagnose, fix, and prevent these issues in production.


How Memory Sharing Works

Kubernetes uses cgroups (control groups) to partition memory and namespaces to isolate pods. Each container in a pod is assigned a cgroup with:

  • Requests: Guaranteed minimum memory.
  • Limits: Maximum memory before OOM (out-of-memory) kill.

If a pod exceeds its limit, the kernel OOM killer terminates it. If total node memory is exhausted, the kubelet evicts pods (starting with those exceeding requests).
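Requests and limits are declared per container in the pod spec. A minimal sketch (the pod name, image, and values are illustrative):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: memory-demo          # illustrative name
spec:
  containers:
    - name: app
      image: nginx:1.25      # any image; chosen for illustration
      resources:
        requests:
          memory: "256Mi"    # guaranteed minimum; used by the scheduler
        limits:
          memory: "512Mi"    # hard cap; exceeding it triggers an OOM kill
```

With requests below limits, the pod lands in the Burstable QoS class; setting requests equal to limits yields Guaranteed, which is last in line for eviction.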


Actionable Workflow

  1. Check current resource usage:
    kubectl top pods --all-namespaces  
    kubectl top nodes  
    
  2. Inspect pod resource configurations:
    kubectl describe pod <pod-name> | grep -A 2 -E "Limits|Requests"
    
  3. Monitor node pressure:
    kubectl describe node <node-name> | grep MemoryPressure
    
  4. Adjust requests/limits: Start with requests ≈ 70% of typical usage, limits ≈ 1.5x peak usage.
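The sizing heuristic in step 4 can be sketched as a quick calculation (the function name and the 70%/1.5x factors come from the rule of thumb above, not from any official formula):

```python
def suggest_resources(typical_mib: float, peak_mib: float) -> dict:
    """Suggest memory request/limit from observed usage (rule of thumb)."""
    request = round(typical_mib * 0.7)   # request ~= 70% of typical usage
    limit = round(peak_mib * 1.5)        # limit ~= 1.5x observed peak
    return {"requests.memory": f"{request}Mi", "limits.memory": f"{limit}Mi"}

# e.g. a service that typically uses 400Mi and peaks at 600Mi:
print(suggest_resources(400, 600))
# {'requests.memory': '280Mi', 'limits.memory': '900Mi'}
```

Feed it the typical and peak figures you observe in `kubectl top` or Prometheus, then round to values that make sense for your workload.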

Policy Example

Enforce memory constraints via a ResourceQuota:

apiVersion: v1
kind: ResourceQuota
metadata:
  name: memory-constraint
spec:
  hard:
    requests.memory: "1Gi"
    limits.memory: "2Gi"

Apply per-namespace to cap total memory and prevent runaway consumption. (Note that the bare memory key is an alias for requests.memory, so don’t specify both.)


Tooling

  • kubectl: For real-time metrics and pod inspection.
  • Prometheus + Grafana: Track historical memory usage and node pressure.
  • cAdvisor: Low-level container metrics, exposed through the kubelet’s /metrics/cadvisor endpoint (e.g., kubectl get --raw /api/v1/nodes/<node-name>/proxy/metrics/cadvisor).
  • kube-state-metrics: Expose pod/node resource data for alerting.

Tradeoff: Overcommit vs. Safety

Kubernetes allows memory overcommit (the sum of pod limits can exceed node memory, since the scheduler only accounts for requests), but this risks OOM kills under contention. Example:

  • Safe: Set requests = limits (no overcommit).
    • Caveat: Wastes memory if workloads are bursty.
  • Aggressive: Overcommit 2:1 (e.g., 8Gi of combined limits on a node with 4Gi allocatable).
    • Caveat: Higher risk of evictions during spikes.

Balance based on workload predictability.
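The tradeoff above can be quantified as a simple overcommit ratio (the node size and pod limits here are made up for illustration):

```python
def overcommit_ratio(pod_limits_mib: list[float], node_allocatable_mib: float) -> float:
    """Ratio of summed memory limits to node allocatable memory.

    > 1.0 means the node is overcommitted: OOM kills or evictions are
    possible if every pod bursts to its limit at once.
    """
    return sum(pod_limits_mib) / node_allocatable_mib

# Three pods with 2Gi limits each on a node with 4Gi allocatable:
ratio = overcommit_ratio([2048, 2048, 2048], 4096)
print(f"{ratio:.1f}x")  # 1.5x -> overcommitted
```

A ratio near 1.0 is the safe end of the spectrum; pushing toward 2.0 only makes sense when workload peaks are uncorrelated.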


Troubleshooting Common Issues

  1. Pod OOM kills:
    • Check kubectl describe pod for Reason: OOMKilled in the container’s Last State (exit code 137).
    • Increase limits or optimize application memory usage.
  2. Node memory pressure:
    • Look for MemoryPressure True in kubectl describe node.
    • Scale down workloads or add nodes.
  3. Noisy neighbors:
    • Use kubectl top to identify pods consuming excessive memory.
    • Enforce quotas or move to dedicated nodes.
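Spotting an OOM kill from describe output comes down to checking the container’s last terminated state. A sketch against sample output (the describe excerpt below is fabricated for illustration; in practice, pipe the real kubectl describe pod output through the same grep):

```shell
# Fabricated excerpt of `kubectl describe pod` output for a killed container
describe_output='    Last State:     Terminated
      Reason:       OOMKilled
      Exit Code:    137'

# Exit code 137 = 128 + SIGKILL(9), the kernel OOM killer's signature
if echo "$describe_output" | grep -q "OOMKilled"; then
  echo "pod was OOM killed"
fi
```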

Prevention

  • Baseline monitoring: Ensure metrics are collected automatically (e.g., Prometheus scrapes kube-state-metrics).
  • Resource requests as code: Define requests/limits in Helm charts or Kustomize overlays.
  • Scheduling affinity: Isolate memory-intensive workloads to specific nodes.
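The affinity bullet can be expressed with a simple nodeSelector; a minimal sketch assuming nodes carry a hypothetical workload-class=memory-heavy label (names and values are illustrative):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: big-cache                    # illustrative name
spec:
  nodeSelector:
    workload-class: memory-heavy     # hypothetical label; apply with:
                                     # kubectl label node <node-name> workload-class=memory-heavy
  containers:
    - name: cache
      image: redis:7                 # illustrative image
      resources:
        requests:
          memory: "2Gi"
        limits:
          memory: "2Gi"              # requests = limits -> Guaranteed QoS
```

For softer placement rules, node affinity (preferredDuringSchedulingIgnoredDuringExecution) achieves the same isolation without hard failures when no labeled node is available.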

Understand that memory sharing is a kernel-level concern—cgroups enforce policies, but you must define them. Misconfigured defaults (e.g., no limits) often lead to production fires. Start small, monitor, and iterate.

Source thread: new to K8s — how do pods actually share memory on a node?
