Managing Kubernetes in Air-gapped Environments: Tools and Workflows

Here’s how to manage Kubernetes clusters in restricted environments using air-gapping, offline tooling, and policy enforcement.

JR



Workflow: From Zero to Operational

  1. Plan your offline supply chain

    • Identify all dependencies: base OS images, operator binaries, Helm charts, and CI/CD artifacts.
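    • Use skopeo or regctl to mirror images to an internal registry. For registry-to-registry copies, skopeo needs the docker:// transport (oci: refers to a local OCI layout directory). Example:
      skopeo copy --all docker://quay.io/cluster-api/provider-aws:v2.30.0 docker://my-registry.local/cluster-api/provider-aws:v2.30.0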
      
    • For air-gapped clusters, pre-stage Zarf packages or Hauler archives on USB drives or other physical media.
  2. Deploy with offline-first tooling

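    • Use Zarf for opinionated, multi-architecture deployments. After initializing the cluster with zarf init, deploy a pre-built package (the file name here is illustrative):
      zarf package deploy my-cluster-package.tar.zst --confirm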
      
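    • For simpler use cases, Hauler moves OCI artifacts without extra cruft. Collect the content listed in a manifest into a local store, then save the store as an archive for transfer (manifest and archive names are illustrative):
      hauler store sync -f hauler-manifest.yaml
      hauler store save -f /mnt/haul/haul.tar.zst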
      
  3. Enforce policies at the edge

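    • Deny all pod egress by default, then add narrowly scoped allow rules for the internal services that pods need. NetworkPolicy is namespaced, so apply this in each namespace. It does not affect image pulls, which the node's container runtime performs: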
      apiVersion: networking.k8s.io/v1  
      kind: NetworkPolicy  
      metadata:  
        name: deny-external-egress  
      spec:  
        podSelector: {}  
        policyTypes:  
        - Egress  
      
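    • Redirect pulls to the trusted internal registry. The kubelet has no registry-mirroring setting; this is configured in the container runtime. With CRI-O, for example, in /etc/containers/registries.conf (to reject untrusted images outright, pair this with an admission policy):
      [[registry]]
      prefix = "docker.io/library"
      location = "my-registry.local/library"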
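
To check that workloads really run images from the internal registry, an audit along these lines can help. This is a minimal sketch, not a tested tool: the function names and the TRUSTED_PREFIX variable are my own, and it only inspects regular containers (not init or ephemeral containers).

```shell
#!/bin/sh
# Sketch: flag images that do not come from the trusted internal registry.
# TRUSTED_PREFIX is a hypothetical setting; adjust it to your registry.
TRUSTED_PREFIX="${TRUSTED_PREFIX:-my-registry.local/}"

# Read image references on stdin and print those that do not start with the
# given prefix. Exit status is 1 if any untrusted image was found.
find_untrusted() {
    awk -v p="$1" 'NF && index($0, p) != 1 { print; bad = 1 } END { exit bad + 0 }'
}

# List the image of every regular container in every namespace.
# Needs kubectl with read access to pods.
list_pod_images() {
    kubectl get pods --all-namespaces \
        -o jsonpath='{range .items[*]}{range .spec.containers[*]}{.image}{"\n"}{end}{end}' |
        sort -u
}
```

Typical use would be `list_pod_images | find_untrusted "$TRUSTED_PREFIX"`, which prints nothing and exits 0 when every image comes from the internal registry.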
      

Tooling for Restricted Environments

Tool     Use Case                            Caveats
Zarf     End-to-end cluster deployment       Opinionated, steep learning curve
Hauler   Simple OCI artifact transfer        No built-in cluster orchestration
Skopeo   Image mirroring and inspection      Manual process for large repos
Velero   Backup/restore without cloud deps   Requires local storage backend
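
The "manual process" caveat for Skopeo can be softened with a small wrapper. A rough sketch, assuming a hypothetical images.txt with one fully qualified image reference per line (registry host included); mirror_ref, mirror_images, DEST_REGISTRY, and DRY_RUN are illustrative names, not part of skopeo:

```shell
#!/bin/sh
# Sketch: mirror every image listed in a text file to an internal registry.
# DEST_REGISTRY is a hypothetical setting naming the internal registry.
DEST_REGISTRY="${DEST_REGISTRY:-my-registry.local}"

# Rewrite "quay.io/org/app:tag" to "<dest>/org/app:tag" by replacing
# everything before the first slash (assumed to be the registry host).
mirror_ref() {
    printf '%s/%s\n' "$2" "${1#*/}"
}

# Copy each image listed in the file given as $1.
# With DRY_RUN=1, only print the skopeo commands that would run.
mirror_images() {
    while IFS= read -r src || [ -n "$src" ]; do
        # Skip blank lines and comment lines.
        case "$src" in ''|'#'*) continue ;; esac
        dest=$(mirror_ref "$src" "$DEST_REGISTRY")
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "skopeo copy --all docker://$src docker://$dest"
        else
            # Keep skopeo from reading the image list on stdin.
            skopeo copy --all "docker://$src" "docker://$dest" < /dev/null || return 1
        fi
    done < "$1"
}
```

Running with DRY_RUN=1 first prints the commands so the list can be reviewed before anything is copied.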

Tradeoffs and Caveats

  • Zarf’s opinionated design: Great for standardization, but custom workflows (e.g., legacy operators) may require building and maintaining your own forked packages.
  • Hauler’s simplicity: Easier to audit but lacks Zarf’s declarative lifecycle management.
  • Local registry overhead: Mirroring 100+ GB of base images consumes disk and requires careful cleanup policies.
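
One possible cleanup policy is to keep only the newest N tags per repository. The sketch below only lists removal candidates and deletes nothing; prune_candidates is a hypothetical helper, and it assumes tags sort sensibly as versions with sort -V (available in GNU coreutils).

```shell
#!/bin/sh
# Sketch: read tags on stdin (one per line) and print every tag except the
# newest N, where "newest" means last in version-sort order.
prune_candidates() {
    keep="$1"
    sort -V | awk -v keep="$keep" '
        { tags[NR] = $0 }
        END { for (i = 1; i <= NR - keep; i++) print tags[i] }
    '
}
```

For example, `regctl tag ls my-registry.local/nginx | prune_candidates 5` would print the tags that fall outside the newest five. Review the list before deleting anything, and note that many registries only reclaim disk space after a separate garbage collection run.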

Troubleshooting Common Failures

  • Image pull errors:

    • Verify that the image digest in the mirror matches the digest recorded when the image was mirrored:
      regctl image digest my-registry.local/nginx:latest
      
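    • Check the certificate the registry actually serves if using HTTPS (subject, issuer, and validity dates):
      openssl s_client -connect my-registry.local:443 -servername my-registry.local </dev/null 2>/dev/null | openssl x509 -noout -subject -issuer -dates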
      
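  • Zarf deployment failures:

    • Inspect the pods Zarf runs in its own namespace (for example the agent and the in-cluster registry):
      kubectl get pods -n zarf
      kubectl logs -n zarf <zarf-pod-name>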
      
    • Ensure all required namespaces and RBAC roles exist pre-deployment.
  • Expired certificates:

    • Rotate certificates manually or use a tool like cert-manager with a local CA.
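
For the digest check under image pull errors, comparing whole listings scales better than checking one image at a time. A minimal sketch under stated assumptions: both files hold lines of the form "image digest", the expected file was written at mirroring time using the mirror-side image names and is not empty, and diff_digests and record_digests are hypothetical helper names.

```shell
#!/bin/sh
# Sketch: compare expected digests with the digests found in the mirror.

# Print images whose digest differs from the expected one, or that are
# missing from the mirror listing. $1 = expected file, $2 = actual file.
diff_digests() {
    awk '
        NR == FNR { want[$1] = $2; next }   # first file: expected digests
        { got[$1] = $2 }                    # second file: digests in mirror
        END { for (img in want) if (got[img] != want[img]) print img }
    ' "$1" "$2" | sort
}

# Read image references on stdin and print "image digest" lines using
# regctl (run this inside the air gap against the mirror).
record_digests() {
    while IFS= read -r img; do
        printf '%s %s\n' "$img" "$(regctl image digest "$img" < /dev/null)"
    done
}
```

An empty result from `diff_digests expected.txt actual.txt` means every expected image is present in the mirror with the recorded digest.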

Final Notes

Air-gapped Kubernetes is feasible but demands rigor in supply chain management and policy enforcement. Start small, automate mirror workflows, and test offline recovery paths before outages occur. In my experience, teams underestimate the operational load of maintaining internal registries, so plan for at least one FTE dedicated to dependency hygiene.

Source thread: Anyone managing K8s clusters with limited or no internet access? What’s your tooling like?
