Break & Fix Challenges
Intentionally break Helm deployments, then diagnose and fix them. The best way to learn debugging is to cause the bugs yourself.
🧒 Simple Explanation (ELI5)
Doctors learn surgery by practicing on simulations, not just reading textbooks. This lab gives you controlled failure scenarios: you'll deliberately inject broken values, bad templates, wrong versions, and failed upgrades, then fix each one. When these happen in production, you'll know exactly what to do.
🔧 Setup
# Create a base chart for all challenges
helm create breakfix
cd breakfix
# Install a working baseline
helm install breakfix . -n bf --create-namespace
kubectl get pods -n bf
# Should be READY 1/1
# Every challenge starts from this working state
🔴 Challenge 1: Wrong Image Tag
Deploy with a non-existent image tag.
# Break it: use a tag that doesn't exist
helm upgrade breakfix . -n bf --set image.tag=v999.999.999
# Observe the failure
kubectl get pods -n bf
# Print enough of the Events table to reach the image pull errors
kubectl describe pod -n bf | grep -A 10 "Events"
# → ErrImagePull / ImagePullBackOff
💚 Fix
# Option 1: Rollback to last working revision
helm rollback breakfix 1 -n bf
# Option 2: Upgrade with correct tag
helm upgrade breakfix . -n bf --set image.tag="1.25-alpine"
# Verify
kubectl get pods -n bf
# READY 1/1
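When a namespace holds many pods, scanning the STATUS column by eye gets tedious. As a minimal sketch, the helper below filters the default kubectl get pods table for the two image-pull states shown above. The name failing_image_pods is hypothetical (it is not a kubectl feature), and the sketch assumes the default column layout with STATUS in the third column.

```shell
# Hypothetical helper: reads `kubectl get pods` table output on stdin and
# prints the names of pods whose STATUS is ErrImagePull or ImagePullBackOff.
# Assumes the default columns: NAME READY STATUS RESTARTS AGE.
failing_image_pods() {
  awk 'NR > 1 && ($3 == "ErrImagePull" || $3 == "ImagePullBackOff") { print $1 }'
}
```

Usage: kubectl get pods -n bf | failing_image_pods. An empty result means no pod is currently stuck on an image pull.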
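🔴 Challenge 2: Invalid Value Types in values
Pass values of the wrong type. The file is valid YAML, but strings land where the chart and Kubernetes expect integers.
# Create a broken values file
cat > broken-values.yaml <<EOF
replicaCount: "not-a-number"
service:
  port: abc
EOF
# Try to upgrade
helm upgrade breakfix . -n bf -f broken-values.yaml
# → UPGRADE FAILED: with the default chart the templates still render,
#   but the manifests are rejected because replicas and port must be integers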
💚 Fix
# Step 1: Identify the error
helm template test . -f broken-values.yaml 2>&1
# A template error would name the failing file and line.
# With the default chart the render succeeds, so look for
# "replicas: not-a-number" and "port: abc" in the output.
# Step 2: Fix the values
cat > fixed-values.yaml <<EOF
replicaCount: 2
service:
  port: 80
EOF
# Step 3: Re-deploy with fixed values
helm upgrade breakfix . -n bf -f fixed-values.yaml
# Verify
kubectl get pods -n bf
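Helm validates values against a values.schema.json file when the chart ships one, which catches wrong types before anything reaches the cluster. The following is a minimal sketch, assuming the default helm create layout; the two constraints are illustrative and not a complete schema for the chart.

```shell
# Run from the chart root (the breakfix directory).
# The quoted 'EOF' keeps the shell from expanding "$schema".
cat > values.schema.json <<'EOF'
{
  "$schema": "https://json-schema.org/draft-07/schema#",
  "type": "object",
  "properties": {
    "replicaCount": { "type": "integer", "minimum": 0 },
    "service": {
      "type": "object",
      "properties": {
        "port": { "type": "integer", "minimum": 1, "maximum": 65535 }
      }
    }
  }
}
EOF
```

With this file in place, helm lint . -f broken-values.yaml should report a schema validation failure for replicaCount and service.port instead of letting the bad values through to the API server.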
🔴 Challenge 3: Template Syntax Error
Introduce a Go template syntax error.
# Add a broken template
cat > templates/broken.yaml <<'EOF'
apiVersion: v1
kind: ConfigMap
metadata:
  name: {{ include "breakfix.fullname" . }}-broken
data:
  value: {{ .Values.missing.nested.key }}
  bad_syntax: {{ if .Values.enabled }
EOF
# Try to render
helm template test .
# → "unexpected "}" in if" or nil pointer error
💚 Fix
# Fix 1: Use default for missing values
# Fix 2: Close the if block properly
cat > templates/broken.yaml <<'EOF'
apiVersion: v1
kind: ConfigMap
metadata:
  name: {{ include "breakfix.fullname" . }}-fixed
data:
  value: {{ .Values.missing | default "fallback" | quote }}
  enabled: {{ if .Values.enabled }}"yes"{{ else }}"no"{{ end }}
EOF
# Verify
helm template test .
helm lint .
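Before reaching for helm template, a crude pre-check can flag the kind of typo used in this challenge. The sketch below counts opening and closing action delimiters in a template file. It is only a heuristic, not a replacement for helm lint, and the function names count_delims and check_template_delims are hypothetical.

```shell
# Hypothetical helper: print how many times delimiter $1 occurs in file $2.
count_delims() {
  grep -oF -- "$1" "$2" | wc -l | tr -d ' '
}

# Hypothetical helper: report a file whose "{{" and "}}" counts differ.
# Returns non-zero on a mismatch so it can gate a script.
check_template_delims() {
  file="$1"
  opens=$(count_delims '{{' "$file")
  closes=$(count_delims '}}' "$file")
  if [ "$opens" -ne "$closes" ]; then
    echo "$file: $opens '{{' vs $closes '}}'"
    return 1
  fi
}
```

Running check_template_delims templates/broken.yaml against the broken file from this challenge reports a mismatch, because the if action is closed with a single brace. Balanced delimiters do not prove the template is valid, so still run helm lint afterwards.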
🔴 Challenge 4: Stuck in pending-upgrade
Simulate an interrupted upgrade that leaves the release in a broken state.
# Simulate: upgrade with a very short timeout and a bad image
helm upgrade breakfix . -n bf \
  --set image.repository=invalid/image \
  --timeout 10s \
  --wait
# → Fails, release might be in "pending-upgrade" or "failed"
# Check status
helm status breakfix -n bf
helm list -n bf
# STATUS column shows "failed" or "pending-upgrade"
# Try another upgrade; you may get:
# "Error: UPGRADE FAILED: another operation (install/upgrade/rollback) is in progress"
💚 Fix
# Option 1: Rollback to last successful revision
helm history breakfix -n bf
# Find the last "deployed" revision and roll back to it
# (1 is the baseline install from Setup)
helm rollback breakfix 1 -n bf
# Option 2: If rollback won't work (rare)
# Delete the pending release secret
kubectl get secrets -n bf -l owner=helm,status=pending-upgrade
# Use the secret name printed above; v3 stands for the stuck revision number
kubectl delete secret sh.helm.release.v1.breakfix.v3 -n bf
# Then retry the upgrade with correct values
helm upgrade breakfix . -n bf
# Verify clean state
helm list -n bf
# STATUS: deployed
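Both options above depend on knowing a revision number. As a rough sketch, the helpers below pull the last "deployed" revision out of the default helm history table and build the matching release secret name. The function names are hypothetical, and the parsing assumes the default table output rather than a stable machine-readable format.

```shell
# Hypothetical helper: reads `helm history` table output on stdin and prints
# the highest revision whose STATUS column reads "deployed".
last_deployed_revision() {
  awk 'NR > 1 { for (i = 2; i <= NF; i++) if ($i == "deployed") rev = $1 }
       END { if (rev != "") print rev }'
}

# Hypothetical helper: Helm 3 stores each revision in a secret named
# sh.helm.release.v1.<release>.v<revision>.
release_secret_name() {
  printf 'sh.helm.release.v1.%s.v%s\n' "$1" "$2"
}
```

Usage: helm history breakfix -n bf | last_deployed_revision gives the revision to pass to helm rollback, and release_secret_name breakfix 3 prints the secret name used in Option 2 for revision 3.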
🔴 Challenge 5: Values Not Being Applied
You set values but they don't show up in the deployed resources.
# Create a custom values file with a typo in the key name
cat > my-values.yaml <<EOF
replicacount: 5
service:
  Port: 9090
EOF
# Apply (note: these are case-sensitive!)
helm upgrade breakfix . -n bf -f my-values.yaml
# Check — still 1 replica, still port 80!
kubectl get deploy -n bf -o jsonpath='{.items[0].spec.replicas}'
kubectl get svc -n bf -o jsonpath='{.items[0].spec.ports[0].port}'
💚 Fix
# The issue: YAML keys are case-sensitive!
# "replicacount" ≠ "replicaCount"
# "Port" ≠ "port"
# Check what values Helm actually used:
helm get values breakfix -n bf --all | grep -i replica
helm get values breakfix -n bf --all | grep -i port
# Fix: Use correct case
cat > my-values.yaml <<EOF
replicaCount: 5
service:
  port: 9090
EOF
helm upgrade breakfix . -n bf -f my-values.yaml
# Verify
kubectl get deploy -n bf -o jsonpath='{.items[0].spec.replicas}' # → 5
kubectl get svc -n bf -o jsonpath='{.items[0].spec.ports[0].port}' # → 9090
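Helm silently accepts unknown keys, so a typo like replicacount never raises an error. As a minimal sketch, the helper below compares the top-level keys of an override file against the chart's values.yaml and suggests a case-insensitive match. The function names are hypothetical, and the check covers top-level keys only, so the nested Port typo would need a real YAML parser or a values.schema.json.

```shell
# Hypothetical helper: print the top-level keys of a YAML file
# (unindented lines of the form "key: ...").
top_level_keys() {
  sed -n 's/^\([A-Za-z_][A-Za-z0-9_-]*\):.*/\1/p' "$1"
}

# Hypothetical helper: report override keys that the chart does not define.
# $1 = chart values.yaml, $2 = override file. Returns non-zero if any key is unknown.
check_value_keys() {
  chart_values="$1"
  overrides="$2"
  rc=0
  for key in $(top_level_keys "$overrides"); do
    if top_level_keys "$chart_values" | grep -qxF -- "$key"; then
      continue
    fi
    match=$(top_level_keys "$chart_values" | grep -ixF -- "$key" | head -n 1)
    if [ -n "$match" ]; then
      echo "unknown key '$key' (did you mean '$match'?)"
    else
      echo "unknown key '$key'"
    fi
    rc=1
  done
  return "$rc"
}
```

Usage: check_value_keys values.yaml my-values.yaml, run from the chart root before helm upgrade.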
🔴 Challenge 6: Missing Dependency
Add a dependency but forget to download it.
# Add dependency to Chart.yaml
cat >> Chart.yaml <<EOF
dependencies:
  - name: redis
    version: "17.15.0"
    repository: "https://charts.bitnami.com/bitnami"
EOF
# Try to install WITHOUT running dependency update
helm upgrade breakfix . -n bf
# → "Error: found in Chart.yaml, but missing in charts/ directory: redis"
💚 Fix
# Step 1: Add the repo (if not already)
helm repo add bitnami https://charts.bitnami.com/bitnami
# Step 2: Download dependencies
helm dependency update .
# Step 3: Verify
ls charts/
# redis-17.15.0.tgz
helm dependency list .
# Step 4: Re-deploy
helm upgrade breakfix . -n bf
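A deploy script can catch this situation before calling Helm. The sketch below lists dependencies declared in Chart.yaml that have no archive or directory under charts/. The function name missing_dependencies is hypothetical, and the parsing assumes the simple dependencies layout used in this challenge; helm dependency list remains the authoritative check.

```shell
# Hypothetical helper: print each dependency named in $1/Chart.yaml that has
# neither a <name>-*.tgz archive nor a <name>/ directory under $1/charts.
missing_dependencies() {
  chart_dir="$1"
  awk '/^dependencies:/ { in_deps = 1; next }
       /^[^ -]/         { in_deps = 0 }
       in_deps && /^ *-? *name:/ { gsub(/"/, "", $NF); print $NF }' \
    "$chart_dir/Chart.yaml" |
  while read -r name; do
    ls "$chart_dir"/charts/"$name"-*.tgz >/dev/null 2>&1 ||
      [ -d "$chart_dir/charts/$name" ] ||
      echo "$name"
  done
}
```

Usage: missing_dependencies . prints redis until helm dependency update . has been run, and prints nothing afterwards.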
🔴 Challenge 7: Name Collision
Try to install a release with an existing name in a different namespace.
# Install in namespace bf (already exists)
# Try installing same release name in bf2
helm install breakfix . -n bf2 --create-namespace
# This actually works! Release names are scoped to namespace.
# But try installing in the SAME namespace:
helm install breakfix . -n bf
# → "Error: INSTALLATION FAILED: cannot re-use a name that is still in use"
💚 Fix
# Option 1: Use a different release name
helm install breakfix-v2 . -n bf
# Option 2: Upgrade the existing release
helm upgrade breakfix . -n bf
# Option 3: Use upgrade --install (idempotent)
helm upgrade --install breakfix . -n bf
# Key insight: helm upgrade --install is almost always
# preferred over helm install in scripts and CI/CD
🔴 Challenge 8: RBAC Permission Denied
Deploy as a user/service account that lacks permissions on the target namespace.
# Create a limited service account
kubectl create namespace restricted
kubectl create serviceaccount helm-deployer -n restricted
# Create a Role with ONLY get/list (no create/update)
kubectl create role viewer --verb=get,list --resource=pods,deployments,services -n restricted
kubectl create rolebinding helm-deployer-viewer --role=viewer \
  --serviceaccount=restricted:helm-deployer -n restricted
# Try to deploy as the limited service account
# (impersonation works only if your own user is allowed to impersonate)
helm install breakfix . -n restricted \
  --kube-as-user system:serviceaccount:restricted:helm-deployer
# → Error: a "forbidden" message, for example:
# deployments.apps is forbidden: User cannot create resource "deployments" in namespace "restricted"
# (Helm may stop even earlier, on the release secrets it cannot read or create)
💚 Fix
# Step 1: Check what permissions are needed
kubectl auth can-i create deployments -n restricted \
  --as system:serviceaccount:restricted:helm-deployer
# no
# Step 2: Create a proper Role with Helm's minimum permissions
kubectl create role helm-deploy \
  --verb=get,list,watch,create,update,patch,delete \
  --resource=pods,deployments,services,configmaps,secrets,serviceaccounts \
  -n restricted
kubectl create rolebinding helm-deployer-deploy \
  --role=helm-deploy \
  --serviceaccount=restricted:helm-deployer \
  -n restricted
# Step 3: Verify permissions
kubectl auth can-i create deployments -n restricted \
  --as system:serviceaccount:restricted:helm-deployer
# yes
# Step 4: Retry deploy as the service account
helm install breakfix . -n restricted \
  --kube-as-user system:serviceaccount:restricted:helm-deployer
# Key insight: Helm needs create/update/delete on ALL resource types
# that appear in your chart templates, plus secrets for its release records
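Checking one verb at a time gets slow once a chart contains several resource types. The sketch below loops kubectl auth can-i over a small matrix and prints every missing permission. The function name check_deploy_permissions and the resource list are assumptions for illustration; adjust the list to the kinds your chart actually renders.

```shell
# Hypothetical helper: $1 = namespace, $2 = subject to impersonate
# (for example system:serviceaccount:restricted:helm-deployer).
# Relies on `kubectl auth can-i` exiting non-zero when the answer is "no".
check_deploy_permissions() {
  namespace="$1"
  subject="$2"
  missing=0
  for resource in deployments services serviceaccounts secrets; do
    for verb in get create update patch delete; do
      if ! kubectl auth can-i "$verb" "$resource" -n "$namespace" \
           --as "$subject" >/dev/null 2>&1; then
        echo "missing: $verb $resource"
        missing=1
      fi
    done
  done
  return "$missing"
}
```

Usage: check_deploy_permissions restricted system:serviceaccount:restricted:helm-deployer. No output and a zero exit status mean every checked verb is allowed.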
🔴 Challenge 9: Resource Quota Exceeded
Deploy to a namespace with tight resource quotas that block pod creation.
# Create a namespace with very tight quotas
kubectl create namespace quota-demo
kubectl apply -n quota-demo -f - <<EOF
apiVersion: v1
kind: ResourceQuota
metadata:
  name: tight-quota
spec:
  hard:
    requests.cpu: "100m"
    requests.memory: "128Mi"
    limits.cpu: "200m"
    limits.memory: "256Mi"
    pods: "2"
EOF
# Deploy requesting more than the quota allows
helm install breakfix . -n quota-demo \
  --set replicaCount=3 \
  --set resources.requests.cpu=200m \
  --set resources.requests.memory=256Mi \
  --wait --timeout 30s
# → The quota rejects the pods at creation, so none appear; helm eventually times out
💚 Fix
# Step 1: Find why no pods are running
kubectl get events -n quota-demo --sort-by='.lastTimestamp'
# Look for FailedCreate events on the ReplicaSet, for example:
# "exceeded quota: tight-quota, requested: requests.cpu=200m,
# used: requests.cpu=0, limited: requests.cpu=100m"
# Step 2: Check current quota usage
kubectl describe resourcequota tight-quota -n quota-demo
# Step 3: Fix by reducing resource requests to fit the quota
helm upgrade --install breakfix . -n quota-demo \
  --set replicaCount=1 \
  --set resources.requests.cpu=50m \
  --set resources.requests.memory=64Mi \
  --set resources.limits.cpu=100m \
  --set resources.limits.memory=128Mi
# Verify
kubectl get pods -n quota-demo
kubectl describe resourcequota tight-quota -n quota-demo
# Key insight: ResourceQuotas are enforced at the K8s API level.
# Helm won't warn you; the pods are rejected when the ReplicaSet tries to create them.
# Use values.schema.json to validate resource requests at helm time.
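The quota applies to the sum of requests across all pods in the namespace, so the arithmetic is worth doing before deploying: 3 replicas at 200m each ask for 600m against a 100m limit. A minimal sketch of that check follows; the function names are hypothetical, and it handles only millicore values (such as 200m) and whole cores (such as 2), not fractional cores.

```shell
# Hypothetical helper: convert a CPU quantity to millicores.
# Handles "200m" and whole cores like "2"; fractional cores are not supported.
to_millicores() {
  case "$1" in
    *m) echo "${1%m}" ;;
    *)  echo "$(( $1 * 1000 ))" ;;
  esac
}

# Hypothetical helper: $1 = replicas, $2 = per-pod CPU request, $3 = quota.
# Prints the totals and returns non-zero when the request exceeds the quota.
cpu_request_fits() {
  replicas="$1"
  per_pod=$(to_millicores "$2")
  quota=$(to_millicores "$3")
  total=$(( replicas * per_pod ))
  echo "total=${total}m quota=${quota}m"
  [ "$total" -le "$quota" ]
}
```

For the broken install, cpu_request_fits 3 200m 100m prints total=600m quota=100m and fails; for the fix, cpu_request_fits 1 50m 100m passes. The same quota also caps the namespace at 2 pods, so 3 replicas would not fit even with smaller requests.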
🧹 Full Cleanup
helm uninstall breakfix -n bf 2>/dev/null
helm uninstall breakfix -n bf2 2>/dev/null
helm uninstall breakfix-v2 -n bf 2>/dev/null
# Deleting the namespaces also removes the releases from Challenges 8 and 9
kubectl delete ns bf bf2 restricted quota-demo 2>/dev/null
cd .. && rm -rf breakfix broken-values.yaml fixed-values.yaml my-values.yaml
📝 Summary
- Wrong image → helm rollback or fix the tag and upgrade
- Template errors → helm template to see the exact error
- Stuck release → helm rollback or delete pending secrets
- Values not applied → Check case-sensitivity, run helm get values
- Missing dependency → helm dependency update
- Name collision → Use helm upgrade --install always
- RBAC denied → Check kubectl auth can-i, grant minimum required permissions
- Quota exceeded → kubectl describe resourcequota, reduce requests or increase quota