Hands-on Lesson 12 of 14

Break & Fix Challenges

Intentionally break Helm deployments, then diagnose and fix them. The best way to learn debugging is to cause the bugs yourself.

🧒 Simple Explanation (ELI5)

Doctors learn surgery by practicing on simulations, not just reading textbooks. This lab gives you controlled failure scenarios: you'll deliberately inject broken values, bad templates, wrong versions, and failed upgrades, then fix each one. When these happen in production, you'll know exactly what to do.

🔧 Setup

bash
# Create a base chart for all challenges
helm create breakfix
cd breakfix

# Install a working baseline
helm install breakfix . -n bf --create-namespace
kubectl get pods -n bf  # Should be READY 1/1

# Every challenge starts from this working state

🔴 Challenge 1: Wrong Image Tag

The Break

Deploy with a non-existent image tag.

bash
# Break it: use a tag that doesn't exist
helm upgrade breakfix . -n bf --set image.tag=v999.999.999

# Observe the failure
kubectl get pods -n bf
kubectl describe pod -n bf | grep -A 10 "Events"
# → ErrImagePull / ImagePullBackOff

💚 Fix
bash
# Option 1: Rollback to last working revision
helm rollback breakfix 1 -n bf

# Option 2: Upgrade with correct tag
helm upgrade breakfix . -n bf --set image.tag="1.25-alpine"

# Verify
kubectl get pods -n bf  # READY 1/1
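
To spot this failure without reading the full `describe` output, a small filter can pick out pods whose containers are waiting on an image pull. This is only a sketch: `image_pull_failures` is a hypothetical helper name, and it assumes input lines of the form `<pod-name> <waiting-reason>`, which the `kubectl` jsonpath query in the trailing comment should produce.

```shell
# Hypothetical helper: reads "<pod> <reason>" lines on stdin and prints
# the pods whose waiting reason indicates an image pull failure.
image_pull_failures() {
  awk '$2 == "ErrImagePull" || $2 == "ImagePullBackOff" { print $1 }'
}

# Example usage against the lab namespace (requires a cluster):
# kubectl get pods -n bf \
#   -o jsonpath='{range .items[*]}{.metadata.name}{" "}{.status.containerStatuses[0].state.waiting.reason}{"\n"}{end}' \
#   | image_pull_failures
```

An empty result means no pod in the namespace is currently blocked on an image pull.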

🔴 Challenge 2: Invalid Values

The Break

Pass values of the wrong type. The YAML itself parses and the templates render, but the resulting manifests fail Kubernetes validation.

bash
# Create a broken values file
cat > broken-values.yaml <<EOF
replicaCount: "not-a-number"
service:
  port: abc
EOF

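# Try to upgrade
helm upgrade breakfix . -n bf -f broken-values.yaml
# → Fails: the templates render, but Kubernetes rejects the manifests
#   because replicas and port must be integers

💚 Fix
bash
# Step 1: Render locally and inspect the offending fields
helm template test . -f broken-values.yaml | grep -inE 'replicas:|port:'
# Rendering succeeds; the output shows the non-numeric values that get rejected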

# Step 2: Fix the values
cat > fixed-values.yaml <<EOF
replicaCount: 2
service:
  port: 80
EOF

# Step 3: Re-deploy with fixed values
helm upgrade breakfix . -n bf -f fixed-values.yaml

# Verify
kubectl get pods -n bf
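
One way to catch this class of mistake before anything reaches the cluster is a `values.schema.json` file, which Helm checks during `lint`, `template`, `install`, and `upgrade`. The schema below is a minimal sketch that constrains only the two fields broken in this challenge; the numeric bounds are assumptions chosen for illustration, not part of the generated chart.

```shell
# Write a minimal JSON Schema next to Chart.yaml (run inside the chart directory).
# Only replicaCount and service.port are constrained; every other value
# remains unconstrained. The bounds are illustrative assumptions.
cat > values.schema.json <<'EOF'
{
  "$schema": "https://json-schema.org/draft-07/schema#",
  "type": "object",
  "properties": {
    "replicaCount": { "type": "integer", "minimum": 0 },
    "service": {
      "type": "object",
      "properties": {
        "port": { "type": "integer", "minimum": 1, "maximum": 65535 }
      }
    }
  }
}
EOF

# With the schema in place, this should fail locally with a type error:
# helm lint . -f broken-values.yaml
```

Because the schema lives in the chart, everyone who installs it gets the same early check without extra tooling.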

🔴 Challenge 3: Template Syntax Error

The Break

Introduce a Go template syntax error, together with a reference to a nested value that does not exist.

bash
# Add a broken template
cat > templates/broken.yaml <<'EOF'
apiVersion: v1
kind: ConfigMap
metadata:
  name: {{ include "breakfix.fullname" . }}-broken
data:
  value: {{ .Values.missing.nested.key }}
  bad_syntax: {{ if .Values.enabled }
EOF

# Try to render
helm template test .
# → Parse error such as: unexpected "}" in if
#   (a nil pointer error on .Values.missing.nested follows once the syntax is fixed)

💚 Fix
bash
# Fix 1: Use default for missing values
# Fix 2: Close the if block properly

cat > templates/broken.yaml <<'EOF'
apiVersion: v1
kind: ConfigMap
metadata:
  name: {{ include "breakfix.fullname" . }}-fixed
data:
  value: {{ .Values.missing | default "fallback" | quote }}
  enabled: {{ if .Values.enabled }}"yes"{{ else }}"no"{{ end }}
EOF

# Verify
helm template test .
helm lint .
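
The two verification commands above can be wrapped into a pre-flight check that runs before every upgrade. This is a sketch under assumptions: `preflight` and the throwaway release name `preflight-test` are hypothetical names introduced here for illustration.

```shell
# Hypothetical pre-flight check: lint the chart, then render it and discard
# the output. Returns non-zero if either step fails.
preflight() {
  chart_dir=${1:-.}
  helm lint "$chart_dir" || return 1
  helm template preflight-test "$chart_dir" > /dev/null || return 1
  echo "preflight ok: $chart_dir"
}

# Example: only upgrade when the chart lints and renders cleanly
# preflight . && helm upgrade breakfix . -n bf
```

Both steps run locally, so a broken template is caught without touching the release.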

🔴 Challenge 4: Stuck in pending-upgrade

The Break

Simulate an interrupted upgrade that leaves the release in a broken state.

bash
# Simulate: upgrade with a very short timeout and a bad image
helm upgrade breakfix . -n bf \
  --set image.repository=invalid/image \
  --timeout 10s \
  --wait
# → Fails on timeout; the release ends up "failed", or "pending-upgrade"
#   if the Helm process is killed before it can record the failure

# Check status
helm status breakfix -n bf
helm list -n bf  # STATUS column shows "failed" or "pending-upgrade"

# Try another upgrade. A release stuck in "pending-upgrade" may return:
# "Error: UPGRADE FAILED: another operation (install/upgrade/rollback) is in progress"

💚 Fix
bash
# Option 1: Rollback to the last successful revision
helm history breakfix -n bf    # Find the last "deployed" revision
helm rollback breakfix 1 -n bf # Revision 1 is the baseline; substitute the revision you found

# Option 2: If rollback won't work (rare)
# Delete the Secret that holds the stuck release record
kubectl get secrets -n bf -l owner=helm,name=breakfix,status=pending-upgrade
# Substitute the Secret name printed above (v3 is only an example)
kubectl delete secret sh.helm.release.v1.breakfix.v3 -n bf

# Then retry the upgrade with correct values
helm upgrade breakfix . -n bf

# Verify clean state
helm list -n bf   # STATUS: deployed
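
Finding the last "deployed" revision by eye gets tedious once the history grows. The helper below is a rough sketch that reads the table printed by `helm history` and returns the matching revision number. `last_deployed_revision` is a hypothetical name, and the parsing assumes the default table layout with the revision in the first column.

```shell
# Hypothetical helper: reads `helm history` table output on stdin and prints
# the highest revision whose row contains the status "deployed".
# Exits non-zero when no such revision exists.
last_deployed_revision() {
  awk 'NR > 1 {
         for (i = 2; i <= NF; i++)
           if ($i == "deployed") { rev = $1; break }
       }
       END { if (rev == "") exit 1; print rev }'
}

# Example (requires the lab release):
# rev=$(helm history breakfix -n bf | last_deployed_revision) &&
#   helm rollback breakfix "$rev" -n bf
```

Parsing table output is fragile; if `jq` is available, `helm history -o json` is a sturdier input for the same idea.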

🔴 Challenge 5: Values Not Being Applied

The Break

You set values but they don't show up in the deployed resources.

bash
# Create a custom values file with a typo in the key name
cat > my-values.yaml <<EOF
replicacount: 5
service:
  Port: 9090
EOF

# Apply (note: these are case-sensitive!)
helm upgrade breakfix . -n bf -f my-values.yaml

# Check: still 1 replica, still port 80!
kubectl get deploy -n bf -o jsonpath='{.items[0].spec.replicas}'
kubectl get svc -n bf -o jsonpath='{.items[0].spec.ports[0].port}'

💚 Fix
bash
# The issue: YAML keys are case-sensitive!
# "replicacount" ≠ "replicaCount"
# "Port" ≠ "port"

# Check what values Helm actually used:
helm get values breakfix -n bf --all | grep -i replica
helm get values breakfix -n bf --all | grep -i port

# Fix: Use correct case
cat > my-values.yaml <<EOF
replicaCount: 5
service:
  port: 9090
EOF

helm upgrade breakfix . -n bf -f my-values.yaml

# Verify
kubectl get deploy -n bf -o jsonpath='{.items[0].spec.replicas}'  # → 5
kubectl get svc -n bf -o jsonpath='{.items[0].spec.ports[0].port}'  # → 9090
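
Misspelled keys like these can be flagged before deploying by comparing an override file against the chart's own `values.yaml`. The function below is a simplified sketch: `unknown_top_level_keys` is a hypothetical helper, and it only inspects unindented top-level keys, so it would catch `replicacount` but not the nested `service.Port`.

```shell
# Hypothetical helper: print top-level keys that appear in an override file
# but not in the chart's default values file. Nested keys are not checked.
unknown_top_level_keys() {
  defaults=$1
  overrides=$2
  # A top-level key is an unindented "name:" at the start of a line.
  known=$(sed -n 's/^\([A-Za-z_][A-Za-z0-9_-]*\):.*/\1/p' "$defaults")
  sed -n 's/^\([A-Za-z_][A-Za-z0-9_-]*\):.*/\1/p' "$overrides" |
    while read -r key; do
      printf '%s\n' "$known" | grep -qxF -- "$key" || echo "$key"
    done
}

# Example, run inside the chart directory:
# unknown_top_level_keys values.yaml my-values.yaml
```

Any key it prints is one the chart's defaults never define, which usually means a typo or wrong case.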

🔴 Challenge 6: Missing Dependency

The Break

Add a dependency but forget to download it.

bash
# Add dependency to Chart.yaml
cat >> Chart.yaml <<EOF

dependencies:
  - name: redis
    version: "17.15.0"
    repository: "https://charts.bitnami.com/bitnami"
EOF

# Try to upgrade WITHOUT running dependency update
helm upgrade breakfix . -n bf
# → "Error: found in Chart.yaml, but missing in charts/ directory: redis"

💚 Fix
bash
# Step 1: Add the repo (if not already)
helm repo add bitnami https://charts.bitnami.com/bitnami

# Step 2: Download dependencies
helm dependency update .

# Step 3: Verify
ls charts/      # redis-17.15.0.tgz
helm dependency list .

# Step 4: Re-deploy
helm upgrade breakfix . -n bf
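
The same check can be scripted so that a deploy pipeline stops before the upgrade. The filter below is a sketch: `missing_dependencies` is a hypothetical name, and it assumes the `NAME VERSION REPOSITORY STATUS` column order that `helm dependency list` prints.

```shell
# Hypothetical helper: reads `helm dependency list` output on stdin and
# prints the name of every dependency whose STATUS column is "missing".
missing_dependencies() {
  awk 'NR > 1 && $4 == "missing" { print $1 }'
}

# Example, run inside the chart directory:
# helm dependency list . | missing_dependencies
```

An empty result means no declared dependency is reported as missing from `charts/`.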

🔴 Challenge 7: Name Collision

The Break

Try to install a release whose name is already in use, first in a different namespace and then in the same one.

bash
# The release "breakfix" already exists in namespace bf
# Try installing the same release name in bf2
helm install breakfix . -n bf2 --create-namespace
# This actually works! Release names are scoped to namespace.

# But try installing in the SAME namespace:
helm install breakfix . -n bf
# → "Error: INSTALLATION FAILED: cannot re-use a name that is still in use"

💚 Fix
bash
# Option 1: Use a different release name
helm install breakfix-v2 . -n bf

# Option 2: Upgrade the existing release
helm upgrade breakfix . -n bf

# Option 3: Use upgrade --install (idempotent)
helm upgrade --install breakfix . -n bf

# Key insight: helm upgrade --install is almost always
# preferred over helm install in scripts and CI/CD
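
For scripts and CI/CD, the idempotent form is easy to wrap in a small function. This is a sketch, not a prescribed pipeline step: `deploy` is a hypothetical name and the 2 minute timeout is an arbitrary assumption.

```shell
# Hypothetical CI helper: installs the release if it is absent and upgrades
# it otherwise, so the same command is safe to run repeatedly.
deploy() {
  release=$1
  chart=$2
  namespace=$3
  shift 3
  # Remaining arguments (for example -f my-values.yaml) are passed through.
  helm upgrade --install "$release" "$chart" \
    --namespace "$namespace" --create-namespace \
    --wait --timeout 2m "$@"
}

# Example:
# deploy breakfix . bf --set replicaCount=2
```

Running it twice in a row produces an install followed by an upgrade instead of a name-collision error.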

🔴 Challenge 8: RBAC Permission Denied

The Break

Deploy as a user/service account that lacks permissions on the target namespace.

bash
# Create a limited service account
kubectl create namespace restricted
kubectl create serviceaccount helm-deployer -n restricted

# Create a Role with ONLY get/list (no create/update)
kubectl create role viewer --verb=get,list --resource=pods,deployments,services -n restricted
kubectl create rolebinding helm-deployer-viewer --role=viewer \
  --serviceaccount=restricted:helm-deployer -n restricted

# Try to deploy as the limited service account by impersonating it
# (your own user must be allowed to impersonate service accounts)
helm install breakfix . -n restricted \
  --kube-as-user system:serviceaccount:restricted:helm-deployer
# → Fails with a "forbidden" error: the service account cannot create
#   the resources Helm needs in namespace "restricted"

🟢 Fix
bash
# Step 1: Check what permissions are needed
kubectl auth can-i create deployments -n restricted --as system:serviceaccount:restricted:helm-deployer
# no

# Step 2: Create a proper Role with Helm's minimum permissions
kubectl create role helm-deploy \
  --verb=get,list,watch,create,update,patch,delete \
  --resource=pods,deployments,services,configmaps,secrets,serviceaccounts \
  -n restricted

kubectl create rolebinding helm-deployer-deploy \
  --role=helm-deploy \
  --serviceaccount=restricted:helm-deployer \
  -n restricted

# Step 3: Verify permissions
kubectl auth can-i create deployments -n restricted \
  --as system:serviceaccount:restricted:helm-deployer
# yes

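# Step 4: Retry the deploy as the service account
helm install breakfix . -n restricted \
  --kube-as-user system:serviceaccount:restricted:helm-deployer

# Key insight: Helm needs create/update/delete on ALL resource types
# that appear in your chart templates (including subcharts such as the
# redis dependency from Challenge 6), plus Secrets for its release records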
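
Checking one verb at a time does not scale to a whole chart. The loop below is a sketch that runs `kubectl auth can-i` for each verb and resource and prints only the combinations that are denied. `missing_permissions` is a hypothetical helper, and it relies on `kubectl auth can-i` exiting non-zero when the answer is "no".

```shell
# Hypothetical helper: print every "<verb> <resource>" pair the given user
# is NOT allowed to perform in the namespace.
# Usage: missing_permissions <namespace> <user> <resource>...
missing_permissions() {
  namespace=$1
  user=$2
  shift 2
  for resource in "$@"; do
    for verb in get list create update patch delete; do
      kubectl auth can-i "$verb" "$resource" -n "$namespace" --as "$user" \
        > /dev/null 2>&1 || echo "$verb $resource"
    done
  done
}

# Example (requires a cluster and impersonation rights):
# missing_permissions restricted system:serviceaccount:restricted:helm-deployer \
#   deployments services secrets serviceaccounts
```

No output means the account holds every checked permission; any printed line is a rule still missing from the Role.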

🔴 Challenge 9: Resource Quota Exceeded

The Break

Deploy to a namespace with tight resource quotas that block pod creation.

bash
# Create a namespace with very tight quotas
kubectl create namespace quota-demo
kubectl apply -n quota-demo -f - <<EOF
apiVersion: v1
kind: ResourceQuota
metadata:
  name: tight-quota
spec:
  hard:
    requests.cpu: "100m"
    requests.memory: "128Mi"
    limits.cpu: "200m"
    limits.memory: "256Mi"
    pods: "2"
EOF

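# Deploy requesting more than the quota allows
# (limits are set too, because this quota requires every pod to declare them)
helm install breakfix . -n quota-demo \
  --set replicaCount=3 \
  --set resources.requests.cpu=200m \
  --set resources.requests.memory=256Mi \
  --set resources.limits.cpu=200m \
  --set resources.limits.memory=256Mi \
  --wait --timeout 30s
# → The quota rejects the pods, so none are created and helm eventually times out

🟢 Fix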
bash
# Step 1: Find out why no pods are running
kubectl get events -n quota-demo --sort-by='.lastTimestamp'
# Look for a FailedCreate event that names the quota, for example:
# "exceeded quota: tight-quota, requested: requests.cpu=200m,
#  used: requests.cpu=0, limited: requests.cpu=100m"

# Step 2: Check current quota usage
kubectl describe resourcequota tight-quota -n quota-demo

# Step 3: Fix by reducing resource requests to fit the quota
helm upgrade --install breakfix . -n quota-demo \
  --set replicaCount=1 \
  --set resources.requests.cpu=50m \
  --set resources.requests.memory=64Mi \
  --set resources.limits.cpu=100m \
  --set resources.limits.memory=128Mi

# Verify
kubectl get pods -n quota-demo
kubectl describe resourcequota tight-quota -n quota-demo

# Key insight: ResourceQuotas are enforced by the Kubernetes API server.
# Helm won't warn you; over-quota pods are rejected and never created.
# Use values.schema.json to validate resource requests at install/upgrade time.
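
The quota arithmetic is worth doing before deploying: total request = replicas × per-pod request, and the total must not exceed the quota. The function below is a sketch with a hypothetical name, `fits_cpu_quota`. It takes plain millicore integers and ignores anything already running in the namespace, so treat it as a lower bound on what the cluster will actually count.

```shell
# Hypothetical helper: does replicas x per-pod CPU request fit the quota?
# All CPU values are plain millicore integers (200 means 200m).
# Existing usage in the namespace is NOT taken into account.
fits_cpu_quota() {
  replicas=$1
  per_pod_m=$2
  quota_m=$3
  total_m=$((replicas * per_pod_m))
  echo "requested ${total_m}m of ${quota_m}m"
  [ "$total_m" -le "$quota_m" ]
}

# The break: 3 replicas x 200m = 600m against a 100m quota
# fits_cpu_quota 3 200 100   # non-zero exit status
# The fix:   1 replica  x 50m  = 50m against a 100m quota
# fits_cpu_quota 1 50 100    # zero exit status
```

The same calculation applies to memory and to limits; each one has its own line in the quota.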

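🧹 Full Cleanup

bash
helm uninstall breakfix -n bf 2>/dev/null
helm uninstall breakfix -n bf2 2>/dev/null
helm uninstall breakfix-v2 -n bf 2>/dev/null
helm uninstall breakfix -n restricted 2>/dev/null
helm uninstall breakfix -n quota-demo 2>/dev/null
kubectl delete ns bf bf2 restricted quota-demo 2>/dev/null
# The values files were created inside breakfix/, so removing the chart removes them too
cd .. && rm -rf breakfix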

๐Ÿ“ Summary

โ† Back to Helm Course