Advanced Lesson 10 of 14

Helm in CI/CD

Integrate Helm into automated pipelines using GitHub Actions, Helmfile, ArgoCD, and production-ready deployment patterns.

🧒 Simple Explanation (ELI5)

Right now you're running helm install from your laptop. In real teams, nobody deploys manually. Instead, a robot (CI/CD pipeline) builds your code, packages it in a container, and then uses Helm to deploy it — automatically, every time code is merged. ArgoCD takes it further: it watches your Git repo and keeps the cluster in sync.

🔧 Technical Explanation

The Production Deploy Command

bash
# The "golden" CI/CD deployment command
helm upgrade --install myapp ./chart \
  --namespace production \
  --create-namespace \
  --values values-prod.yaml \
  --set image.tag=$CI_COMMIT_SHA \
  --atomic \
  --timeout 5m \
  --wait
| Flag | Purpose |
|------|---------|
| upgrade --install | Idempotent: installs if the release is new, upgrades if it exists |
| --atomic | Auto-rollback if the upgrade fails |
| --wait | Wait for all resources to become ready |
| --timeout | Maximum wait time before the release is marked failed |
| --set image.tag=$SHA | Pin the deploy to the exact build image |
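The image.tag pin deserves emphasis: tags derived from the commit SHA are immutable and reproducible. A minimal sketch of deriving one in bash (variable names are illustrative; CI_COMMIT_SHA is whatever your CI system exports, and the fallback value exists only so the script also runs outside CI):

```shell
# Derive an immutable image tag from the commit SHA (first 12 characters),
# falling back to a placeholder when no CI variable is set.
sha="${CI_COMMIT_SHA:-0123456789abcdef0123456789abcdef01234567}"
tag="${sha:0:12}"
image="myregistry/myapp:$tag"
echo "$image"
```

The truncated SHA stays unique in practice while keeping registry listings readable; the full SHA works just as well if you prefer it.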

GitHub Actions Example

yaml
# .github/workflows/deploy.yml
name: Deploy with Helm
on:
  push:
    branches: [main]

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Build & push image
        run: |
          docker build -t myregistry/myapp:${{ github.sha }} .
          docker push myregistry/myapp:${{ github.sha }}

      - name: Set up Helm
        uses: azure/setup-helm@v3
        with:
          version: v3.14.0

      - name: Configure kubeconfig
        uses: azure/k8s-set-context@v3
        with:
          method: kubeconfig
          kubeconfig: ${{ secrets.KUBECONFIG }}

      - name: Helm dependency update
        run: helm dependency update ./chart

      - name: Deploy
        run: |
          helm upgrade --install myapp ./chart \
            --namespace production \
            --values chart/values-prod.yaml \
            --set image.tag=${{ github.sha }} \
            --atomic \
            --timeout 5m

      - name: Run tests
        run: helm test myapp -n production --logs
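One gap worth closing in a workflow like the one above: two merges in quick succession can deploy concurrently and trip Helm's "another operation is in progress" lock. A GitHub Actions concurrency group (shown here as an assumed addition to the same workflow file) serializes deploy runs:

```yaml
# Assumed addition near the top of .github/workflows/deploy.yml
concurrency:
  group: deploy-production
  cancel-in-progress: false   # queue new runs; never cancel a deploy mid-flight
```

With cancel-in-progress set to false, later runs wait their turn instead of killing an in-flight helm upgrade.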

Helmfile — Multi-Release Management

yaml
# helmfile.yaml
repositories:
  - name: bitnami
    url: https://charts.bitnami.com/bitnami

releases:
  - name: postgres
    namespace: backend
    chart: bitnami/postgresql
    version: 12.12.10
    values:
      - values/postgres.yaml

  - name: redis
    namespace: backend
    chart: bitnami/redis
    version: 17.15.0
    values:
      - values/redis.yaml

  - name: myapp
    namespace: backend
    chart: ./charts/myapp
    values:
      - values/myapp-{{ .Environment.Name }}.yaml
    set:
      - name: image.tag
        value: {{ env "IMAGE_TAG" }}

environments:
  dev:
  staging:
  production:
bash
# Apply: diff first, then deploy only the releases that changed
helmfile -e production apply

# Diff before deploying (see what will change)
helmfile -e staging diff

# Sync: deploy every release, regardless of whether it changed
helmfile -e production sync

# Destroy everything
helmfile -e dev destroy
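In a pipeline these commands are typically chained into a promotion loop that stops at the first failing environment. A sketch of that control flow (helmfile is stubbed here so the loop runs anywhere; delete the stub in a real pipeline, and the environment names must match your helmfile.yaml):

```shell
# Stub so the loop logic can run without a cluster; remove for real use.
helmfile() { echo "helmfile $*"; }

for env in dev staging production; do
  echo "--- promoting to $env ---"
  helmfile -e "$env" apply || { echo "deploy to $env failed, stopping" >&2; exit 1; }
done
```

Because the loop exits on the first failure, a broken staging deploy never reaches production.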

ArgoCD + Helm (GitOps)

yaml
# ArgoCD Application manifest
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: myapp
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/myorg/myapp.git
    path: chart/
    targetRevision: main
    helm:
      valueFiles:
        - values-prod.yaml
      parameters:
        - name: image.tag
          value: "abc123"    # Updated by CI pipeline
  destination:
    server: https://kubernetes.default.svc
    namespace: production
  syncPolicy:
    automated:
      prune: true            # Delete resources removed from chart
      selfHeal: true         # Revert manual cluster changes
    syncOptions:
      - CreateNamespace=true
GitOps Flow with Helm + ArgoCD
  1. Developer pushes code
  2. CI builds the image and updates image.tag in Git
  3. ArgoCD detects the Git change
  4. ArgoCD renders the chart with helm template and applies the resulting manifests (it does not run helm upgrade, so the release won't appear in helm list)
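The "CI updates image.tag in Git" step is usually a small automated commit made by the pipeline. A sketch of that step with sed (file contents and tag value are illustrative; many teams use yq or argocd-image-updater for this instead):

```shell
# Simulate the values file the pipeline would patch
cat > values-prod.yaml <<'EOF'
image:
  repository: myregistry/myapp
  tag: "old123"
EOF

NEW_TAG="abc456"
# Replace the tag line in place (GNU sed; the indentation must match the file)
sed -i "s/^\(  tag: \).*/\1\"$NEW_TAG\"/" values-prod.yaml
grep 'tag:' values-prod.yaml
```

The pipeline then commits and pushes this change; ArgoCD picks the new tag up from Git rather than receiving it from the pipeline directly.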

Environment Promotion Pattern

bash
# Same chart, different values per environment
chart/
├── Chart.yaml
├── templates/
├── values.yaml            # Base defaults
├── values-dev.yaml        # Dev overrides
├── values-staging.yaml    # Staging overrides
└── values-prod.yaml       # Production overrides

# Dev deploy
helm upgrade --install myapp . -f values-dev.yaml -n dev

# Promote to staging (same chart version, different values)
helm upgrade --install myapp . -f values-staging.yaml -n staging

# Promote to production
helm upgrade --install myapp . -f values-prod.yaml -n production
💡
CI/CD Best Practices
  • Always use --atomic in pipelines — auto-rollback on failure
  • Pin chart versions and image tags (never use :latest)
  • Use helm diff plugin for review before apply
  • Store sensitive values in sealed secrets or external secret managers, not in Git
  • Run helm test after every deploy

K8s RBAC for CI/CD Service Accounts

Your CI/CD pipeline needs a Kubernetes ServiceAccount with just enough permissions to deploy — never cluster-admin.

yaml
# ci-rbac.yaml — Minimum RBAC for Helm CI/CD
apiVersion: v1
kind: ServiceAccount
metadata:
  name: helm-deployer
  namespace: production
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: helm-deployer
  namespace: production
rules:
  # Helm needs to manage release secrets
  - apiGroups: [""]
    resources: ["secrets"]
    verbs: ["get", "list", "create", "update", "patch", "delete"]
  # Helm needs to create/update workloads
  - apiGroups: ["", "apps", "batch"]
    resources: ["pods", "deployments", "services", "configmaps",
                "serviceaccounts", "jobs", "persistentvolumeclaims"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
  # For Ingress resources
  - apiGroups: ["networking.k8s.io"]
    resources: ["ingresses"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
  # For HPA
  - apiGroups: ["autoscaling"]
    resources: ["horizontalpodautoscalers"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: helm-deployer
  namespace: production
subjects:
  - kind: ServiceAccount
    name: helm-deployer
    namespace: production
roleRef:
  kind: Role
  name: helm-deployer
  apiGroup: rbac.authorization.k8s.io
bash
# Apply the RBAC manifests for the CI/CD service account
kubectl apply -f ci-rbac.yaml

# Verify the permissions are sufficient
kubectl auth can-i create deployments -n production \
  --as system:serviceaccount:production:helm-deployer
# yes

kubectl auth can-i create clusterroles \
  --as system:serviceaccount:production:helm-deployer
# no — good! Least privilege.
🔗
K8s Connection: Match RBAC to Chart Resources

Your CI/CD Role must cover every resource type your chart templates render. If you add an HPA template, add autoscaling/horizontalpodautoscalers to the Role; if you add a CronJob, add batch/cronjobs. A deploy fails with a "forbidden" error as soon as the ServiceAccount lacks permission for any rendered resource (with --atomic, the release then rolls back). Run helm template . | grep "kind:" | sort -u to list every resource type your chart produces.
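That grep pipeline can be exercised without a cluster; in this sketch a heredoc-style variable stands in for the output of helm template:

```shell
# Stand-in for `helm template .` output (three rendered resources)
manifests='apiVersion: apps/v1
kind: Deployment
---
apiVersion: v1
kind: Service
---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler'

# Same pipeline as in the text: unique resource kinds the chart renders
printf '%s\n' "$manifests" | grep "kind:" | sort -u
```

Each kind in that output must map to a rule in the CI/CD Role; a useful pre-deploy check is diffing this list against the resources your Role grants.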

Practical Secrets Management with External Secrets Operator

yaml
# In your Helm chart templates:
# templates/external-secret.yaml
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: {{ include "myapp.fullname" . }}-secret
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: vault-backend          # Points to your Vault/AWS/Azure secret store
    kind: ClusterSecretStore
  target:
    name: {{ include "myapp.fullname" . }}-secret
    creationPolicy: Owner
  data:
    - secretKey: DATABASE_URL
      remoteRef:
        key: myapp/production     # Path in Vault
        property: database_url
    - secretKey: STRIPE_API_KEY
      remoteRef:
        key: myapp/production
        property: stripe_key
yaml
# In your deployment template, reference the K8s Secret created by ESO:
env:
  - name: DATABASE_URL
    valueFrom:
      secretKeyRef:
        name: {{ include "myapp.fullname" . }}-secret
        key: DATABASE_URL
  - name: STRIPE_API_KEY
    valueFrom:
      secretKeyRef:
        name: {{ include "myapp.fullname" . }}-secret
        key: STRIPE_API_KEY

# The flow:
# Vault/AWS Secrets Manager → External Secrets Operator → K8s Secret → Pod env
# No secrets in Git, no secrets in values files, no --set with secrets
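A cheap complement in CI: fail the pipeline if anything secret-shaped lands in a values file. A sketch (the pattern list is illustrative, not exhaustive; real setups use dedicated scanners such as gitleaks or trufflehog):

```shell
check_secrets() {
  # grep exits 0 on a match, which here means a likely leaked secret
  if grep -qiE '(password|api[_-]?key|sk_live|AKIA)' "$1"; then
    echo "FAIL: possible secret in $1"
    return 1
  fi
  echo "OK: $1"
}

# A clean values file passes the check
cat > values-prod.yaml <<'EOF'
image:
  tag: "v1.0.0"
replicaCount: 3
EOF

check_secrets values-prod.yaml
```

Run it against every values file in the chart as a lint step, before helm upgrade is ever invoked.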

⌨️ Hands-on

bash
# Lab: Simulate a CI/CD pipeline locally

# Step 1: Create chart with env-specific values
helm create cicdlab
cd cicdlab

# Step 2: Create per-environment values
cat > values-dev.yaml <<EOF
replicaCount: 1
image:
  tag: "dev-latest"
resources:
  requests:
    cpu: 50m
    memory: 64Mi
EOF

cat > values-prod.yaml <<EOF
replicaCount: 3
image:
  tag: "v1.0.0"
resources:
  requests:
    cpu: 250m
    memory: 256Mi
  limits:
    cpu: 500m
    memory: 512Mi
EOF

# Step 3: "CI" — lint and validate
helm lint . -f values-dev.yaml
helm lint . -f values-prod.yaml
helm template cicdlab . -f values-prod.yaml | grep "replicas:"

# Step 4: "CD" — deploy to dev
helm upgrade --install cicdlab . \
  -f values-dev.yaml \
  -n dev --create-namespace \
  --atomic --wait --timeout 2m

# Step 5: Verify
helm list -n dev
kubectl get deploy -n dev

# Step 6: "Promote" to prod
helm upgrade --install cicdlab . \
  -f values-prod.yaml \
  -n prod --create-namespace \
  --atomic --wait --timeout 3m

# Step 7: Run tests in prod
helm test cicdlab -n prod --logs

# Cleanup
helm uninstall cicdlab -n dev
helm uninstall cicdlab -n prod

🐛 Debugging Scenarios

Scenario 1: Pipeline deploy times out

bash
# "Error: timed out waiting for the condition"
# With --atomic, it auto-rolls back

# Root causes:
# 1. Image pull failure (wrong tag, auth issue)
kubectl get events -n production --sort-by='.lastTimestamp'

# 2. Readiness probe failing
kubectl describe pod myapp-xxx -n production

# 3. Resource quota exceeded
kubectl describe resourcequota -n production

# Fix: increase timeout, fix image tag, fix probes

Scenario 2: "another operation is in progress"

bash
# A previous deploy failed/hung, leaving the release locked

# Check release status
helm status myapp -n production

# If stuck in "pending-upgrade" or "pending-install":
helm rollback myapp 0 -n production    # Rollback to last good
# Or if that fails:
kubectl delete secret -l owner=helm,name=myapp,status=pending-upgrade -n production
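The status check can be automated so the pipeline fails fast instead of hanging on a locked release. A sketch (helm is stubbed to simulate a stuck release so the branching logic runs anywhere; remove the stub in a real pipeline):

```shell
# Stub simulating `helm status -o json` for a stuck release; remove for real use.
helm() { echo '{"info":{"status":"pending-upgrade"}}'; }

status=$(helm status myapp -n production -o json | grep -o '"status":"[^"]*"')
case "$status" in
  *pending*) echo "release is locked ($status): rollback or clear it before deploying" ;;
  *)         echo "release healthy: $status" ;;
esac
```

In production code you would parse the JSON with jq rather than grep, but the decision is the same: never start a new upgrade against a release in a pending state.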

Scenario 3: Helm diff shows unexpected changes

bash
# Install helm-diff plugin
helm plugin install https://github.com/databus23/helm-diff

# See what will change before deploying
helm diff upgrade myapp ./chart -f values-prod.yaml -n production

# Common causes of unexpected diff:
# 1. Using --reuse-values (inherits old values, misses new defaults)
# 2. Random generated values (passwords) regenerating
# 3. Chart version bump changed default values

# Fix: Always use explicit -f values.yaml instead of --reuse-values

🎯 Interview Questions

Beginner

Q: Why use 'upgrade --install' instead of separate install/upgrade?

It's idempotent — works whether the release exists or not. First run does install, subsequent runs do upgrade. Essential for CI/CD where the pipeline doesn't know the current state. Avoids "release already exists" or "release not found" errors.

Q: What does --atomic do?

If the upgrade fails (resources don't become ready, hooks fail), Helm automatically rolls back to the previous revision. Without it, a failed upgrade leaves the release in "failed" state. Critical for CI/CD — ensures the cluster is always in a known-good state.

Q: What is Helmfile?

A declarative tool for managing multiple Helm releases. Define all releases in a helmfile.yaml, then helmfile apply installs/upgrades everything. Supports environments, templating, and diff. Like docker-compose but for Helm releases.

Q: How does ArgoCD work with Helm?

ArgoCD watches a Git repo containing Helm chart + values. When Git changes (new values, new chart version), ArgoCD automatically runs helm template and applies the diff to the cluster. This is GitOps: Git is the source of truth for cluster state.

Q: Why should you never use ':latest' tag in CI/CD?

It's not immutable — :latest points to different images over time. You can't reproduce a deployment. Rollbacks won't work (rolling back to the same tag pulls a different image). Always use the commit SHA or a semver tag as the image tag.

Intermediate

Q: What is GitOps and how does Helm fit in?

GitOps uses Git as the single source of truth for infrastructure. A controller (ArgoCD, FluxCD) watches the repo and keeps the cluster in sync. Helm fits as the templating engine — ArgoCD runs helm template to generate manifests, then applies them. The chart and values live in Git; CI updates the image tag; ArgoCD deploys.

Q: What is the helm-diff plugin and why is it important?

helm diff upgrade shows what will change before applying. Like terraform plan for Helm. Critical for review processes — teams can see the exact resource changes in a PR before approving deployment. Prevents surprises from chart upgrades or value changes.

Q: How do you handle secrets in Helm CI/CD?

Never store secrets in Git or values files. Options: 1) External Secrets Operator (syncs from Vault/AWS Secrets Manager). 2) Sealed Secrets (encrypted in Git, decrypted in-cluster). 3) Inject via CI/CD: --set dbPassword=$DB_PASSWORD from pipeline secrets. 4) Helm Secrets plugin (SOPS encryption).

Q: Why is --reuse-values problematic in CI/CD?

It merges old release values with new. If you add new defaults in values.yaml, they're ignored (old values take precedence). New chart features requiring new config won't work. Always use explicit -f values.yaml or --reset-then-reuse-values (Helm 3.14+) instead.

Q: How do you implement canary deployments with Helm?

Options: 1) Two releases: stable (90% traffic) and canary (10%), managed by Istio/Linkerd traffic splitting. 2) Argo Rollouts: replaces Deployment with Rollout resource, automates canary steps. 3) Flagger: watches Helm releases and automates progressive delivery with metrics analysis.

Scenario-Based

Q: Your CI/CD pipeline deploys, but the app is broken. How do you handle it?

If using --atomic: automatic rollback happened. If not: helm rollback myapp <prev-revision> -n prod. Check what changed: helm diff revision myapp N-1 N. Review logs: kubectl logs, events: kubectl get events. Fix the issue, push to Git, let pipeline redeploy.

Q: Two team members deploy simultaneously and one gets "another operation in progress." Fix?

Implement deploy locking: 1) CI/CD level — only one deploy job runs at a time (GitHub Actions concurrency groups). 2) If stuck: check helm status, rollback or delete the pending release secret. 3) Long-term: use ArgoCD (single controller, no concurrent deploys).

Q: You need to deploy the same app to 50 clusters. How?

Options: 1) ArgoCD ApplicationSet — generates an Application per cluster from a template. 2) Helmfile with cluster-specific values files. 3) CI/CD matrix strategy — parallel deploy jobs per cluster. All use the same chart, different values (cluster endpoint, region, replicas).

Q: Your chart upgrade requires a manual approval before production. How?

1) GitHub Actions: use environment: production with required reviewers. 2) ArgoCD: disable auto-sync for prod, require manual sync click. 3) Add a helm diff step that posts the diff to a PR for review. 4) Use deployment gates in Azure DevOps or similar.

Q: How do you ensure chart changes don't break existing deployments?

1) CI: helm lint, helm template, kubeconform, chart-testing (ct). 2) Deploy to staging first (same chart, similar values). 3) helm diff in PR for review. 4) helm test after deploy. 5) Feature flags via values (condition on new features). 6) Semver versioning — major version bump for breaking changes.

🌍 Real-World Use Case

A fintech company's deployment pipeline:

📝 Summary
