Intermediate Lesson 7 of 14

ACR Integration & Image Management

Connect Azure Container Registry to AKS, build images in the cloud, manage lifecycle policies, and secure your container supply chain.

🧒 Simple Explanation (ELI5)

Think of ACR as a private warehouse for your container images. Instead of storing your images on a public shelf (Docker Hub) where anyone can see them, ACR keeps them locked in your own building. When your AKS cluster needs an image, it shows its ID badge (managed identity) to the warehouse door, and only your cluster is allowed in. You can even build the boxes (images) right inside the warehouse with az acr build — no need to bring your own construction tools (Docker Desktop).

🔧 Technical Explanation

ACR SKU Comparison

| Feature | Basic | Standard | Premium |
| --- | --- | --- | --- |
| Storage | 10 GB | 100 GB | 500 GB (expandable) |
| Throughput (ReadOps/min) | 1,000 | 3,000 | 10,000 |
| Webhooks | 2 | 10 | 500 |
| Geo-replication | No | No | Yes |
| Private Link / Endpoints | No | No | Yes |
| Content Trust (signing) | No | No | Yes |
| Customer-managed keys | No | No | Yes |
| Dedicated agent pools | No | No | Yes |
| Approximate cost/month | $5 | $20 | $50+ |
| Best for | Dev/test | Production (single region) | Enterprise, multi-region |
💡
SKU Tip

Start with Standard for production. Upgrade to Premium when you need geo-replication, private endpoints, or image signing. You can upgrade SKU in-place with no downtime.
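A minimal sketch of an in-place SKU upgrade; the registry name is a placeholder and the commands assume the Azure CLI is already logged in:

```shell
# Upgrade an existing registry from Standard to Premium in place
az acr update --name myacr --sku Premium

# Confirm the new SKU
az acr show --name myacr --query sku.name -o tsv
```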

AKS ↔ ACR Integration Methods

| Method | Authentication | Setup Effort | Recommendation |
| --- | --- | --- | --- |
| az aks update --attach-acr | Managed Identity (AcrPull role) | One command | Best practice ✅ |
| imagePullSecret (Service Principal) | Client ID + Secret in K8s Secret | Multiple steps, secret rotation needed | Legacy — avoid for new clusters |
| ACR Admin Account | Username + password | Simple but insecure | Never in production ❌ |
Recommended: Managed Identity Attach Flow

Developer → az acr build → ACR Registry → AKS pulls via Managed Identity → Pod Running
⚠️
Never Use Admin Account in Production

The ACR admin account has full push+pull access and cannot be scoped. It's a shared credential with no audit trail. Use managed identity via --attach-acr — it's automatic, RBAC-scoped, and doesn't require secret management.
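A hedged sketch of the recommended setup, with the resource group, cluster, and registry names as placeholders: turn the admin account off (it is disabled by default on new registries) and attach the registry via managed identity.

```shell
# Disable the shared admin credential
az acr update --name myacr --admin-enabled false

# Grant the cluster's kubelet identity AcrPull on the registry
az aks update \
  --resource-group myResourceGroup \
  --name myAKSCluster \
  --attach-acr myacr
```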

Building Images: Cloud Build vs Local Build

| Approach | Command | Docker Required Locally? | Best For |
| --- | --- | --- | --- |
| ACR Cloud Build | az acr build -r myacr -t myapp:v1 . | No | CI/CD, developers without Docker Desktop |
| Local Docker Build + Push | docker build → az acr login → docker push | Yes | Local testing, multi-arch builds |
bash
# Cloud build — no Docker needed on your machine
az acr build \
  --registry myacr \
  --image myapp:v1.2.3 \
  --image myapp:latest \
  --file ./Dockerfile \
  .

# The build context (.) is uploaded to ACR
# ACR runs the Docker build on Azure-hosted compute
# Final image is stored in myacr.azurecr.io/myapp:v1.2.3

ACR Tasks — Automated Builds

ACR Tasks automate image builds on git commit, base image updates, or on a schedule.

bash
# Create an ACR Task that builds on every commit to 'main'
az acr task create \
  --registry myacr \
  --name build-myapp \
  --image myapp:{{.Run.ID}} \
  --context https://github.com/myorg/myapp.git#main \
  --file Dockerfile \
  --git-access-token $GITHUB_PAT \
  --commit-trigger-enabled true \
  --base-image-trigger-enabled true

# This triggers a build when:
# 1. Code is pushed to 'main' branch
# 2. The base image (e.g., node:20-alpine) is updated in Docker Hub

Image Retention & Lifecycle

bash
# Set retention policy — auto-delete untagged manifests after 7 days
az acr config retention update \
  --registry myacr \
  --status enabled \
  --days 7 \
  --type UntaggedManifests

# Manually purge old images — keep only the last 10 tags
az acr run \
  --registry myacr \
  --cmd "acr purge --filter 'myapp:.*' --ago 30d --keep 10 --untagged" \
  /dev/null

Premium Features

Geo-Replication

Replicate your registry to multiple Azure regions. AKS clusters pull from the nearest replica, reducing latency and providing disaster recovery.

bash
# Add replicas to West Europe and Southeast Asia
az acr replication create --registry myacr --location westeurope
az acr replication create --registry myacr --location southeastasia

# Verify replications
az acr replication list --registry myacr -o table
# NAME            LOCATION         STATUS
# eastus          eastus           Ready
# westeurope      westeurope       Ready
# southeastasia   southeastasia    Ready

Private Endpoints

Prevent public access to ACR by exposing it only through a private endpoint in your VNet. AKS nodes communicate with ACR entirely over the Azure backbone — no internet traversal.

Private ACR + Private AKS

If you use both a private AKS cluster and a private ACR, ensure VNet peering or a shared VNet is configured so AKS nodes can reach the ACR private endpoint. Also link the ACR private DNS zone to the AKS VNet.
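A sketch of wiring up a private endpoint, assuming an existing VNet/subnet and a Premium registry; every resource name below is a placeholder:

```shell
# Look up the registry's resource ID
REGISTRY_ID=$(az acr show --name myacr --query id -o tsv)

# Create the private endpoint in the VNet (group-id 'registry' targets ACR)
az network private-endpoint create \
  --resource-group myResourceGroup \
  --name myacr-pe \
  --vnet-name myVNet \
  --subnet mySubnet \
  --private-connection-resource-id $REGISTRY_ID \
  --group-id registry \
  --connection-name myacr-pe-conn

# Private DNS zone so myacr.azurecr.io resolves to the private IP
az network private-dns zone create \
  --resource-group myResourceGroup \
  --name privatelink.azurecr.io
az network private-dns link vnet create \
  --resource-group myResourceGroup \
  --zone-name privatelink.azurecr.io \
  --name myacr-dns-link \
  --virtual-network myVNet \
  --registration-enabled false

# Lock out public traffic once private access is verified
az acr update --name myacr --public-network-enabled false
```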

Content Trust (Image Signing)

Content trust ensures only signed images can be deployed. When enabled, pushers sign images with a private key, and consumers verify the signature before pulling.
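One possible enablement flow, assuming a Premium registry (the registry and image names are placeholders): enable the feature server-side, then have pushers opt in with Docker's content trust flag.

```shell
# Enable content trust on the registry (Premium SKU required)
az acr config content-trust update --registry myacr --status enabled

# Pushers opt in; Docker prompts for signing keys on the first signed push
export DOCKER_CONTENT_TRUST=1
docker push myacr.azurecr.io/myapp:v1.0.0
```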

Vulnerability Scanning

Microsoft Defender for Containers automatically scans images pushed to ACR for known CVEs. It generates security findings visible in Microsoft Defender for Cloud (formerly Azure Security Center) and can block deployment of vulnerable images via Azure Policy.

| Feature | Description |
| --- | --- |
| Push-time scanning | Scans every image immediately when pushed to ACR |
| Continuous rescanning | Rescans images weekly to catch newly disclosed CVEs |
| Runtime scanning | Scans running container images in AKS clusters |
| Azure Policy integration | Blocks pods from using images with Critical/High CVEs |
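Enablement is a subscription-level setting; a minimal sketch (assumes sufficient permissions on the subscription):

```shell
# Enable Microsoft Defender for Containers for the subscription
az security pricing create --name Containers --tier Standard

# Verify the plan is active
az security pricing show --name Containers --query pricingTier -o tsv
```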

Helm Chart Storage (OCI Artifacts)

ACR supports OCI artifacts, meaning you can store Helm charts alongside container images.

bash
# Authenticate Helm to ACR with an access token (Helm 3.8+;
# the all-zeros GUID is the documented token username)
TOKEN=$(az acr login --name myacr --expose-token --output tsv --query accessToken)
helm registry login myacr.azurecr.io \
  --username 00000000-0000-0000-0000-000000000000 \
  --password $TOKEN

# Push a Helm chart to ACR as an OCI artifact
helm push mychart-0.1.0.tgz oci://myacr.azurecr.io/helm

# Install directly from ACR
helm install myrelease oci://myacr.azurecr.io/helm/mychart --version 0.1.0

Import Images from External Registries

For air-gapped or compliance-restricted environments, import public images into your private ACR.

bash
# Import NGINX from Docker Hub into ACR
az acr import \
  --name myacr \
  --source docker.io/library/nginx:1.27-alpine \
  --image nginx:1.27-alpine

# Import from another ACR (cross-subscription)
az acr import \
  --name myacr \
  --source otheracr.azurecr.io/myapp:latest \
  --image myapp:latest
💡
Docker Hub Rate Limits

Docker Hub limits anonymous pulls to 100/6h and free authenticated pulls to 200/6h. Use az acr import to pull public images into your ACR once, then reference them from ACR in your deployments. This eliminates rate limit issues and speeds up pulls.

⌨️ Hands-on

Step 1: Create an ACR

bash
# Create a Standard SKU ACR
az acr create \
  --resource-group myResourceGroup \
  --name myacr2025 \
  --sku Standard \
  --location eastus

# Verify creation
az acr show --name myacr2025 --query "{name:name, sku:sku.name, loginServer:loginServer}" -o table
# Name        Sku       LoginServer
# ----------  --------  -----------------------
# myacr2025   Standard  myacr2025.azurecr.io

Step 2: Attach ACR to AKS

bash
# Attach ACR to AKS using managed identity (best practice)
az aks update \
  --resource-group myResourceGroup \
  --name myAKSCluster \
  --attach-acr myacr2025

# Verify the attachment — check that AcrPull role is assigned
az role assignment list \
  --scope $(az acr show -n myacr2025 --query id -o tsv) \
  --query "[?roleDefinitionName=='AcrPull'].{principal:principalName, role:roleDefinitionName}" \
  -o table

# Principal                              Role
# -------------------------------------  --------
# abc12345-xxxx-xxxx-xxxx-xxxxxxxxxxxx   AcrPull

Step 3: Build an Image with ACR Cloud Build

bash
# Create a simple app to build
mkdir myapp && cd myapp

cat > Dockerfile <<'EOF'
FROM node:20-alpine
WORKDIR /app
COPY package.json .
RUN npm install --production
COPY . .
EXPOSE 3000
CMD ["node", "server.js"]
EOF

cat > server.js <<'EOF'
const http = require('http');
const server = http.createServer((req, res) => {
  res.writeHead(200, {'Content-Type': 'application/json'});
  res.end(JSON.stringify({status: 'healthy', version: 'v1.0.0'}));
});
server.listen(3000, () => console.log('Server running on port 3000'));
EOF

cat > package.json <<'EOF'
{"name":"myapp","version":"1.0.0","main":"server.js"}
EOF

# Build in the cloud — no Docker Desktop needed
az acr build \
  --registry myacr2025 \
  --image myapp:v1.0.0 \
  --image myapp:latest \
  .

# Verify the image exists in ACR
az acr repository show-tags --name myacr2025 --repository myapp -o table
# Result
# --------
# latest
# v1.0.0

Step 4: Deploy to AKS

yaml
# deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
spec:
  replicas: 3
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
        - name: myapp
          image: myacr2025.azurecr.io/myapp:v1.0.0
          ports:
            - containerPort: 3000
          resources:
            requests:
              cpu: 100m
              memory: 128Mi
            limits:
              cpu: 250m
              memory: 256Mi
bash
kubectl apply -f deployment.yaml

# Verify pods are running and pulled the image successfully
kubectl get pods -l app=myapp
# NAME                     READY   STATUS    RESTARTS   AGE
# myapp-6f8b4d5c9-2kj7x   1/1     Running   0          12s
# myapp-6f8b4d5c9-8mn3p   1/1     Running   0          12s
# myapp-6f8b4d5c9-q4r5t   1/1     Running   0          12s

# Check the image used by a pod
kubectl describe pod myapp-6f8b4d5c9-2kj7x | grep "Image:"
#   Image:         myacr2025.azurecr.io/myapp:v1.0.0

Step 5: Verify ACR Pull Works End-to-End

bash
# Check events for ImagePull success
kubectl get events --field-selector reason=Pulled --sort-by='.lastTimestamp'
# Successfully pulled image "myacr2025.azurecr.io/myapp:v1.0.0" in 1.2s

# Test the running application
kubectl port-forward deployment/myapp 3000:3000 &
curl http://localhost:3000
# {"status":"healthy","version":"v1.0.0"}

🐛 Debugging Scenarios

Scenario 1: ImagePullBackOff — Unauthorized

Symptom: Pods are stuck in ImagePullBackOff with error unauthorized: authentication required.

bash
# Step 1: Check the exact error
kubectl describe pod myapp-xxxx
# Events:
#   Warning  Failed  kubelet  Failed to pull image "myacr2025.azurecr.io/myapp:v1.0.0":
#   unauthorized: authentication required, visit https://aka.ms/acr/authorization

# Step 2: Verify AKS-ACR attachment
az aks check-acr \
  --resource-group myResourceGroup \
  --name myAKSCluster \
  --acr myacr2025.azurecr.io
# If this fails, the attachment is broken

# Step 3: Check if AcrPull role assignment exists
ACR_ID=$(az acr show -n myacr2025 --query id -o tsv)
KUBELET_ID=$(az aks show -g myResourceGroup -n myAKSCluster \
  --query "identityProfile.kubeletidentity.objectId" -o tsv)

az role assignment list --scope $ACR_ID --assignee $KUBELET_ID -o table

# Step 4: If role assignment is missing, re-attach
az aks update \
  --resource-group myResourceGroup \
  --name myAKSCluster \
  --attach-acr myacr2025

# Step 5: If using private ACR, verify the private endpoint is reachable
# from the AKS nodes (VNet peering, DNS resolution)
kubectl run acr-test --image=busybox:1.36 --rm -it --restart=Never -- \
  nslookup myacr2025.azurecr.io
# Should resolve to a private IP (10.x.x.x), not a public IP

# Step 6: After fixing, delete the failing pod to trigger a fresh pull
kubectl delete pod myapp-xxxx

Scenario 2: Image Not Found

Symptom: Error: manifest for myacr2025.azurecr.io/myapp:v2.0.0 not found.

bash
# Step 1: Verify the exact image name and tag in ACR
az acr repository list --name myacr2025 -o table
# Result
# ------
# myapp
# nginx

az acr repository show-tags --name myacr2025 --repository myapp -o table
# Result
# --------
# latest
# v1.0.0
# (v2.0.0 is NOT listed — the image was never pushed)

# Step 2: Check for typos in the deployment spec
kubectl get deployment myapp -o jsonpath='{.spec.template.spec.containers[0].image}'
# myacr2025.azurecr.io/myapp:v2.0.0  ← This tag doesn't exist

# Step 3: Common mistakes to check:
# - Wrong registry name (myacr vs myacr2025)
# - Wrong repository name (my-app vs myapp)
# - Tag was overwritten by retention policy
# - Image was pushed to a different ACR

# Step 4: Fix the image tag
kubectl set image deployment/myapp myapp=myacr2025.azurecr.io/myapp:v1.0.0

# Step 5: If the tag was deleted by retention, rebuild
az acr build --registry myacr2025 --image myapp:v2.0.0 .

Scenario 3: ACR Build Failed

Symptom: az acr build returns an error and the image is not in the registry.

bash
# Step 1: Check the build log output — az acr build streams logs
# Common errors:

# ERROR: "unable to prepare context: unable to evaluate symlinks"
# → The build context (.) doesn't contain the Dockerfile
# Fix: Specify the correct path
az acr build --registry myacr2025 --image myapp:v1 --file ./src/Dockerfile ./src

# ERROR: "COPY failed: file not found in build context"
# → The file referenced in COPY doesn't exist in the uploaded context
# Fix: Ensure your .dockerignore isn't excluding needed files
cat .dockerignore  # Check for overly broad exclude patterns

# ERROR: "failed to fetch anonymous token: 403 Forbidden"
# → ACR public access is disabled and you're running from outside the VNet
# Fix: Use a self-hosted agent inside the VNet, or temporarily enable public access
az acr update --name myacr2025 --public-network-enabled true

# Step 2: Check past build runs
az acr task list-runs --registry myacr2025 -o table --top 5
# RUN ID    TASK           STATUS    TRIGGER    STARTED
# -------   ----           -------   -------    -------
# cb15      build-myapp    Failed    Manual     2025-04-20T10:30:00Z

# Step 3: Get detailed logs for a specific run
az acr task logs --registry myacr2025 --run-id cb15

# Step 4: Validate Dockerfile locally before cloud build
# (if Docker is available)
docker build --no-cache -t test:local .
💡
Build Context Size

ACR cloud build uploads your entire build context. Add a .dockerignore to exclude node_modules, .git, and test files — this can reduce upload time from minutes to seconds for large repos.

🎯 Interview Questions

Beginner

Q: What is Azure Container Registry (ACR)?

ACR is a managed, private Docker registry service on Azure. It stores and manages container images and OCI artifacts (including Helm charts). ACR integrates with AKS, Azure DevOps, and GitHub Actions for CI/CD workflows. Unlike Docker Hub, ACR is private by default, supports Azure Active Directory authentication, and runs inside your Azure subscription for compliance and performance.

Q: What are the ACR SKU tiers?

Basic: 10 GB storage, cost-optimized for dev/test. Standard: 100 GB, higher throughput, suitable for production single-region. Premium: 500 GB+, adds geo-replication, private endpoints, content trust, customer-managed keys, and dedicated build agent pools. SKUs can be upgraded in-place without downtime.

Q: How do you attach ACR to AKS for image pulls?

The recommended approach is az aks update --attach-acr <acr-name>. This grants the AKS kubelet managed identity the AcrPull role on the ACR. The cluster can then pull images without secrets or manual configuration. Alternatively, you can use imagePullSecrets with a service principal, but this requires managing secret rotation and is considered legacy.

Q: What is the difference between az acr build and docker build + docker push?

az acr build uploads the build context to ACR and builds the image on Azure-hosted compute — no local Docker installation required. docker build + docker push builds locally and then pushes to ACR, requiring Docker Desktop and az acr login for authentication. Cloud build is preferred for CI/CD and environments without Docker access.

Q: Why should you avoid using the ACR admin account in production?

The admin account is a single shared credential with full push and pull access to the entire registry. It has no audit trail per user, cannot be scoped to specific repositories, and if compromised gives an attacker complete access. Managed identities (AcrPull role) provide per-cluster access with automatic credential management and RBAC-scoped permissions.

Intermediate

Q: How does ACR geo-replication work and when would you use it?

ACR Premium supports geo-replication — you add replica locations and ACR automatically syncs images to each region. AKS clusters pull from the nearest replica via Azure Traffic Manager. Use it when: 1) You have AKS clusters in multiple regions and want fast, local image pulls. 2) You need registry-level disaster recovery. 3) You want to comply with data residency requirements. A single docker push writes to all replicas. Each replica acts as a full registry with its own DNS endpoint.

Q: How do ACR Tasks differ from building images in a CI pipeline?

ACR Tasks are Azure-native build agents that run inside ACR itself. They can be triggered by git commits, base image updates, or schedules — without needing an external CI system. Advantages: no Docker-in-Docker issues, no build agent maintenance, faster because the build runs close to the registry. However, CI pipelines (Azure DevOps, GitHub Actions) offer richer workflows, testing, approvals, and multi-stage deployments. ACR Tasks are best for simple image builds; CI pipelines for full delivery workflows.

Q: How would you secure ACR with private endpoints?

1) Upgrade to Premium SKU. 2) Create a private endpoint in your VNet targeting the ACR. 3) Create a private DNS zone (privatelink.azurecr.io) and link it to your VNet. 4) Disable public network access: az acr update --public-network-enabled false. 5) AKS nodes pull images over the private endpoint via the Azure backbone. 6) For az acr build, use a dedicated agent pool inside the VNet, since cloud build agents need network access to ACR.

Q: How do you handle vulnerability scanning for container images?

Enable Microsoft Defender for Containers, which scans images: at push time (immediate), weekly (for new CVEs), and at runtime (running containers in AKS). Findings appear in Azure Security Center with severity ratings. Use Azure Policy to block deployment of images with Critical/High CVEs. For a shift-left approach, integrate Trivy or Snyk scanning in CI pipelines before pushing to ACR. Premium ACR also supports a quarantine pattern where newly pushed images are quarantined until scan results are clean.

Q: What is az acr import and when should you use it?

az acr import copies images from an external registry (Docker Hub, MCR, another ACR) into your ACR without pulling locally first. Use it for: 1) Avoiding Docker Hub rate limits by caching base images in ACR. 2) Air-gapped environments where AKS nodes can't reach public registries. 3) Cross-subscription or cross-region image replication between ACRs. 4) Seeding a new ACR with commonly used images. Import is a server-side copy — no image data passes through your client.

Scenario-Based

Q: After a cluster upgrade, all new pods fail with ImagePullBackOff but old pods still run fine. What happened?

The cluster upgrade likely rotated the kubelet managed identity or the role assignment was lost during the upgrade. Existing pods aren't affected (images are cached on the node), but new scheduling requires fresh pulls. Fix: 1) Run az aks check-acr to verify connectivity. 2) Check if the AcrPull role assignment still exists on the ACR. 3) Re-attach with az aks update --attach-acr. 4) If using imagePullSecrets, verify the secret still exists in the namespace and the service principal hasn't expired.

Q: Your team pushes 50+ images per day and ACR storage is growing fast. How do you manage image lifecycle?

1) Enable retention policy to auto-delete untagged manifests after 7 days: az acr config retention update --days 7. 2) Schedule acr purge tasks to keep only the last N tags per repository: --filter 'myapp:.*' --ago 30d --keep 10. 3) Use immutable tags for production releases so they can't be overwritten. 4) Implement a tagging strategy: :latest for dev, :v1.2.3 for releases, :sha-abc1234 for CI — purge policies target regex patterns. 5) Monitor storage with Azure Monitor alerts.
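Locking a production release tag so retention and purge jobs cannot touch it can be sketched as follows (registry and tag are placeholders):

```shell
# Make a specific release tag read-only: no overwrites, no deletes
az acr repository update \
  --name myacr \
  --image myapp:v1.2.3 \
  --write-enabled false \
  --delete-enabled false
```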

Q: You have AKS clusters in East US, West Europe, and Southeast Asia. Image pulls from a single-region ACR are slow for non-US clusters. How do you fix this?

Upgrade to ACR Premium and enable geo-replication: az acr replication create --location westeurope and --location southeastasia. Each AKS cluster pulls from the closest replica automatically. A single push replicates to all regions. This reduces pull latency from seconds to milliseconds and provides registry-level DR. The trade-off is higher cost (Premium SKU + per-replica storage), but for multi-region production deployments, it's essential.

Q: A developer accidentally pushed an image with hardcoded database connection strings. How do you respond and prevent this?

Immediate response: 1) Delete the compromised tag: az acr repository delete --name myacr --image myapp:compromised-tag. 2) Rotate the exposed database credentials immediately. 3) Check if any pods pulled and ran that image. Prevention: 1) Add secret scanning in CI (GitHub Advanced Security, GitLeaks). 2) Use multi-stage Docker builds — build stage has source, final stage only has binaries. 3) Never COPY .env files — use .dockerignore. 4) Enable ACR content trust so only signed images can be deployed. 5) Use Azure Key Vault with CSI driver for secrets in AKS — never bake secrets into images.
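The multi-stage pattern from the prevention list might look like this hypothetical Dockerfile sketch, where only runtime artifacts reach the final image:

```dockerfile
# Build stage: has source, dev tooling, and any stray local files
FROM node:20-alpine AS build
WORKDIR /app
COPY package.json .
RUN npm install --production
COPY server.js .

# Final stage: copy only the runtime artifacts across
FROM node:20-alpine
WORKDIR /app
COPY --from=build /app /app
EXPOSE 3000
CMD ["node", "server.js"]
```

Combined with a .dockerignore that excludes .env and credential files, nothing outside the explicit COPY lines can leak into the shipped image.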

Q: Your CI pipeline uses az acr build but builds are failing because ACR public access is disabled. The pipeline runs on GitHub-hosted runners. How do you solve this?

Options: 1) Best: Use a self-hosted runner inside the VNet (or an Azure DevOps agent) that can reach the ACR private endpoint. 2) Use ACR dedicated agent pools (Premium feature) — build agents run inside the registry's own infrastructure, bypassing network restrictions. 3) Temporarily allow the GitHub runner IP range in ACR network rules (not recommended for production). 4) Use a service connection from the runner to Azure, and invoke az acr build with a dedicated build subnet. The self-hosted runner approach is most secure and reliable.

🌍 Real-World Use Case

Multi-Region Deployment with Geo-Replicated ACR Premium

A SaaS company serves customers across three regions: US, Europe, and APAC. Their image pipeline:

📝 Summary