Beginner · Lesson 2 of 16

🏗️ Docker Architecture, Images and Containers

Understand Docker's client-daemon-registry model, how image layers are built and cached, and why this architecture makes builds fast and images portable.

🧒 Simple Explanation (ELI5)

Docker is like a restaurant system. The Docker CLI is the customer placing orders. The Docker Daemon is the kitchen that does all the actual cooking. The Registry (like Docker Hub) is the ingredient supplier. You just place your order — the kitchen handles everything.

💡
Why layers matter

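Docker's build cache keys each step on the instruction and its inputs; for COPY and ADD that includes a checksum of the copied files. If nothing has changed, Docker reuses the existing layer instead of rebuilding it. This makes rebuilds fast: a code-only change re-runs only the final steps, often in a matter of seconds.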
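To make the caching idea concrete, the sketch below imitates it outside Docker: it derives a key from an instruction string plus the bytes of the files that instruction reads. It is a simplified illustration, not Docker's actual algorithm. `layer_key` is a hypothetical helper, and the script assumes `sha256sum` is available (standard on Linux).

```shell
#!/bin/sh
# Illustration only: mimic a content-based cache key.
# layer_key is a hypothetical helper, not part of Docker.
# Usage: layer_key "<instruction>" <file>...
layer_key() {
  instruction="$1"
  shift
  # Hash the instruction text together with the bytes of every input file.
  { printf '%s\n' "$instruction"; cat "$@"; } | sha256sum | cut -d' ' -f1
}

workdir=$(mktemp -d)
printf '{"name":"demo","version":"1.0.0"}\n' > "$workdir/package.json"

key_before=$(layer_key "COPY package.json ./" "$workdir/package.json")
key_same=$(layer_key "COPY package.json ./" "$workdir/package.json")

# Change the file content: the key changes, so a cached layer would no longer match.
printf '{"name":"demo","version":"1.0.1"}\n' > "$workdir/package.json"
key_after=$(layer_key "COPY package.json ./" "$workdir/package.json")

[ "$key_before" = "$key_same" ] && echo "unchanged input: cache hit"
[ "$key_before" != "$key_after" ] && echo "changed input: cache miss"

rm -rf "$workdir"
```

Identical inputs always give the same key, which is why an untouched dependency step can be reused while an edited source file forces a rebuild of the step that copies it.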

🔧 Technical Architecture

Docker Client-Server Architecture

Docker CLI (docker run / build / push)
  ↓ REST API
Docker Daemon (dockerd): builds, runs, manages objects
  ↕ pull/push
Registry (Docker Hub / ACR)
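Because the CLI is only a client of the daemon's REST API, the same requests can be sent by hand. The sketch below does that with curl over the Unix socket. `docker_api_get` is a hypothetical helper name, and `/var/run/docker.sock` is the usual default socket path on Linux; it may differ on your system.

```shell
#!/bin/sh
# Sketch: talk to the Docker daemon's REST API directly, bypassing the CLI.
# docker_api_get is a hypothetical helper; the default socket path is an assumption.
# Usage: docker_api_get <api-path> [socket-path]
docker_api_get() {
  local path="$1"
  local socket="${2:-/var/run/docker.sock}"
  if [ ! -S "$socket" ]; then
    echo "no Docker socket at $socket" >&2
    return 1
  fi
  # The host part of the URL is ignored when curl connects through a Unix socket.
  curl --silent --fail --unix-socket "$socket" "http://localhost$path"
}

# Example calls (require a running daemon and permission to use the socket):
#   docker_api_get /version           # similar to what `docker version` requests
#   docker_api_get /containers/json   # similar to what `docker ps` requests
```

Both calls return JSON, the same data the CLI formats into tables.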

Image Layer Stack

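How Layers Stack (top = newest, bottom = base)
Writable Container Layer (runtime, added when a container starts)
COPY . /app (your source code)
RUN npm install (dependencies, cached while the package files are unchanged)
COPY package.json (dependency manifest)
FROM node:18-alpine (base image)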
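The read-only part of this stack can be listed for any local image. The sketch below wraps `docker image inspect` with a Go template that prints one layer digest per line. `image_layer_digests` is a hypothetical helper name; the writable container layer does not appear because it belongs to a running container, not to the image.

```shell
#!/bin/sh
# Sketch: print the content digests of an image's read-only layers.
# image_layer_digests is a hypothetical helper name.
# Usage: image_layer_digests <image>
image_layer_digests() {
  # RootFS.Layers holds the layer digests, listed from the base layer upward.
  docker image inspect \
    --format '{{range .RootFS.Layers}}{{println .}}{{end}}' \
    "$1"
}

# Example (requires Docker and a locally available image):
#   image_layer_digests nginx:latest
```

Two images built from the same base share their leading digests, which is how Docker avoids storing or pulling those layers twice.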

💻 Architecture Inspection Commands

bash
# Show all image layers and sizes
docker image history nginx:latest

# Full JSON metadata for an image (env vars, entrypoint, ports, etc.)
docker image inspect nginx:latest

# Show disk usage: images, containers, volumes, build cache
docker system df

# Docker daemon info: version, storage driver, runtime
docker info

# Prune ALL unused images (reclaim disk space)
docker image prune -a

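# Prune unused objects: stopped containers, unused networks, dangling images, build cache
# (add --volumes to also remove unused volumes, -a to remove all unused images)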
docker system prune

🌍 Real-World Use Case

In a CI/CD pipeline, the base image and dependencies rarely change. Only the application code changes on every commit. By ordering Dockerfile instructions correctly (deps first, code last), the npm install layer stays cached for most builds (around 95% in this scenario), saving 2–3 minutes per pipeline run, provided the runner has access to a warm cache.

🐛 Debugging Scenario

Problem: Builds are slow — 3+ minutes even when changing only one line of code.

dockerfile
# BAD - copying everything first invalidates the npm install cache on every code change
FROM node:18-alpine
COPY . .
RUN npm install

# GOOD - copy only the package files first, install deps (cached), then copy code
# (npm ci needs package-lock.json; package*.json matches both files)
FROM node:18-alpine
WORKDIR /app
COPY package*.json ./
RUN npm ci --omit=dev             # cached until package.json or package-lock.json changes
COPY . .                          # only this layer rebuilds on code changes

🎯 Interview Questions

What is the Docker daemon and what does it do?

The Docker daemon (dockerd) is the background service that manages all Docker objects: images, containers, volumes, and networks. The Docker CLI communicates with it over a REST API (typically a Unix socket on Linux or a named pipe on Windows). The daemon does the actual work: pulling images, building image layers, running containers, and managing their lifecycle.

Explain Docker image layers and why their order matters.

Each Dockerfile instruction that modifies the filesystem creates a new read-only layer identified by a content hash. Layers are stacked bottom-up. Changing any layer invalidates all layers above it in the cache. Order instructions from least-changing (base image, package installs) to most-changing (application source code) to maximize cache hits and minimize rebuild time.
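The invalidation rule can be sketched as a small simulation: given the ordered build steps and the position of the first step whose inputs changed, everything before it is reused and everything from it onward is rebuilt. `cache_status` is a hypothetical helper that models the rule; it does not call Docker.

```shell
#!/bin/sh
# Illustration only: show which build steps stay cached after a change.
# cache_status is a hypothetical helper; it models the rule, not Docker itself.
# Usage: cache_status <index of first changed step> <step>...
cache_status() {
  local first_changed="$1"
  local index=1
  local step
  shift
  for step in "$@"; do
    if [ "$index" -lt "$first_changed" ]; then
      echo "CACHED   $step"
    else
      # The changed step and every step after it must be rebuilt.
      echo "REBUILD  $step"
    fi
    index=$((index + 1))
  done
}

# Source code changed: with deps-first ordering only the last step (index 4) is affected.
cache_status 4 \
  "FROM node:18-alpine" \
  "COPY package*.json ./" \
  "RUN npm ci" \
  "COPY . ."
```

With the code copied before the install step, the same change would land at index 2 and force the dependency install to run again.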

What is the difference between a Registry, Repository, and Tag?

A Registry is the server that stores images (Docker Hub, ACR). A Repository is a named collection of related images within a registry (e.g., myapp). A Tag is a label pointing to a specific image version within a repository (e.g., myapp:1.2.3). Full format: registry/repository:tag — e.g., myacr.azurecr.io/myapp:1.2.3.
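The registry/repository:tag format can be taken apart with plain shell string operations. The sketch below is a simplified parser: `parse_image_ref` is a hypothetical helper, it treats the first path component as a registry only when it looks like a host name (contains a dot or a port, or is `localhost`), falls back to `docker.io` and `latest` otherwise, and does not handle digest references (`@sha256:`).

```shell
#!/bin/sh
# Sketch: split an image reference into registry, repository, and tag.
# parse_image_ref is a hypothetical helper; digest references are not handled.
# Usage: parse_image_ref <reference>
parse_image_ref() {
  local ref="$1"
  local registry repository tag first last
  # Tag: text after a ':' in the last path component (so registry ports are not mistaken for tags).
  last="${ref##*/}"
  case "$last" in
    *:*) tag="${last##*:}"; ref="${ref%:*}" ;;
    *)   tag="latest" ;;
  esac
  # Registry: the first component, if there is one and it looks like a host name.
  first="${ref%%/*}"
  registry="docker.io"
  repository="$ref"
  case "$ref" in
    */*)
      case "$first" in
        *.*|*:*|localhost) registry="$first"; repository="${ref#*/}" ;;
      esac ;;
  esac
  echo "registry=$registry repository=$repository tag=$tag"
}

parse_image_ref myacr.azurecr.io/myapp:1.2.3
# prints: registry=myacr.azurecr.io repository=myapp tag=1.2.3
```

A bare name such as `nginx` comes back as `registry=docker.io repository=nginx tag=latest`, which mirrors the defaults Docker applies when parts of the reference are omitted.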

Scenario: Two developers share a Dockerfile. One gets a build cache hit, the other always gets a miss. Why?

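Docker's layer cache is local by default, so a developer who has never built the image starts with no cached layers. If the misses persist on every build, the build context probably differs between the two machines: COPY and ADD are keyed on file contents, so untracked or generated files that no .dockerignore excludes invalidate the cache from that step onward. Solutions for sharing cache: 1) Build with BuildKit's inline cache (--cache-to type=inline, or --build-arg BUILDKIT_INLINE_CACHE=1), push the image, and have other machines pass --cache-from registry/myapp:cache. 2) Export a dedicated cache with docker buildx build --cache-to type=registry,ref=registry/myapp:cache and import it with --cache-from type=registry,ref=registry/myapp:cache. 3) Use another BuildKit remote cache backend, such as S3. This is essential for CI/CD where each runner is ephemeral.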
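A registry-backed cache for ephemeral runners might be wired up as in the sketch below. `build_with_registry_cache` is a hypothetical wrapper, and the image and cache references in the example are placeholders. It assumes Docker with Buildx and push access to the registry; depending on your setup, exporting a registry cache may require a builder that uses the `docker-container` driver.

```shell
#!/bin/sh
# Sketch: build and push an image while importing and exporting a registry-backed cache.
# build_with_registry_cache is a hypothetical wrapper; all references are placeholders.
# Usage: build_with_registry_cache <image-ref> <cache-ref> [context-dir]
build_with_registry_cache() {
  local image="$1"
  local cache_ref="$2"
  local context="${3:-.}"
  # --cache-from imports earlier layers; --cache-to exports them for the next run.
  # mode=max also exports layers from intermediate stages, not only the final image.
  docker buildx build \
    --cache-from "type=registry,ref=$cache_ref" \
    --cache-to "type=registry,ref=$cache_ref,mode=max" \
    --tag "$image" \
    --push \
    "$context"
}

# Example:
#   build_with_registry_cache myacr.azurecr.io/myapp:1.2.3 myacr.azurecr.io/myapp:cache
```

Keeping the cache under its own tag separates it from release images, so a fresh runner can import the cache without pulling a full image first.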

📋 Summary