Artifacts & Caching
Speed up workflows with dependency caching and share build outputs between jobs using artifacts.
🧒 Simple Explanation (ELI5)
Imagine it's moving day.
- Artifacts are like labeled boxes you pack up and store in the warehouse. You put test reports, build outputs, and coverage files into boxes with clear labels. Later, other people (other jobs) can go to the warehouse, find the right box by name, and unpack it. You can even drive to the warehouse yourself (the GitHub UI) and download a box. The warehouse keeps your boxes for 90 days by default, then recycles them.
- Caching is like keeping your tools in a shed near the job site instead of driving to the hardware store every single morning. The first day you buy all the tools (install dependencies), but you leave them in the shed overnight. Next morning, you check the shed — if the tools are still there and your shopping list hasn't changed, you skip the hardware store entirely and go straight to work. If the list changed (lockfile updated), you make one quick trip and restock the shed.
Artifacts move outputs between jobs. Caching keeps dependencies between runs. Together, they turn a slow, repetitive workflow into a fast, efficient one.
📦 Artifacts
Artifacts let you persist data after a job completes and share it with other jobs in the same workflow — or download it later from the GitHub UI. They're ideal for build outputs, test reports, coverage files, and logs.
Upload & Download Actions
- `actions/upload-artifact@v4` — uploads files from a job to GitHub's artifact storage
- `actions/download-artifact@v4` — downloads previously uploaded artifacts into a job
What to Upload
- Test results and coverage reports (JUnit XML, lcov, Cobertura)
- Build binaries and compiled assets (`dist/`, `.jar`, `.exe`)
- Logs and diagnostic output for debugging failed runs
- Docker images exported as tarballs (`docker save`)
Retention & Limits
- Default retention: 90 days (configurable per-upload via `retention-days`)
- Org/repo settings can override the maximum retention period
- Artifacts count against your GitHub Actions storage quota
Sharing Between Jobs
The most common pattern: a build job uploads the compiled output, and a deploy job (which declares `needs: build`) downloads it. This avoids rebuilding the same code in every job.
```yaml
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npm ci && npm run build
      - uses: actions/upload-artifact@v4
        with:
          name: dist
          path: dist/
          retention-days: 7

  deploy:
    needs: build
    runs-on: ubuntu-latest
    steps:
      - uses: actions/download-artifact@v4
        with:
          name: dist
          path: dist   # v4 extracts into the working directory by default
      - run: ls -la dist/
```
Every artifact uploaded during a workflow run is available on the run's Summary page under the Artifacts section. Click the artifact name to download a ZIP file — handy for grabbing test reports or build outputs without re-running the workflow.
Within a single workflow run, each artifact must have a unique name. If two jobs upload artifacts with the same name, the second upload will fail. Use dynamic names (e.g., `test-results-${{ matrix.os }}`) when running matrix builds.
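A minimal matrix sketch of this pattern (job and path names are illustrative): each matrix leg uploads under a unique name, and a later job collects them all using the v4 `pattern` and `merge-multiple` inputs:

```yaml
jobs:
  test:
    strategy:
      matrix:
        os: [ubuntu-latest, windows-latest]
    runs-on: ${{ matrix.os }}
    steps:
      - uses: actions/checkout@v4
      - run: npm ci && npm test
      - uses: actions/upload-artifact@v4
        with:
          name: test-results-${{ matrix.os }}   # unique per matrix leg
          path: reports/

  report:
    needs: test
    runs-on: ubuntu-latest
    steps:
      - uses: actions/download-artifact@v4
        with:
          pattern: test-results-*    # match every leg's artifact
          merge-multiple: true       # combine them into one directory
          path: all-reports/
```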
⚡ Dependency Caching
Caching stores files (like node_modules/ or pip wheels) between workflow runs so you don't re-download them every time. The actions/cache@v4 action is the core building block.
How It Works
- `key` — unique identifier for the cache entry. If an exact match is found, the cache is restored
- `restore-keys` — fallback prefixes. If there is no exact match, the most recent cache matching a prefix is restored (a partial hit)
- `path` — the directory or files to cache
Cache Key Strategies
The best cache keys include the OS, the package manager name, and a hash of the lockfile. When the lockfile changes, a new cache is created. When it doesn't, you get an instant restore.
| Ecosystem | Cache Key Pattern |
|---|---|
| Node.js | ${{ runner.os }}-node-${{ hashFiles('**/package-lock.json') }} |
| Python | ${{ runner.os }}-pip-${{ hashFiles('**/requirements.txt') }} |
| Go | ${{ runner.os }}-go-${{ hashFiles('**/go.sum') }} |
| .NET | ${{ runner.os }}-nuget-${{ hashFiles('**/*.csproj') }} |
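As a sketch, the Python row of the table translates into an explicit cache step like this (`~/.cache/pip` is pip's default cache location on Linux runners):

```yaml
- uses: actions/cache@v4
  with:
    path: ~/.cache/pip
    key: ${{ runner.os }}-pip-${{ hashFiles('**/requirements.txt') }}
    restore-keys: |
      ${{ runner.os }}-pip-
```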
Cache Limits
- 10 GB per repository — total across all cache entries
- LRU eviction — when the limit is reached, the least recently used caches are deleted first
- Caches not accessed within 7 days are automatically evicted
Setup Actions with Built-in Caching
Many official setup actions have a cache parameter that handles caching automatically — no need for a separate actions/cache step:
```yaml
# Built-in caching — one line does it all
- uses: actions/setup-node@v4
  with:
    node-version: '20'
    cache: 'npm'
```
This is equivalent to manually configuring the cache, but much simpler. The action automatically determines the correct path and cache key.
Explicit Cache (Full Control)
When you need more control — custom paths, fallback keys, or caching something the setup action doesn't support — use actions/cache@v4 directly:
```yaml
- uses: actions/cache@v4
  with:
    path: ~/.npm
    key: ${{ runner.os }}-npm-${{ hashFiles('**/package-lock.json') }}
    restore-keys: |
      ${{ runner.os }}-npm-
```
When the exact key doesn't match, `restore-keys` provides prefix-based fallback. For example, `Linux-npm-` would match any previous cache for Linux npm, even if the lockfile hash differs. This gives you a "stale but close" cache that still saves significant download time — npm only fetches the diff.
📊 Artifacts vs Cache — Comparison
| Feature | Artifacts | Cache |
|---|---|---|
| Purpose | Share outputs between jobs; download results | Speed up dependency installation across runs |
| Lifetime | 1–90 days (configurable) | 7 days since last access; LRU eviction |
| Size limit | Counts against Actions storage quota | 10 GB per repository |
| Cross-workflow | Not shared between workflows (per-run) | Shared across all workflows in the repo |
| Cross-job | Yes — upload in one job, download in another | Yes — saved on completion, restored on start |
| Downloadable from UI | Yes — ZIP download from run summary | No — only restored within workflow runs |
| Typical use cases | Build binaries, test reports, coverage, logs | node_modules, pip packages, Go modules, Docker layers |
🐳 Docker Layer Caching
Docker builds can be painfully slow when every layer is rebuilt from scratch. Docker BuildKit supports a GitHub Actions cache backend (`type=gha`) that caches individual layers and only rebuilds what changed.
cache-from / cache-to with GHA Backend
- `cache-from: type=gha` — pull cached layers from the GitHub Actions cache
- `cache-to: type=gha,mode=max` — push all layers (including intermediate ones) to the cache
- `mode=max` caches every layer; `mode=min` only caches the final image layers
Example with docker/build-push-action
```yaml
- uses: docker/setup-buildx-action@v3
- uses: docker/build-push-action@v5
  with:
    context: .
    push: true
    tags: myregistry.azurecr.io/myapp:${{ github.sha }}
    cache-from: type=gha
    cache-to: type=gha,mode=max
```
Docker layer caching can reduce image build times from 5+ minutes to under 30 seconds when only application code changes (base image and dependency layers are cached). This is especially impactful for large images with heavy system dependencies.
⏱️ Performance Impact
Here's a typical before-and-after when adding caching and artifacts to a CI workflow:
```
BEFORE (no caching, no artifacts — 8 min total)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
install deps  ████████████████ 3 min
build         ████████████     2 min
test          ████████         1.5 min
docker build  ████████         1.5 min

AFTER (with caching + artifacts — 3 min total)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
restore cache ██               0.3 min (npm cache hit)
build         ████████████     2 min
test          ████████         1.5 min (parallel with build via artifacts)
docker build  ██               0.3 min (layer cache hit)

─── saved ~5 min (62%) ───
```
The biggest wins come from npm/pip cache restores (skipping dependency downloads) and Docker layer caching (skipping unchanged layers). Artifact sharing between jobs also enables parallelism — test can start as soon as build uploads its artifacts.
🛠️ Hands-on Lab
Lab 1: Upload Test Results as an Artifact
- Create a workflow with a `test` job that runs `npm test -- --reporters=junit` (or any test framework that outputs JUnit XML)
- Add `actions/upload-artifact@v4` to upload the test results directory
- Run the workflow and download the artifact from the run's Summary page
- Verify the XML files are present in the downloaded ZIP
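A possible starting point for this lab (reporter flags vary by test framework, so treat the `npm test` line and the `reports/` path as placeholders):

```yaml
name: lab-1-test-artifacts
on: push

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: '20'
      - run: npm ci
      # Placeholder: configure your framework to write JUnit XML into reports/
      - run: npm test -- --reporters=junit
      - uses: actions/upload-artifact@v4
        if: always()          # upload results even when tests fail
        with:
          name: test-results
          path: reports/
```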
Lab 2: Add npm Caching
- Add `cache: 'npm'` to your `actions/setup-node@v4` step
- Run the workflow twice — observe the first run says "Cache not found" and the second says "Cache restored"
- Compare the `npm ci` duration: first run (full install) vs second run (cached)
- Modify `package-lock.json` slightly and run again — observe a cache miss and a new cache save
Lab 3: Compare Workflow Times
- Create two workflow files: `no-cache.yml` (no caching) and `with-cache.yml` (npm + Docker caching)
- Trigger both on the same commit using `workflow_dispatch`
- Compare total run times on the Actions tab
- Document the time savings in a comment on the PR
Lab 4: Artifact Sharing Between Jobs
- Create a workflow with two jobs: `build` and `deploy`
- In `build`, compile the app and upload `dist/` as an artifact
- In `deploy`, use `needs: build` and download the artifact
- Verify the deploy job has access to the build output without re-compiling
Check the Post `actions/cache` step in your workflow logs. It will report "Cache hit" (exact match), "Cache restored from key" (prefix fallback), or "Cache not found." The `actions/cache` action also sets a `cache-hit` output you can use in conditional steps.
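A sketch of using that `cache-hit` output to skip the install step on an exact hit. Note this deliberately caches `node_modules` itself rather than `~/.npm` — skipping `npm ci` is only safe when the installed tree is what's cached, and only on an exact-key match (no `restore-keys`), since a stale tree could be inconsistent with the lockfile:

```yaml
- uses: actions/cache@v4
  id: deps-cache
  with:
    path: node_modules
    key: ${{ runner.os }}-modules-${{ hashFiles('**/package-lock.json') }}
# cache-hit is 'true' only for an exact key match,
# so npm ci runs on any miss or partial hit
- run: npm ci
  if: steps.deps-cache.outputs.cache-hit != 'true'
```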
🐛 Debugging Common Issues
"Cache miss every time"
- Key too specific: If your key includes a timestamp, commit SHA, or another frequently changing value, every run generates a new key and never matches. Stick to lockfile hashes
- Hash mismatch: The `hashFiles()` pattern might not match your actual lockfile location. Use `**/package-lock.json` to search recursively, or provide the exact path
- Branch isolation: Caches are scoped to branches. A cache saved on `main` is available to feature branches, but a cache saved on `feature-x` is not available on `main` or other branches
- 7-day eviction: If no workflow accesses the cache for 7 days, it's automatically deleted
"Artifact not found"
- Name mismatch: The `name` in `upload-artifact` must exactly match the `name` in `download-artifact`. This is case-sensitive
- Different workflow: Artifacts are scoped to a single workflow run. You cannot download artifacts from a different workflow or a previous run using `download-artifact` alone (use the REST API for that)
- Job ordering: Ensure the downloading job has `needs:` pointing to the uploading job — otherwise they may run in parallel and the artifact won't exist yet
- Upload step failed: Check whether the upload step actually succeeded — if the path matches no files, the upload warns and produces no artifact by default
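To catch that last failure mode early, you can make an empty upload fail loudly instead of warning:

```yaml
- uses: actions/upload-artifact@v4
  with:
    name: test-results
    path: reports/
    if-no-files-found: error   # fail the step instead of just warning
```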
"Cache size exceeded"
- Trim cached paths: Cache only what's needed — `~/.npm` instead of `node_modules/` (`npm ci` can reconstruct from the cache)
- Prune old caches: Use `gh actions-cache list` and `gh actions-cache delete` to manually clean up stale caches
- Avoid caching build outputs: Use artifacts for build outputs; reserve the cache for dependency files that are expensive to download but rarely change
- Monitor usage: Check your repo's Actions cache usage under Settings → Actions → Caches
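A sketch of a scheduled cleanup workflow, here using the `gh` CLI's built-in cache commands (the extension-based `gh actions-cache` commands mentioned above work similarly; the cron schedule is illustrative):

```yaml
name: cache-cleanup
on:
  schedule:
    - cron: '0 3 * * 0'   # weekly, Sunday 03:00 UTC
  workflow_dispatch:

jobs:
  cleanup:
    runs-on: ubuntu-latest
    permissions:
      actions: write       # required to delete caches
    steps:
      - run: gh cache delete --all --repo ${{ github.repository }}
        env:
          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
```

Deleting everything weekly is a blunt instrument; a gentler variant lists caches first and deletes only those over an age or size threshold.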
🎯 Interview Questions
Basic (5)
1. What is an artifact in GitHub Actions?
An artifact is a file or collection of files produced during a workflow run that can be persisted and shared between jobs. You upload artifacts using actions/upload-artifact and download them with actions/download-artifact or from the GitHub UI.
2. What is the purpose of caching in CI/CD workflows?
Caching stores dependencies (like node_modules or pip packages) between workflow runs so they don't need to be re-downloaded every time. This significantly reduces workflow execution time, especially for projects with large dependency trees.
3. How long are artifacts retained by default?
By default, artifacts are retained for 90 days. You can override this per-upload using the retention-days parameter, and organization/repository settings can enforce maximum retention periods.
4. What's the difference between artifacts and caches?
Artifacts share outputs between jobs within a run (build binaries, test reports) and are downloadable from the UI. Caches persist dependencies between runs (npm packages, pip wheels) to speed up installations. They have different lifetimes, size limits, and use cases.
5. How do you enable built-in caching in actions/setup-node?
Add cache: 'npm' (or 'yarn' or 'pnpm') to the with: block. The action automatically determines the cache path and generates a key from the lockfile hash.
Intermediate (5)
6. Explain how restore-keys works as a fallback mechanism.
When the exact key doesn't match any cached entry, restore-keys provides prefix-based fallback. GitHub searches for the most recent cache whose key starts with the given prefix. This gives you a "stale but close" cache — the package manager then only downloads the difference, which is much faster than a full install.
7. What is the cache size limit per repository, and how is eviction handled?
The limit is 10 GB per repository across all cache entries. When the limit is reached, GitHub uses LRU (Least Recently Used) eviction — the caches accessed longest ago are deleted first. Caches not accessed within 7 days are also automatically evicted.
8. How do you share build outputs between jobs without rebuilding?
The build job uploads the output directory as an artifact using actions/upload-artifact. Downstream jobs declare needs: build and use actions/download-artifact to retrieve the output. This avoids redundant compilation across jobs.
9. Why is caching ~/.npm preferred over caching node_modules/?
~/.npm is the npm cache directory containing downloaded tarballs. npm ci uses this cache to avoid network downloads but still creates a clean node_modules/ from the lockfile. Caching node_modules/ directly can lead to stale or inconsistent dependencies if the lockfile changes.
10. What happens when an artifact upload step matches no files?
By default, actions/upload-artifact@v4 will warn but not fail if the path matches no files (depending on the if-no-files-found parameter). You can set if-no-files-found: error to make the step fail explicitly, which is recommended so you don't silently lose artifacts.
Senior (5)
11. Design a cache key strategy for a monorepo with multiple services using different package managers.
Use a composite key that includes the service name, OS, package manager, and lockfile hash: ${{ runner.os }}-<service>-npm-${{ hashFiles('services/<service>/package-lock.json') }}. Each service gets its own cache entry, so changes to one service don't invalidate another's cache. Use restore-keys with the service prefix for partial hits. For shared dependencies, consider a separate cache entry for the root lockfile.
12. How does Docker layer caching with type=gha work, and what's the difference between mode=min and mode=max?
The GHA cache backend stores Docker BuildKit layers in GitHub Actions cache. mode=min caches only the layers of the final stage — useful for simple Dockerfiles. mode=max caches all layers including intermediate multi-stage build stages — essential for multi-stage builds where base/dependency stages rarely change. mode=max uses more cache space but provides much better hit rates for complex Dockerfiles.
13. A workflow's cache hit rate dropped from 95% to 10% after a refactor. What do you investigate?
Check: (1) Did lockfile paths change? hashFiles() may no longer match. (2) Did the repo structure change (monorepo reorganization)? The glob pattern may need updating. (3) Were cache keys refactored to include new variables that change frequently? (4) Did branch strategy change? Caches are branch-scoped. (5) Check gh actions-cache list to see what caches exist and their keys. (6) Verify restore-keys still provide valid prefixes.
14. How would you manage the 10 GB cache limit in a large repository with many workflows?
Audit existing caches with gh actions-cache list --sort size. Trim cached paths to essentials (cache ~/.npm not node_modules/). Use granular keys so stale entries get evicted naturally. Implement a scheduled workflow that runs gh actions-cache delete for caches older than a threshold. Avoid caching build outputs (use artifacts instead). For Docker, use mode=min if mode=max consumes too much space. Consider splitting large caches by concern (deps vs build tools).
15. Explain how you'd implement an end-to-end CI pipeline that uses both artifacts and caching optimally.
Structure the pipeline in parallel jobs connected by artifacts: (1) Install job — restore npm cache, run npm ci, upload node_modules as artifact. (2) Lint, Test, Build jobs run in parallel, each downloading the artifact. (3) Build uploads dist/ as artifact. (4) Docker job downloads dist/, builds image with GHA layer cache. (5) Deploy job runs after all checks pass. Caching handles cross-run dependency persistence; artifacts handle cross-job data sharing within a run. This maximizes parallelism while minimizing redundant work.
🏭 Real-World Scenario
A team's CI pipeline for a React + Node.js monorepo was taking 12 minutes per push. Every job re-installed 800+ npm packages from scratch, rebuilt the Docker image from the base layer, and the test job re-compiled the app before running tests.
After optimization:
- npm caching with `actions/setup-node` + `cache: 'npm'` reduced dependency install from 3 min to 15 seconds
- Docker layer caching with `cache-from: type=gha` cut image build from 4 min to 40 seconds (only the `COPY . .` layer changed)
- Artifact sharing — the build job uploads `dist/`, and both the test and deploy jobs download it instead of rebuilding. This also enabled running tests in parallel with deployment prep
- Explicit cache keys with `restore-keys` fallback ensured even branch-first runs got partial cache hits from `main`
Result: 12 minutes → 4 minutes — a 67% reduction. The team estimates this saves ~200 developer-hours per month in waiting time across 50+ daily pushes. The cache hit rate stabilized at 93%, and artifact storage costs remained negligible thanks to 7-day retention on non-essential uploads.
📝 Summary
- Artifacts (`upload-artifact` / `download-artifact`) persist build outputs between jobs and are downloadable from the GitHub UI — default retention is 90 days
- Caching (`actions/cache`) stores dependencies between runs using `key` / `restore-keys` matching — 10 GB limit per repo with LRU eviction
- Cache keys should combine OS + package manager + lockfile hash for optimal hit rates, with `restore-keys` prefix fallback
- Setup actions like `actions/setup-node` offer a built-in `cache:` parameter for zero-config caching
- Docker layer caching with `cache-from: type=gha` / `cache-to: type=gha,mode=max` dramatically speeds up image builds
- Artifacts share data between jobs within a run; caches persist data between runs — use both for maximum efficiency
- Cache `~/.npm` (not `node_modules/`), use unique artifact names in matrix builds, and monitor cache usage to stay within limits
- A well-optimized pipeline combining caching + artifacts + Docker layer cache can cut CI times by 50–70%