Interview Preparation
Top 25+ Kubernetes interview questions with detailed answers, scenario-based challenges, and architecture explanation practice.
Try answering each question out loud before reading the answer. For scenario questions, think through the debugging steps before checking the solution.
Section 1: Core Concepts (Must-Know)
Q: What is Kubernetes and why would you use it?
Kubernetes is an open-source container orchestration platform originally developed by Google. It automates deployment, scaling, and management of containerized applications. Key reasons to use it: Self-healing (restarts crashed containers), Auto-scaling (horizontal pod autoscaler), Service discovery & load balancing (via Services), Rolling updates & rollbacks (zero-downtime deployments), Secret & config management, Storage orchestration. It solves the problem of running containers reliably in production at scale.
Q: Explain the Kubernetes architecture and its main components.
Control Plane: API Server (central hub, handles all REST requests), etcd (distributed key-value store for cluster state), Scheduler (assigns pods to nodes based on resources/constraints), Controller Manager (runs reconciliation loops: Deployment controller, ReplicaSet controller, etc.). Worker Nodes: kubelet (agent that ensures containers run per pod specs), kube-proxy (manages networking rules for Service traffic), Container Runtime (containerd/CRI-O, which runs the actual containers). All communication goes through the API Server. etcd is the single source of truth.
Q: What is the difference between a Container and a Pod?
A Container is a single runnable image instance. A Pod is the smallest deployable unit in Kubernetes; it wraps one or more containers that share the same network namespace (same IP, localhost communication), storage volumes, and lifecycle. Pods are ephemeral; they don't self-heal. You almost never create pods directly; you use Deployments, which manage ReplicaSets, which manage Pods.
Q: What is the difference between a ReplicaSet and a Deployment?
A ReplicaSet ensures a specified number of identical pod replicas are running. A Deployment manages ReplicaSets and adds rolling updates and rollback capabilities. When you update a Deployment, it creates a new ReplicaSet, scales it up, and scales the old one down (rolling update). You should almost always use Deployments, not standalone ReplicaSets. Hierarchy: Deployment → ReplicaSet → Pods.
Q: What are the different Service types and when do you use each?
ClusterIP (default): Internal-only IP. Used for pod-to-pod communication within the cluster. NodePort: Exposes the service on a static port (30000-32767) on every node's IP. Accessible externally via NodeIP:NodePort. LoadBalancer: Provisions a cloud load balancer (AWS ALB, Azure LB) that routes to the service. Used in production for external traffic. ExternalName: Maps a service to an external DNS name (CNAME). No proxying involved.
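A minimal Service manifest could look like the sketch below; the name, labels, and ports are illustrative, not from a real system:

```yaml
# Hypothetical example: exposes pods labeled app=web-app inside the cluster.
apiVersion: v1
kind: Service
metadata:
  name: web-app
spec:
  type: ClusterIP        # change to NodePort or LoadBalancer to expose externally
  selector:
    app: web-app         # traffic is routed to pods carrying this label
  ports:
    - port: 80           # the Service's own port
      targetPort: 8080   # the container port traffic is forwarded to
```

Switching `type` is the only change needed to move between the first three service types; the selector and ports stay the same.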
Q: What are Namespaces and when would you use them?
Namespaces provide logical isolation within a cluster. They partition resources, allowing different teams or environments (dev, staging, prod) to coexist in one cluster with isolation. Use cases: multi-tenancy, resource quotas per team, RBAC per namespace, environment separation. Default namespaces: default, kube-system, kube-public, kube-node-lease. Note: Cluster-scoped resources (nodes, PVs, ClusterRoles) are not namespaced.
Q: What is etcd and why is it critical?
etcd is a distributed, strongly consistent key-value store that stores the entire Kubernetes cluster state: all resource definitions, configurations, secrets, and metadata. If etcd is lost and there's no backup, the cluster is effectively destroyed. It uses the Raft consensus algorithm for distributed consistency. Best practices: run 3+ replicas for HA, regular automated backups (etcdctl snapshot save), encrypt at rest, restrict access to only the API server.
Section 2: Configuration & Security
Q: What is the difference between a ConfigMap and a Secret?
Both store configuration data as key-value pairs. ConfigMap: For non-sensitive data (app settings, feature flags). Stored in plain text. Secret: For sensitive data (passwords, API keys, TLS certs). Base64-encoded by default (not encrypted unless you enable encryption at rest). Both can be consumed as environment variables or mounted as files. In production, use external secret stores (Vault, AWS Secrets Manager) via CSI drivers for true secret security.
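A side-by-side sketch, with hypothetical names and values, might look like this:

```yaml
# Illustrative only: a ConfigMap for plain settings and a Secret for credentials.
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
data:
  LOG_LEVEL: "info"
  FEATURE_X_ENABLED: "true"
---
apiVersion: v1
kind: Secret
metadata:
  name: app-credentials
type: Opaque
stringData:              # stringData accepts plain text; the API server base64-encodes it
  DB_PASSWORD: "change-me"
```

Both objects can then be referenced from a pod via `envFrom` or mounted as volumes.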
Q: How does RBAC work in Kubernetes?
RBAC controls who can do what in the cluster. Four objects: Role (namespace-scoped permissions), ClusterRole (cluster-scoped), RoleBinding (binds Role to users/groups/SAs in a namespace), ClusterRoleBinding (binds ClusterRole cluster-wide). Example: a Role with verbs: [get, list] on resources: [pods] lets the bound subject view pods but not modify them. Follow least privilege: never give cluster-admin to service accounts.
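The pod-viewing Role from the example above could be written as follows; the namespace and service account name are hypothetical:

```yaml
# Hypothetical "pod-reader" Role bound to a service account in the dev namespace.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pod-reader
  namespace: dev
rules:
  - apiGroups: [""]          # "" = core API group (pods live here)
    resources: ["pods"]
    verbs: ["get", "list"]   # read-only: no create/update/delete
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: read-pods
  namespace: dev
subjects:
  - kind: ServiceAccount
    name: ci-deployer        # hypothetical service account
    namespace: dev
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io
```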
Q: What is a ServiceAccount?
A ServiceAccount provides an identity for pods to authenticate against the Kubernetes API. Each namespace has a default SA. When a pod needs to interact with the API (e.g., CI/CD tools, operators), it uses a ServiceAccount. The SA's permissions are defined by RBAC. Best practices: don't use the default SA for apps, create dedicated SAs with minimal RBAC; set automountServiceAccountToken: false when API access isn't needed.
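A small sketch of both best practices together (names and image are illustrative):

```yaml
# A dedicated service account, and a pod that opts out of API token mounting.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: app-sa
---
apiVersion: v1
kind: Pod
metadata:
  name: web
spec:
  serviceAccountName: app-sa
  automountServiceAccountToken: false   # this pod never calls the K8s API
  containers:
    - name: web
      image: nginx:1.25
```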
Q: What are Network Policies?
Network Policies control pod-to-pod and pod-to-external network traffic at L3/L4 (IP/port level). By default, all pods can communicate freely. Network policies restrict this. They use label selectors to match pods, define ingress/egress rules, and require a CNI plugin that supports them (Calico, Cilium, Azure CNI). Best pattern: default-deny everything, then explicitly allow required traffic paths.
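The default-deny-then-allow pattern can be sketched with two policies; the app labels and port are assumptions for illustration:

```yaml
# Deny all ingress to every pod in the namespace...
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
spec:
  podSelector: {}            # empty selector = all pods in the namespace
  policyTypes: ["Ingress"]
---
# ...then explicitly allow one path: frontend pods may reach api pods on 8080.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend-to-api
spec:
  podSelector:
    matchLabels:
      app: api               # hypothetical labels
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: frontend
      ports:
        - port: 8080
```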
Section 3: Scaling & Operations
Q: How does the Horizontal Pod Autoscaler (HPA) work?
HPA automatically scales pod replicas based on observed metrics (CPU utilization, memory, or custom metrics). It checks metrics every 15 seconds (configurable), computes the desired replica count using: desiredReplicas = ceil[currentReplicas × (currentMetric / targetMetric)], and scales the Deployment. Requires metrics-server for CPU/memory metrics, or Prometheus adapter for custom metrics. Has cooldown periods to prevent flapping. Configure minReplicas and maxReplicas to set bounds.
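A minimal HPA manifest targeting a hypothetical Deployment named web-app:

```yaml
# Keeps average CPU near 70% across 2-10 replicas of "web-app" (illustrative name).
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-app
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-app
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
```

Applying the formula: with 4 replicas at 90% CPU and a 70% target, desiredReplicas = ceil(4 × 90/70) = 6.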
Q: What is the difference between liveness, readiness, and startup probes?
Liveness probe: "Is the container alive?" If it fails, kubelet kills and restarts the container. Catches deadlocks, hangs. Readiness probe: "Is the container ready to accept traffic?" If it fails, the pod is removed from Service endpoints (no traffic routed). Container continues running. Useful during startup or temporary heavy load. Startup probe: "Has the app finished starting?" Disables liveness/readiness probes until it succeeds. Used for slow-starting applications. Prevents premature liveness kills.
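All three probes together, as a pod-spec fragment; the /healthz and /ready endpoints and port are assumptions about the app:

```yaml
# Pod-spec fragment (not a full manifest): three probes on one container.
containers:
  - name: app
    image: my-app:1.0                     # hypothetical image
    startupProbe:                         # runs first; liveness/readiness wait for it
      httpGet: { path: /healthz, port: 8080 }
      failureThreshold: 30                # up to 30 x 10s = 5 min to start
      periodSeconds: 10
    livenessProbe:                        # restart the container if this fails
      httpGet: { path: /healthz, port: 8080 }
      periodSeconds: 10
    readinessProbe:                       # stop routing traffic if this fails
      httpGet: { path: /ready, port: 8080 }
      periodSeconds: 5
```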
Q: What is the difference between a Service and an Ingress?
A Service provides L4 (TCP/UDP) load balancing. An Ingress provides L7 (HTTP/HTTPS) routing: host-based routing, path-based routing, TLS termination. Ingress requires an Ingress Controller (NGINX, Traefik, AWS ALB). One Ingress resource can route traffic to multiple services. Example: api.example.com/users → user-service, api.example.com/orders → order-service. It consolidates multiple services behind a single entry point.
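The path-routing example above could be expressed as this Ingress sketch (controller setup and TLS not shown):

```yaml
# Routes api.example.com/users and /orders to two backend Services.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: api
spec:
  rules:
    - host: api.example.com
      http:
        paths:
          - path: /users
            pathType: Prefix
            backend:
              service:
                name: user-service
                port: { number: 80 }
          - path: /orders
            pathType: Prefix
            backend:
              service:
                name: order-service
                port: { number: 80 }
```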
Q: How do rolling updates and rollbacks work?
Rolling Update: The default strategy. Creates new pods with the updated spec, then terminates old pods gradually. Controlled by maxSurge (extra pods during update) and maxUnavailable (pods that can be down). Set maxUnavailable=0 for zero-downtime. Rollback: Kubernetes keeps revision history. kubectl rollout undo deployment/<name> reverts to previous version. kubectl rollout undo --to-revision=2 goes to a specific revision. kubectl rollout history shows all revisions.
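The zero-downtime settings mentioned above, as a Deployment-spec fragment:

```yaml
# Deployment-spec fragment: rolling update tuned for zero downtime.
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1           # one extra pod may be created during the update
      maxUnavailable: 0     # never drop below the desired replica count
```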
Q: What is a DaemonSet?
A DaemonSet ensures that a copy of a pod runs on every node (or specific nodes via nodeSelector). Use cases: log collectors (Fluentd), monitoring agents (Prometheus node-exporter), network plugins (Calico, kube-proxy). When new nodes join the cluster, the DaemonSet automatically schedules pods on them. When nodes are removed, those pods are garbage collected.
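A minimal DaemonSet sketch for the log-collector use case; the name and image tag are illustrative:

```yaml
# One log-collector pod per node.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: log-collector
  namespace: kube-system
spec:
  selector:
    matchLabels:
      name: log-collector
  template:
    metadata:
      labels:
        name: log-collector
    spec:
      tolerations:
        - operator: Exists        # tolerate all taints so tainted nodes are covered too
      containers:
        - name: fluentd
          image: fluentd:v1.16    # hypothetical image tag
```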
Q: When would you use a StatefulSet?
StatefulSets manage stateful applications that need: stable network identities (pod-0, pod-1, pod-2, with predictable DNS names), stable persistent storage (each pod gets its own PVC), ordered deployment/scaling (pod-0 starts before pod-1). Used for databases (MySQL, PostgreSQL), message queues (Kafka, RabbitMQ), distributed systems (Elasticsearch, ZooKeeper). Unlike Deployments, pods are not interchangeable.
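A StatefulSet skeleton showing the stable-identity and per-pod-storage pieces; names, image, and storage size are assumptions:

```yaml
# Pods get stable names db-0, db-1, db-2 and one PVC each via volumeClaimTemplates.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: db
spec:
  serviceName: db-headless      # headless Service that provides per-pod DNS
  replicas: 3
  selector:
    matchLabels:
      app: db
  template:
    metadata:
      labels:
        app: db
    spec:
      containers:
        - name: postgres
          image: postgres:16
          volumeMounts:
            - name: data
              mountPath: /var/lib/postgresql/data
  volumeClaimTemplates:         # each pod gets its own PVC named data-db-<ordinal>
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 10Gi
```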
Section 4: Scenario-Based Questions
Scenario: A pod keeps restarting with CrashLoopBackOff. How do you debug it?
Systematic approach:
- kubectl get pods → confirm CrashLoopBackOff, note the RESTARTS count
- kubectl logs <pod> → check current crash output
- kubectl logs <pod> --previous → check the previous crash if there are no current logs
- kubectl describe pod <pod> → check Events, Last State, Exit Code
- Exit code analysis: 1 = app error, 137 = OOMKilled, 126 = permission, 127 = command not found
- If OOMKilled: increase memory limits
- If app error: fix the code, missing config, or missing env vars
- If command not found: check the container image and entrypoint
- Last resort: run the same image interactively with overridden entrypoint to debug
Scenario: A new deployment broke production and users are seeing errors. What do you do?
Immediate response:
- Rollback immediately: kubectl rollout undo deployment/<name> → this restores the previous working version in seconds
- Check rollout status: kubectl rollout status deployment/<name>
- Verify users are restored: check service endpoints, test the endpoint
Then investigate:
- Check the failed pods from the bad version: kubectl logs
- Inspect the bad revision: kubectl rollout history deployment/<name> --revision=<n>
- Root cause: wrong image, bad config, dependency unavailable?
- Fix the issue, test in staging, then redeploy
Prevention: Use readiness probes (bad pods won't receive traffic), staged rollouts (canary), maxUnavailable=0.
Scenario: Design a production-ready Kubernetes setup for a microservices application.
Cluster design:
- HA Control Plane: 3+ master nodes across availability zones. Managed K8s (EKS/AKS/GKE) if possible
- Worker Nodes: Separate node pools β general (mixed workloads), memory-optimized (databases), spot/preemptible (batch jobs)
- Namespaces: Per-team or per-service. ResourceQuotas and LimitRanges on each
Workload config:
- Deployments with resource requests/limits, health probes, pod disruption budgets
- HPA on each service with appropriate metrics
- Anti-affinity rules to spread pods across nodes/zones
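The anti-affinity rule mentioned above, as a pod-template fragment (the app label is illustrative):

```yaml
# Pod-spec fragment: prefer spreading web-app replicas across availability zones.
affinity:
  podAntiAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          labelSelector:
            matchLabels:
              app: web-app                       # hypothetical label
          topologyKey: topology.kubernetes.io/zone
```

Use requiredDuringSchedulingIgnoredDuringExecution instead when co-location must be forbidden outright, at the cost of pods staying Pending if no suitable node exists.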
Networking: Ingress controller + cert-manager for TLS. Network policies for pod isolation. Service mesh (Istio/Linkerd) for observability.
Security: RBAC per namespace. Network policies. Pod security standards. External secrets. Image scanning. OPA/Kyverno policies.
Observability: Prometheus + Grafana for metrics. EFK/Loki for logs. Jaeger/Zipkin for tracing.
Scenario: Your application is running but responding slowly. How do you investigate?
- kubectl top pods → check CPU/memory usage. Are pods maxing out?
- kubectl describe pod → check resource limits. Is CPU being throttled?
- kubectl logs → look for slow queries, timeouts, or errors
- Check pod distribution: kubectl get pods -o wide → are all pods on one overloaded node?
- Check HPA: is it scaling? kubectl get hpa
- Check downstream dependencies: is the database slow? Is an external API timing out?
- Fixes: Increase resources, scale up replicas, add pod anti-affinity for distribution, optimize the application code, add caching
Scenario: How do you manage secrets securely in Kubernetes?
K8s built-in (minimum):
- Enable encryption at rest for etcd (EncryptionConfiguration)
- Use RBAC to restrict who can read secrets
- Never commit secrets to Git; use sealed-secrets or SOPS for GitOps
Production (recommended):
- External secret stores: AWS Secrets Manager, Azure Key Vault, HashiCorp Vault
- Integration via: CSI Secret Store Driver, External Secrets Operator, or Vault Agent sidecar
- Rotate secrets automatically
- Audit secret access via K8s audit logs
Never: Store secrets in ConfigMaps, environment variables in CI/CD logs, or unencrypted YAML in Git.
Scenario: How do you achieve zero-downtime deployments?
- Rolling update strategy: Set maxUnavailable: 0 and maxSurge: 1 (or 25%)
- Readiness probes: New pods only receive traffic after they're ready
- Pod Disruption Budgets (PDB): Ensure minimum pods are available during voluntary disruptions (node drains)
- Graceful shutdown: Handle SIGTERM in your app. Set terminationGracePeriodSeconds appropriately. Use preStop hooks if needed
- Connection draining: Readiness probe should fail immediately on SIGTERM so new connections go to other pods while existing ones finish
- Multiple replicas: Always run 2+ replicas spread across nodes/zones
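The PDB item above can be sketched as a short manifest; the app label and threshold are assumptions:

```yaml
# Keep at least 2 web-app pods running during voluntary disruptions (e.g. node drains).
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: web-app-pdb
spec:
  minAvailable: 2          # maxUnavailable is the alternative way to express the budget
  selector:
    matchLabels:
      app: web-app         # hypothetical label
```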
Q: What is the difference between a Deployment and a DaemonSet?
Deployment: Runs N replicas of a pod. Scheduler decides which nodes. Used for application workloads (web servers, APIs, workers). Scales horizontally.
DaemonSet: Runs exactly one pod per node (or per selected nodes). Used for node-level services: monitoring agents, log collectors, network plugins. Automatically adds/removes pods as nodes join/leave.
Key difference: Deployment = "I need N copies somewhere." DaemonSet = "I need exactly one on every node."
Q: How does DNS-based service discovery work in Kubernetes?
CoreDNS runs as a Deployment in kube-system. Every pod's /etc/resolv.conf points to the CoreDNS service IP. DNS records are auto-created for Services:
- <service-name>.<namespace>.svc.cluster.local → Service ClusterIP
- Within the same namespace, just <service-name> works
- Headless services (clusterIP: None) return individual pod IPs
- StatefulSet pods get: <pod-name>.<service-name>.<namespace>.svc.cluster.local
DNS is the backbone of service discovery in Kubernetes. If CoreDNS is down, inter-service communication breaks.
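The headless-service case can be sketched like this; the name, label, and port are illustrative:

```yaml
# Headless Service: no ClusterIP is allocated, so DNS returns the individual
# pod IPs, and StatefulSet pods get names like db-0.db-headless.default.svc.cluster.local.
apiVersion: v1
kind: Service
metadata:
  name: db-headless
spec:
  clusterIP: None        # this is what makes the Service headless
  selector:
    app: db              # hypothetical label
  ports:
    - port: 5432
```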
Section 5: Architecture Explanation Practice
Interviewers often ask you to draw/explain the K8s architecture or a deployment flow on a whiteboard. Practice explaining these:
Practice 1: "Explain what happens when you run 'kubectl apply -f deployment.yaml'"
1) kubectl sends YAML to API Server. 2) API Server authenticates → authorizes (RBAC) → admission controls → validates → stores in etcd. 3) Deployment Controller sees new Deployment, creates ReplicaSet. 4) ReplicaSet Controller creates Pod objects. 5) Scheduler sees unscheduled pods, picks the best node, updates etcd. 6) kubelet on that node sees assigned pods, pulls image via container runtime, starts container. 7) kube-proxy updates iptables/IPVS rules for Service routing.
Practice 2: "How does traffic reach your application?"
Explain: User → DNS → Load Balancer → Ingress Controller Pod → Service (kube-proxy / iptables) → Pod. Cover: L7 routing at Ingress, L4 at Service, pod selection via labels, endpoint slice updates when pods change.
Practice 3: "How do you ensure high availability?"
Cover: Multiple replicas across nodes/zones (pod anti-affinity), Pod Disruption Budgets, readiness probes, rolling updates with maxUnavailable=0, multi-AZ node pools, HA control plane (3+ masters), etcd backups, cluster autoscaler for capacity.
Interview Tips
- Say "I would check…" then list commands: interviewers want to see your thought process, not just the answer
- Draw diagrams: for architecture questions, sketch the components and arrows
- Mention trade-offs: "NodePort is simpler but LoadBalancer is better for production because…"
- Connect to experience: "In my project, we used HPA because…"
- Admit what you don't know: "I haven't used StatefulSets in production, but I understand they provide…"
- Security is always a good answer: mentioning RBAC, least privilege, and secrets management shows maturity
Summary
- Core concepts (Pods, Deployments, Services, Namespaces) are asked in every interview
- Scenario-based questions test your debugging methodology: always describe a systematic approach
- Architecture questions test your understanding of component interactions: practice drawing the K8s architecture
- Security awareness (RBAC, secrets, network policies) separates good answers from great ones
- Real experience matters: relate answers back to your hands-on practice
You've completed the entire Kubernetes Zero to Hero course. Go back and review any topics you're less confident about, practice the hands-on labs, and you'll be interview-ready. Good luck!