Interview Prep · Lesson 16 of 16

Interview Prep: Python for DevOps

Final assessment: comprehensive Q&A covering all 15 lessons. Test your knowledge, identify gaps, and prepare for technical interviews.

🎯 Beginner-Level Questions

Core fundamentals; expected in every interview.

What are the key reasons DevOps teams use Python over Bash?

Python offers better readability, maintainable code, rich libraries (requests, the Kubernetes client), cross-platform compatibility, and structured error handling through exceptions. Bash excels at chaining commands but becomes hard to maintain once a script needs complex logic, data structures, or tests; Python scales better as infrastructure automation grows.

Explain the difference between mutable and immutable types. Give examples from each category.

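Mutable types (list, dict, set) can be changed in place after creation. Immutable types (int, str, tuple, bool) cannot. Example: `lst = [1,2]; lst[0] = 99` works. `s = "hi"; s[0] = 'x'` raises TypeError. Immutable built-ins are hashable, so they can be used as dict keys and set members (a tuple only if all of its elements are hashable); lists, dicts, and sets cannot. Immutable values are also safe to share across threads because no thread can modify them.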

What does the with statement do and why is it important?

The with statement guarantees cleanup code runs even if an exception occurs. For files: `with open(...) as f:` ensures f.close() happens automatically. This prevents file descriptor leaks. Context managers are essential for resource management—always use with for files, database connections, locks.
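A minimal sketch of that guarantee, assuming a throwaway temp file and a simulated failure (the function name and file contents are illustrative only). It fails on purpose inside the with block and then checks that the file was still closed:

```python
import os
import tempfile


def file_closed_after_failure(path):
    """Open a file, fail inside the with block, and report whether it was closed."""
    handle = None
    try:
        with open(path, encoding="utf-8") as f:
            handle = f
            f.readline()
            raise ValueError("simulated failure while processing the file")
    except ValueError:
        # By the time we get here, the with block has already closed the file.
        return handle.closed


# Demo against a throwaway temp file.
fd, demo_path = tempfile.mkstemp(suffix=".conf")
with os.fdopen(fd, "w", encoding="utf-8") as tmp:
    tmp.write("replicas=3\n")

closed_after_error = file_closed_after_failure(demo_path)
os.remove(demo_path)
print(closed_after_error)  # True
```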

Write a function that retries a network call up to 3 times, waiting 2 seconds between attempts, and returns the result or None if all fail.

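```python
import time


def retry_network_call(func, retries=3, wait=2):
    """Call func() up to `retries` times; return its result, or None if every attempt fails."""
    for attempt in range(1, retries + 1):
        try:
            return func()
        except Exception as exc:
            if attempt == retries:
                return None  # out of attempts
            print(f"Attempt {attempt} failed ({exc}), retrying...")
            time.sleep(wait)  # pause before the next attempt
```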

Explain lambda functions and when you would use them in DevOps scripts.

Lambda is an anonymous function: `lambda x: x * 2`. Use for short, throwaway operations passed to map/filter/sort. Example: sort servers by port: `sorted(servers, key=lambda s: s["port"])`. Avoid complex logic in lambda—use def for readability instead.
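As a small illustration, assuming a hypothetical list of server dicts with `name` and `port` keys, the sort and filter cases might look like this:

```python
# Hypothetical inventory; the names and ports are made up for the example.
servers = [
    {"name": "web-2", "port": 8080},
    {"name": "db-1", "port": 5432},
    {"name": "web-1", "port": 443},
]

# Sort by port using a lambda as the key function.
by_port = sorted(servers, key=lambda s: s["port"])
names_by_port = [s["name"] for s in by_port]

# Keep only servers on non-privileged ports.
high_ports = list(filter(lambda s: s["port"] > 1024, servers))

print(names_by_port)  # ['web-1', 'db-1', 'web-2']
```

Anything longer than a single expression is usually easier to read and test as a named function.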

🎯 Intermediate-Level Questions

Applied knowledge; expected for senior roles.

How would you safely read and modify a large configuration file without risking data loss if the script crashes mid-write?

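Write to a temporary file, then atomically rename it over the target. Temp files ensure incomplete writes don't corrupt the config. Pattern: 1. Read the original, 2. Write the modified content to a temp file in the same directory, 3. Rename the temp file over the original (`os.replace`). If the script crashes during steps 1-2, the original remains untouched. The rename is atomic on POSIX systems when both paths are on the same filesystem, so readers see either the old file or the new one, never a partial write.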
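The write-then-rename pattern can be sketched as follows. This is one possible shape, not a definitive implementation: `atomic_write` is a hypothetical helper name, and the sketch assumes the temp file lives in the target's directory so both paths share a filesystem.

```python
import os
import tempfile


def atomic_write(path, text):
    """Write text to path so that readers never see a partially written file."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file next to the target so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            tmp.write(text)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the data to disk before renaming
        os.replace(tmp_path, path)  # swap the new file into place
    finally:
        # If anything failed before the rename, remove the leftover temp file.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)


# Demo in a scratch directory.
with tempfile.TemporaryDirectory() as workdir:
    config_path = os.path.join(workdir, "app.conf")
    atomic_write(config_path, "replicas=3\n")
    atomic_write(config_path, "replicas=5\n")
    with open(config_path, encoding="utf-8") as f:
        final_text = f.read()
    leftovers = [name for name in os.listdir(workdir) if name != "app.conf"]
```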

Design a log parser that handles various log formats (JSON, plain text with different patterns) and groups errors by service and error type. What data structures would you use?

Use a nested counter, `defaultdict(lambda: defaultdict(int))`: outer key is service, inner key is error_type, value is count. (Writing `defaultdict(defaultdict(int))` raises TypeError, because the default factory must be callable.) For parsing variety: try JSON first (json.loads), then fall back to regex patterns. Each regex should capture the service name and error category. Output aggregated results as a dict or JSON for downstream processing.
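A minimal sketch of such a parser, under loud assumptions: the JSON field names (`level`, `service`, `error_type`) and the plain-text layout matched by `TEXT_PATTERN` are invented for illustration, and real logs would need their own patterns.

```python
import json
import re
from collections import defaultdict

# Assumed plain-text layout: "<timestamp> ERROR [service] ErrorType: message"
TEXT_PATTERN = re.compile(r"ERROR \[(?P<service>[\w-]+)\] (?P<error_type>\w+):")


def parse_line(line):
    """Return (service, error_type) for an error line, or None for anything else."""
    try:
        record = json.loads(line)
    except json.JSONDecodeError:
        record = None
    if isinstance(record, dict):
        if record.get("level") == "ERROR":
            return record.get("service", "unknown"), record.get("error_type", "unknown")
        return None
    match = TEXT_PATTERN.search(line)
    if match:
        return match["service"], match["error_type"]
    return None


def group_errors(lines):
    """Count errors per service and error type."""
    counts = defaultdict(lambda: defaultdict(int))
    for line in lines:
        parsed = parse_line(line)
        if parsed:
            service, error_type = parsed
            counts[service][error_type] += 1
    # Convert to plain dicts so the result serializes cleanly to JSON.
    return {service: dict(types) for service, types in counts.items()}


sample = [
    '{"level": "ERROR", "service": "payments", "error_type": "TimeoutError"}',
    "2024-01-01 12:00:00 ERROR [payments] TimeoutError: upstream timed out",
    "2024-01-01 12:00:01 INFO [payments] request ok",
    '{"level": "INFO", "service": "auth"}',
    "2024-01-01 12:00:02 ERROR [auth] ConnectionError: db unreachable",
]
summary = group_errors(sample)
print(json.dumps(summary))
```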

Explain exception hierarchy. Why catch specific exceptions instead of bare except?

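Built-in exceptions form a hierarchy rooted at BaseException. Almost every error you should handle derives from Exception, while SystemExit, KeyboardInterrupt, and GeneratorExit derive directly from BaseException. Specific catching (FileNotFoundError, TimeoutError) lets you handle each error appropriately: retry a timeout, skip a missing file, raise on unexpected errors. A bare `except:` catches everything, including KeyboardInterrupt and SystemExit, so it hides bugs and can make a script impossible to interrupt; avoid it. Good pattern: specific exceptions first, generic Exception as a logged fallback, and never catch BaseException unless you re-raise it.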
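To make the ordering concrete, here is a small sketch; the `handle` function and the three failing helpers are hypothetical stand-ins, and a real script would log and re-raise in the fallback branch rather than return a string.

```python
def handle(action):
    """Run action() and map each failure type to a response."""
    try:
        action()
    except FileNotFoundError:
        return "skip"    # missing file: nothing to process
    except TimeoutError:
        return "retry"   # transient: worth another attempt
    except Exception as exc:
        # Fallback for anything unexpected; deliberately not a bare except.
        return f"unexpected: {type(exc).__name__}"
    return "ok"


def missing_file():
    raise FileNotFoundError("config.yaml")


def slow_call():
    raise TimeoutError("no response")


def bad_input():
    raise ValueError("bad input")


outcomes = [handle(f) for f in (missing_file, slow_call, bad_input, lambda: None)]
print(outcomes)  # ['skip', 'retry', 'unexpected: ValueError', 'ok']

# KeyboardInterrupt sits outside Exception, so "except Exception" lets Ctrl+C through.
ctrl_c_is_exception = issubclass(KeyboardInterrupt, Exception)
print(ctrl_c_is_exception)  # False
```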

You need to query three APIs in sequence (each depends on previous result). Design this workflow with proper error handling and logging.

Nest API calls with try/except at each level. Log before each call + result. Example: 1. Get users (log "Fetching users..."), 2. For each user, get details, 3. For each detail, get metrics. Catch specific errors (HTTP 404=skip, 500=retry, parse error=log&skip). Chain dependencies explicitly—don't proceed if upstream fails.
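One way to sketch this chain without a real API is to pass the three calls in as functions. Everything named here is an assumption for illustration: `run_pipeline` and the `fake_*` helpers are hypothetical, `LookupError` stands in for an HTTP 404, and `ValueError` stands in for a parse error.

```python
import logging

logger = logging.getLogger("pipeline")


def run_pipeline(get_users, get_details, get_metrics):
    """Chain three dependent calls; stop if the first fails, skip users that fail later."""
    logger.info("Fetching users...")
    try:
        users = get_users()
    except ConnectionError:
        logger.error("User fetch failed; nothing downstream can run")
        return {}

    results = {}
    for user in users:
        try:
            logger.info("Fetching details for %s", user)
            details = get_details(user)
            logger.info("Fetching metrics for %s", user)
            results[user] = get_metrics(details)
        except LookupError:
            # Stand-in for an HTTP 404: this user has no data, so skip it.
            logger.warning("No data for %s; skipping", user)
        except ValueError:
            # Stand-in for a response that could not be parsed.
            logger.warning("Could not parse response for %s; skipping", user)
    return results


# Fake API calls so the sketch runs offline.
def fake_users():
    return ["alice", "bob"]


def fake_details(user):
    if user == "bob":
        raise LookupError(user)
    return {"user": user, "id": 1}


def fake_metrics(details):
    return {"requests": 42, "owner": details["user"]}


report = run_pipeline(fake_users, fake_details, fake_metrics)
print(report)  # {'alice': {'requests': 42, 'owner': 'alice'}}
```

Retrying on server errors is left out here; it could wrap each call in a retry helper.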

When should you use subprocess vs cloud SDKs for cloud operations?

Use SDKs when available (azure-mgmt-compute, kubernetes client): they return structured objects and handle auth and retries. Use subprocess for tools that have no suitable Python SDK (terraform) or when an SDK is overkill for a one-off CLI call (kubectl, docker). SDKs are preferable for production; subprocess is pragmatic for CLIs and older systems.
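When subprocess is the right tool, a thin wrapper keeps the calls consistent. The sketch below is an assumption-laden example: `run_cli` is a hypothetical helper, and the Python interpreter stands in for a real CLI such as kubectl so the code runs anywhere.

```python
import subprocess
import sys


def run_cli(args, timeout=30):
    """Run a command without a shell and return its stdout; raise on failure."""
    completed = subprocess.run(
        args,
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,  # a non-zero exit code raises CalledProcessError
    )
    return completed.stdout.strip()


# The interpreter stands in for a real CLI here.
cli_output = run_cli([sys.executable, "-c", "print('client ok')"])

# A failing command surfaces as an exception instead of being silently ignored.
try:
    run_cli([sys.executable, "-c", "import sys; sys.exit(3)"])
except subprocess.CalledProcessError as exc:
    failed_code = exc.returncode
```

Passing the command as a list and avoiding `shell=True` sidesteps quoting problems and shell injection.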

🎯 Scenario-Based Questions

Real-world decision-making; tests judgment and trade-offs.

Your deployment script runs 100 times per day. It's slow because it makes sequential API calls to check each pod's health. How would you optimize it?

Use concurrent requests (threading/asyncio). Python's requests library + ThreadPoolExecutor: submit 20 requests in parallel, wait for results. Or use batch APIs (list all pods in one call + parse, not one call per pod). Add caching (in-memory dict with TTL) to avoid redundant checks within seconds. Profile to find bottleneck—is it network latency, parsing, or logic?
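A rough sketch of the thread-pool approach follows. `check_pod_health` is a hypothetical stand-in that sleeps instead of making an HTTP request, and the pod names are invented; in a real script it would call the health endpoint.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def check_pod_health(pod_name):
    """Hypothetical stand-in for one health check; sleeps to mimic network latency."""
    time.sleep(0.05)
    return pod_name, not pod_name.endswith("-bad")


def check_all(pod_names, max_workers=20):
    """Run the checks concurrently and return {pod_name: healthy}."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(pool.map(check_pod_health, pod_names))


pods = [f"web-{i}" for i in range(19)] + ["web-bad"]
health = check_all(pods)
unhealthy = [name for name, ok in health.items() if not ok]
print(unhealthy)  # ['web-bad']
```

Threads suit this workload because the time is spent waiting on I/O; if one list call can return every pod's status, that is usually simpler still.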

A junior team member's script hardcodes production service credentials. How do you fix this and prevent future incidents?

1. Immediately rotate credentials (they're compromised). 2. Read from environment variables: `os.getenv("API_KEY")`. 3. Use secure vaults (Azure Key Vault, AWS Secrets Manager) for production. 4. Add pre-commit hooks to catch secrets in code. 5. Code review policy: no secrets ever. Review all existing code for leaked credentials.
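For the environment-variable step, a small helper that fails fast on a missing value avoids silently running with `None`. This is a sketch only: `get_required_secret` and the `DEMO_*` variable names are hypothetical, and the demo sets a fake value in-process purely so the example runs.

```python
import os


def get_required_secret(name):
    """Read a secret from the environment, failing loudly if it is missing or empty."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(f"Missing required environment variable: {name}")
    return value


# Demo only: in real use the variable is injected by CI or the secret manager.
os.environ["DEMO_API_KEY"] = "not-a-real-key"
os.environ.pop("DEMO_UNSET_SECRET", None)

api_key = get_required_secret("DEMO_API_KEY")

try:
    get_required_secret("DEMO_UNSET_SECRET")
except RuntimeError as exc:
    error_message = str(exc)
```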

Your script works locally but fails in production CI/CD with "command not found" for kubectl. Diagnose and fix.

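The CI/CD environment has a different PATH, or kubectl is not installed in the runner image. Check the PATH and installed tools in the CI logs. Fix it by installing kubectl in a CI setup step, or by calling it through an absolute path such as `/usr/bin/kubectl`; confirm that path on the runner itself (for example with `which kubectl` in a CI step), because the local path may differ. In Python, verify the command exists before calling subprocess: `shutil.which("kubectl")` returns None if it is not found, so handle that case with a clear error.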
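The pre-flight check can be sketched like this; `require_tool` is a hypothetical helper, and the demo deliberately looks up a tool name that should not exist.

```python
import shutil


def require_tool(name):
    """Return the full path to a CLI tool, or raise a clear error if it is not on PATH."""
    path = shutil.which(name)
    if path is None:
        raise RuntimeError(
            f"{name} not found on PATH; install it in the CI image or extend PATH"
        )
    return path


# Demo with a tool name that should not exist anywhere.
try:
    require_tool("no-such-tool-for-demo")
except RuntimeError as exc:
    message = str(exc)

print(message)
```

Calling `require_tool("kubectl")` at the top of a script turns a confusing mid-run failure into an immediate, readable one.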

You need to deploy the same infrastructure to dev, staging, and prod with slight variations (replicas, resource limits). How would you structure the code?

Config-driven design: store settings in files (YAML/JSON) per environment or in a single file keyed by environment. Load config, validate, apply. Example: replicas = config["environments"][env]["replicas"]. Use argparse: `deploy.py deploy --env prod --replicas 5 --override`. Code is unchanged; config varies. Alternatively, branch on the environment in code (if env == "prod": replicas=5), though this scatters settings through the script and is harder to maintain than config.
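A compact sketch of the config-driven approach is below. The `CONFIG` dict stands in for a loaded YAML/JSON file, and its values, the `cpu_limit` key, and the helper names are assumptions made for the example; the `deploy` subcommand and `--override` flag are omitted for brevity.

```python
import argparse

# Stand-in for a parsed YAML/JSON file; the values are illustrative.
CONFIG = {
    "environments": {
        "dev": {"replicas": 1, "cpu_limit": "250m"},
        "staging": {"replicas": 2, "cpu_limit": "500m"},
        "prod": {"replicas": 5, "cpu_limit": "1"},
    }
}


def parse_args(argv):
    """Parse --env and an optional --replicas override."""
    parser = argparse.ArgumentParser(prog="deploy.py")
    parser.add_argument("--env", required=True, choices=sorted(CONFIG["environments"]))
    parser.add_argument("--replicas", type=int, help="optional override")
    return parser.parse_args(argv)


def resolve_settings(config, env, replicas=None):
    """Return the settings for env, applying a validated replica override if given."""
    settings = dict(config["environments"][env])  # copy so CONFIG stays untouched
    if replicas is not None:
        if replicas < 1:
            raise ValueError("replicas must be at least 1")
        settings["replicas"] = replicas
    return settings


args = parse_args(["--env", "staging", "--replicas", "3"])
staging = resolve_settings(CONFIG, args.env, args.replicas)
prod = resolve_settings(CONFIG, "prod")
print(staging)  # {'replicas': 3, 'cpu_limit': '500m'}
```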

Your script logs 10GB of data per day, but you only care about errors and occasional debug info. How do you balance visibility and storage?

Set the log level to WARNING (skip INFO/DEBUG). For deep debugging, temporarily lower the level to DEBUG, then restore it. Use structured (JSON) logging and send it to log aggregation (ELK, Datadog): these tools index and filter server-side, and retention policies there keep storage costs under control. Locally, rotate logs (logrotate, or Python's RotatingFileHandler) and compress old files. Only log important state changes, not every operation.
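The level and rotation settings might be wired up as in the sketch below. The logger name, size limit, and backup count are arbitrary assumptions, and the demo writes to a scratch directory.

```python
import logging
import logging.handlers
import os
import tempfile


def build_logger(log_path, level=logging.WARNING):
    """Create a logger that writes to a size-rotated file at the given level."""
    logger = logging.getLogger("deploy-demo")
    logger.setLevel(level)
    logger.propagate = False  # keep demo output out of the root logger
    handler = logging.handlers.RotatingFileHandler(
        log_path, maxBytes=1_000_000, backupCount=3, encoding="utf-8"
    )
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger, handler


with tempfile.TemporaryDirectory() as log_dir:
    log_path = os.path.join(log_dir, "deploy.log")
    logger, handler = build_logger(log_path)

    logger.debug("verbose detail, dropped at WARNING")
    logger.warning("disk usage above threshold")

    logger.setLevel(logging.DEBUG)  # lower the threshold for a debugging session
    logger.debug("replica count before rollout: 3")
    logger.setLevel(logging.WARNING)  # restore the normal level

    # Close the handler before reading so the file is flushed and released.
    handler.close()
    logger.removeHandler(handler)
    with open(log_path, encoding="utf-8") as f:
        log_lines = f.read().splitlines()
```

Only the warning and the debug line written while the level was lowered reach the file.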

📊 Self-Assessment Checklist

Rate your confidence (1-5) on each topic:

Weak areas (< 3): review the corresponding lesson. Strong areas (5): mentor others!

💡 Interview Tips