Panels and Queries
Choose the correct panel types and write PromQL for CPU, memory, and request monitoring.
Simple Explanation (ELI5)
Panels are the widgets on a dashboard; queries decide what data each widget shows.
Technical Explanation
Use time series for trends, stat for current values, table for breakdowns, and heatmap for latency distributions. Query quality matters more than panel cosmetics.
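As a rough sketch of the heatmap case, latency distributions are usually built from Prometheus histogram buckets. The metric name `http_request_duration_seconds_bucket` is an assumption for illustration and does not appear elsewhere in these notes; substitute whatever histogram your service exposes.

```promql
# Per-bucket request rate, grouped by the bucket boundary label "le".
# Assumed metric name; replace with your own histogram.
sum by (le) (rate(http_request_duration_seconds_bucket[5m]))

# 95th percentile latency estimated from the same buckets,
# better suited to a time series or stat panel than a heatmap.
histogram_quantile(0.95, sum by (le) (rate(http_request_duration_seconds_bucket[5m])))
```

The first query keeps the full distribution for a heatmap, while the second collapses it to a single trend line.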
Visual Section
Example panels shown: time series, stat panel, and table/bar gauge.
Hands-on Commands
# CPU usage by pod
sum by (pod) (rate(container_cpu_usage_seconds_total{container!=""}[5m]))
# Memory by pod
sum by (pod) (container_memory_working_set_bytes{container!=""})
# Requests per second
sum(rate(http_requests_total[5m]))
# 5xx error rate
sum(rate(http_requests_total{status=~"5.."}[5m]))
Debugging Scenarios
- No data in panel: wrong metric name or label filter.
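- Sudden drops or negative spikes on a counter graph: the panel plots the raw counter, which resets to zero on restart, instead of wrapping it in rate().
- Duplicate lines: the query returns one series per label combination; aggregate away the labels you do not need.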
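The last two scenarios can be illustrated with the request counter used earlier. This is a minimal sketch; the `status` label comes from the 5xx query in these notes, and any other labels on the series depend on your scrape configuration.

```promql
# Raw counter: climbs steadily and drops to zero when the process restarts.
http_requests_total

# Per-second rate: rate() compensates for counter resets,
# but still returns one line per label combination.
rate(http_requests_total[5m])

# Aggregated: one line per status code instead of one per series.
sum by (status) (rate(http_requests_total[5m]))
```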
Real-world Use Case
A service dashboard with CPU, memory, request rate, and 5xx panels quickly isolated a noisy canary release.
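For a 5xx panel like the one in this dashboard, a ratio is often easier to compare across releases than an absolute error count, since it does not move with traffic volume. A minimal sketch, reusing the `http_requests_total` metric and `status` label from the commands above:

```promql
# Fraction of all requests that returned a 5xx status over the last 5 minutes.
sum(rate(http_requests_total{status=~"5.."}[5m]))
/
sum(rate(http_requests_total[5m]))
```

The result is a value between 0 and 1, so setting the panel unit to a percent format keeps it from being misread. If no 5xx series exist at all, the query returns no data rather than zero.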
Interview Questions
Beginner
Time series panel.
To convert cumulative counts into per-second rates.
Stat panel.
Table/bar chart with sort and topk.
Wrong query, datasource, labels, or time window.
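The table answer above, which mentions sort and topk, might look like the following sketch built on the CPU query from the Hands-on Commands section. The limit of 5 is an arbitrary choice for illustration.

```promql
# Five pods with the highest CPU usage, largest first.
topk(5, sum by (pod) (rate(container_cpu_usage_seconds_total{container!=""}[5m])))
```

In a table panel this works best as an instant query. Over a time range, topk is evaluated at every step, so more than five pods can appear across the whole graph.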
Intermediate
Aggregate early and avoid unbounded label dimensions.
For bucketed distributions like request latency.
Traffic, error rate, and latency together, for the RED method.
Prevents misreading bytes as MB, seconds as ms, etc.
Use an environment variable and label-based filtering.
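Variable-driven label filtering can be sketched as below. `$namespace` is a hypothetical Grafana dashboard variable, not something defined in these notes, and the `namespace` label is assumed to be present on the container metrics.

```promql
# CPU usage by pod, limited to the namespace(s) selected in the dashboard.
# $namespace is a hypothetical dashboard variable; the regex matcher (=~)
# lets the same query work with multi-value selections.
sum by (pod) (
  rate(container_cpu_usage_seconds_total{namespace=~"$namespace", container!=""}[5m])
)
```

Filtering inside the selector, before aggregation, keeps the query cheaper than aggregating everything and hiding series in the panel.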
Scenario-based
Likely the wrong metric source, or a label mismatch that excludes pods.
Check the status label mapping and whether errors are recorded in the metric at all.
Not always; correlate with restarts and sustained growth.
Expensive query across many label combinations.
Inspect variable value and panel query namespace label filter.
Summary
Good Grafana outcomes depend on matching panel type to metric behavior and writing efficient PromQL.