Alerts and Notifications
Configure Grafana alert rules, contact points, and notification policies for actionable incident response.
Simple Explanation (ELI5)
Alerts watch metrics and notify your team when values cross dangerous limits.
Technical Explanation
Grafana unified alerting evaluates rules and routes notifications using contact points and policies. Good alerts use stable queries, meaningful thresholds, and suppression logic to avoid noise.
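The routing step can be pictured as label matching. The following is a minimal Python sketch of that idea, not Grafana's implementation: real notification policies form a nested tree and support more matcher operators. The `Policy` class, the `route` function, and the contact point names are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Policy:
    """Hypothetical, flattened stand-in for a notification policy."""
    contact_point: str
    # label name -> value the alert must carry for this policy to match
    matchers: dict = field(default_factory=dict)


def route(labels, policies, default):
    """Return the contact point of the first policy whose matchers all match."""
    for policy in policies:
        if all(labels.get(k) == v for k, v in policy.matchers.items()):
            return policy.contact_point
    # Nothing matched: fall back to the default contact point.
    return default


policies = [
    Policy("pagerduty-oncall", {"severity": "critical", "env": "prod"}),
    Policy("slack-warnings", {"severity": "warning"}),
]

prod_critical = route(
    {"severity": "critical", "env": "prod", "pod": "api-1"}, policies, "email-default"
)
dev_critical = route({"severity": "critical", "env": "dev"}, policies, "email-default")
print(prod_critical)  # pagerduty-oncall
print(dev_critical)   # email-default
```

The second alert falls through to the default because no policy matches both of its labels, which is the same failure mode as a notification that silently goes to the wrong place.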
Hands-on Commands
# CPU alert query: pods using more than 0.8 CPU cores, averaged over 5 minutes
sum(rate(container_cpu_usage_seconds_total{container!=""}[5m])) by (pod) > 0.8
# Memory alert query: pods whose working set exceeds 1.5e+09 bytes (about 1.5 GB)
sum(container_memory_working_set_bytes{container!=""}) by (pod) > 1.5e+09
Debugging Scenarios
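- Alert never fires: the threshold is too high or the query returns no data.
- Alert flaps: add a pending period and smooth the query, for example with a longer rate window.
- No notification delivered: the contact point credentials are invalid or no notification policy matches the alert's labels.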
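To see why a pending period reduces flapping, here is a small Python simulation of the Normal, Pending, and Firing states. It is a sketch under stated assumptions: the pending period is counted in evaluation cycles rather than as a duration, the exact boundary at which Grafana switches to Firing may differ, and `alert_states` is a hypothetical helper.

```python
def alert_states(values, threshold, pending_evals):
    """Return 'Normal', 'Pending', or 'Firing' for each evaluated value.

    An alert fires only after the threshold has been breached for more than
    `pending_evals` consecutive evaluations (assumed semantics, see lead-in).
    """
    states = []
    breaches = 0  # consecutive evaluations above the threshold
    for value in values:
        if value > threshold:
            breaches += 1
            states.append("Firing" if breaches > pending_evals else "Pending")
        else:
            breaches = 0  # any recovery resets the pending counter
            states.append("Normal")
    return states


cpu = [0.5, 0.9, 0.5, 0.9, 0.9, 0.9, 0.9, 0.5]

no_pending = alert_states(cpu, threshold=0.8, pending_evals=0)
with_pending = alert_states(cpu, threshold=0.8, pending_evals=2)
print(no_pending)
print(with_pending)
```

Without a pending period the single spike in the second sample fires immediately; with two pending evaluations only the sustained breach reaches Firing.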
Real-world Use Case
A team uses Grafana alerts to detect CPU saturation and memory pressure for Kubernetes workloads and routes critical alerts to PagerDuty.
Interview Questions
Beginner
A condition evaluated periodically to determine if an alert should fire.
A configured notification destination like Slack or email.
To avoid alerting on brief spikes.
Routing logic defining which alerts go to which contact points.
Yes, via datasource queries.
Intermediate
Use better thresholds, grouping, and pending windows.
Warning is early signal, critical implies immediate impact.
Use synthetic load and verify trigger and notification path.
Speeds resolution by giving responders clear next steps.
Use labels and policy matchers for prod/stage/dev.
Scenario-based
Add pending duration and deploy-window suppression.
Contact point secret, webhook URL, and policy matching labels.
Aggregate by workload and require sustained breach.
Thresholds not workload-aware; use job-specific rules.
Alert on error rate percentage over window with service label.
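The last answer, alerting on an error rate percentage over a window per service, comes down to simple arithmetic. The Python sketch below works through it with made-up sample counts; `error_rate_percent`, the service names, and the 5% threshold are illustrative assumptions, and in practice the data source query would do this calculation.

```python
def error_rate_percent(samples, window):
    """Error percentage over the last `window` intervals.

    `samples` is a list of (total_requests, error_requests) tuples, newest last.
    Returns None when there was no traffic, so "no data" is not mistaken for 0%.
    """
    recent = samples[-window:]
    total = sum(t for t, _ in recent)
    errors = sum(e for _, e in recent)
    if total == 0:
        return None
    return 100.0 * errors / total


# Hypothetical per-interval request counts, keyed by a service label.
by_service = {
    "checkout": [(1000, 5), (1200, 6), (900, 90), (1100, 120)],
    "search": [(500, 1), (450, 0), (480, 2), (520, 1)],
}

THRESHOLD_PERCENT = 5.0  # assumed threshold, for illustration only

breaching = []
for service, samples in by_service.items():
    rate = error_rate_percent(samples, window=2)
    if rate is not None and rate > THRESHOLD_PERCENT:
        breaching.append(service)

print(breaching)
```

Using a ratio of sums over the window, rather than averaging per-interval percentages, keeps low-traffic intervals from dominating the result.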
Summary
Effective Grafana alerting combines clean queries, sane thresholds, and reliable routing so teams respond only to real issues.