Data Collection (Scraping)

Learn about scrape jobs, exporters, target discovery, scrape intervals, and relabeling, and see how Prometheus actually collects metrics in production.

Simple Explanation (ELI5)

Scraping means Prometheus regularly visits each target and reads its metric page. If the target responds, Prometheus saves the numbers. If it does not, Prometheus marks the target as down.

Real-world Analogy

A janitor checks every room in a building on a fixed schedule. If a room is locked, the janitor notes that. Scraping works the same way: Prometheus follows a route, checks each endpoint, and records whether it could collect data.

Technical Explanation

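Prometheus uses scrape_configs to define jobs. Each job can list static targets, use dynamic service discovery, or both. Exporters expose metrics for systems like Linux, Redis, MySQL, or black-box HTTP probing. Target relabeling (relabel_configs) rewrites or filters targets and their labels before the scrape; metric relabeling (metric_relabel_configs) rewrites or drops samples after the scrape and before storage.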

| Concept | Purpose | Example |
| --- | --- | --- |
| scrape_interval | How often to collect | 15s for apps, 60s for low-change systems |
| scrape_timeout | How long to wait | 10s timeout on slow targets |
| job_name | Groups targets logically | node-exporter, kubernetes-pods |
| relabel_configs | Rewrite or keep/drop labels | Keep only annotated pods |
| metric_relabel_configs | Filter metrics after scrape | Drop high-cardinality labels |
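
To make the table concrete, here is a minimal sketch of one job that uses every field listed above. The job name, target address, `env` label, and the dropped metric and label names are all hypothetical; adjust them to your own exporter.

```yaml
scrape_configs:
  - job_name: "redis-exporter"          # hypothetical job name
    scrape_interval: 60s                # low-change system, so scrape less often
    scrape_timeout: 10s                 # give up on a scrape after 10 seconds
    static_configs:
      - targets: ["redis-exporter:9121"]   # assumed exporter address
    relabel_configs:
      # Runs before the scrape: attach a static label to every target in this job.
      - target_label: env
        replacement: prod
    metric_relabel_configs:
      # Runs after the scrape, before storage: drop series by metric name.
      - source_labels: [__name__]
        regex: "go_gc_.*"
        action: drop
      # Remove one label (assumed to be high-cardinality) from every sample.
      - action: labeldrop
        regex: "request_id"
```

Be careful with `labeldrop`: if the remaining labels no longer identify a series uniquely, the scrape will contain duplicate series and samples may be rejected.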

Visual Representation

Service Discovery → Scrape Configs → Intervals / Paths / Labels → Targets Up or Down

Commands / Syntax

yaml
scrape_configs:
  # Static targets: one node exporter per host.
  - job_name: "node-exporter"
    scrape_interval: 15s
    static_configs:
      - targets: ["node1:9100", "node2:9100"]

  # Dynamic targets: discover pods through the Kubernetes API.
  - job_name: "kubernetes-pods"
    kubernetes_sd_configs:
      - role: pod
    relabel_configs:
      # Keep only pods annotated with prometheus.io/scrape: "true".
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
        action: keep
        regex: "true"

bash
# Check discovered targets and their health
curl -s http://localhost:9090/api/v1/targets
# Check the metric metadata reported by targets
curl -s http://localhost:9090/api/v1/targets/metadata

# Inspect a metrics endpoint directly
curl -s http://node1:9100/metrics | head

# Kubernetes service and pod checks
kubectl get pods -A -o wide
kubectl get svc -A
kubectl describe pod my-app-123 -n prod
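
The kubernetes-pods job above only decides which pods to keep. A common extension, sketched here, also lets each pod choose its own metrics path and port through annotations. The `prometheus.io/path` and `prometheus.io/port` annotation names are a widely used convention rather than something built into Prometheus, and the address regex assumes IPv4 pod addresses.

```yaml
scrape_configs:
  - job_name: "kubernetes-pods"
    kubernetes_sd_configs:
      - role: pod
    relabel_configs:
      # Keep only pods annotated with prometheus.io/scrape: "true".
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
        action: keep
        regex: "true"
      # If prometheus.io/path is set, scrape that path instead of /metrics.
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
        action: replace
        regex: '(.+)'
        target_label: __metrics_path__
      # If prometheus.io/port is set, rewrite the target address to use it.
      - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
        action: replace
        regex: '([^:]+)(?::\d+)?;(\d+)'
        replacement: '$1:$2'
        target_label: __address__
      # Copy discovery metadata into ordinary labels for use in queries.
      - source_labels: [__meta_kubernetes_namespace]
        target_label: namespace
      - source_labels: [__meta_kubernetes_pod_name]
        target_label: pod
```

Labels that start with `__` are dropped after relabeling, so only `namespace`, `pod`, and the other regular labels end up on the stored series.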

Example (Real-world Use Case)

A cluster uses node exporter on each node, kube-state-metrics for object state, cAdvisor-derived container metrics, and custom application metrics from services annotated for Prometheus scraping. Scrape intervals are shorter for API latency metrics and longer for less dynamic batch systems.
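
As an illustration of "annotated for Prometheus scraping", a pod manifest might look like the sketch below. The pod name, namespace, image, and port are hypothetical, and the annotations only take effect if the Prometheus scrape job has relabeling rules that read them.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: my-app                      # hypothetical pod name
  namespace: prod
  annotations:
    prometheus.io/scrape: "true"    # opt in to annotation-based scraping
    prometheus.io/path: "/metrics"  # path that serves the metrics
    prometheus.io/port: "8080"      # port that serves the metrics
spec:
  containers:
    - name: my-app
      image: example.com/my-app:1.0   # hypothetical image
      ports:
        - name: metrics
          containerPort: 8080
```

Annotation values must be strings, which is why `"true"` and `"8080"` are quoted.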

Hands-on Section

  1. Add a scrape job for a test app exposing /metrics.
  2. Verify the target appears under Status → Targets.
  3. Break the port intentionally and confirm the target turns DOWN.
  4. Restore the port and observe recovery.
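
A minimal job for the steps above could look like this sketch. It assumes the test app listens on localhost:8080; both the job name and the address are placeholders.

```yaml
scrape_configs:
  - job_name: "test-app"          # hypothetical job name
    scrape_interval: 15s
    metrics_path: /metrics        # the default, shown here for clarity
    static_configs:
      - targets: ["localhost:8080"]   # assumed address of the test app
```

For step 3, change the port in `targets` to one where nothing is listening (for example `localhost:8081`) and reload Prometheus; the target should turn DOWN on the next scrape. Putting the original port back covers step 4.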

Debugging Scenarios

Metrics Not Collected

If metrics are missing, start by checking target discovery and endpoint reachability. When a scrape is failing, the PromQL query is rarely the first thing to blame.

Interview Questions

Beginner

What is scraping in Prometheus?

Scraping is when Prometheus periodically requests metrics from a target endpoint over HTTP.

What is a scrape job?

A scrape job is a logical configuration block that defines how Prometheus collects metrics from one set of targets.

What is an exporter?

An exporter is a process that exposes metrics in Prometheus format for a system that does not do so natively.

Where do you see if a target is up or down?

In the Prometheus UI under Status → Targets or via the up metric.

Why does Prometheus need target discovery?

Because in dynamic environments like Kubernetes, targets constantly change and cannot be managed with static IP lists alone.

Intermediate

What is the difference between relabeling and metric relabeling?

Relabeling changes target metadata before scraping. Metric relabeling changes or drops metrics after scrape but before storage.

How do scrape interval and scrape timeout interact?

The timeout cannot exceed the interval; Prometheus rejects a configuration where scrape_timeout is greater than scrape_interval. If the timeout is too close to the interval, slow targets can cause scrape instability.
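
A sketch of how the two settings are usually combined: global defaults, with per-job overrides that keep the timeout comfortably below the interval. The job names and targets are hypothetical.

```yaml
global:
  scrape_interval: 30s    # default for jobs that do not set their own
  scrape_timeout: 10s     # must not be greater than the interval

scrape_configs:
  - job_name: "api"                 # latency-sensitive, scraped often
    scrape_interval: 15s
    scrape_timeout: 5s              # leaves headroom before the next scrape
    static_configs:
      - targets: ["api1:8080"]      # assumed address

  - job_name: "batch"               # changes slowly, scraped rarely
    scrape_interval: 60s
    scrape_timeout: 30s             # slow exporter gets more time to respond
    static_configs:
      - targets: ["batch1:9102"]    # assumed address
```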

Why might you not scrape every target every 5 seconds?

Short intervals increase ingestion cost, network load, and storage. Use higher frequency only where fast detection matters.

How is scraping done in Kubernetes without hardcoding pod IPs?

Through Kubernetes service discovery or Prometheus Operator resources like ServiceMonitor and PodMonitor.
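
For the Operator route, a ServiceMonitor might look like the following sketch. It assumes the Prometheus Operator CRDs are installed; the names, namespaces, and labels are hypothetical, and the `release: prometheus` label only matters if your Prometheus resource selects ServiceMonitors by that label.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: my-app
  namespace: monitoring
  labels:
    release: prometheus     # assumed to match the Prometheus serviceMonitorSelector
spec:
  namespaceSelector:
    matchNames: ["prod"]    # namespaces in which to look for Services
  selector:
    matchLabels:
      app: my-app           # labels on the Service, not on the pods
  endpoints:
    - port: metrics         # name of the Service port, not a number
      path: /metrics
      interval: 30s
```

The Operator translates this resource into scrape configuration, so no pod IPs appear anywhere in the manifest.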

Why would you use blackbox exporter?

To probe endpoints externally for reachability, latency, DNS resolution, or HTTP success instead of relying only on internal app metrics.
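
A typical blackbox exporter job is sketched below. It assumes the exporter is reachable at blackbox-exporter:9115 and has an `http_2xx` module defined in its own configuration; the probed URL is a placeholder.

```yaml
scrape_configs:
  - job_name: "blackbox-http"
    metrics_path: /probe            # blackbox exporter's probe endpoint
    params:
      module: [http_2xx]            # assumed module name in the exporter config
    static_configs:
      - targets:
          - https://example.com     # endpoint to probe (placeholder)
    relabel_configs:
      # Pass the listed URL to the exporter as the ?target= parameter.
      - source_labels: [__address__]
        target_label: __param_target
      # Keep the probed URL as the instance label on the resulting series.
      - source_labels: [__param_target]
        target_label: instance
      # Send the actual scrape request to the exporter itself.
      - target_label: __address__
        replacement: blackbox-exporter:9115
```

The relabeling is what makes this work: Prometheus scrapes the exporter, and the exporter probes the URL on its behalf.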

Scenario-based

A service exposes metrics locally, but Prometheus still cannot scrape it. What do you inspect?

I inspect network path, service or pod label matching, scrape path, port name, ServiceMonitor selectors, and namespace scoping.

Node exporter metrics disappeared after a node replacement. Why?

The new node may not have node exporter running, or discovery labels changed. In Kubernetes, daemonset health and node scheduling are common causes.

Prometheus scrapes succeed, but some metrics are missing. What is a likely cause?

Metric relabel configs may be dropping them, or the exporter may disable some collectors by default.

Your Kubernetes pods are being scraped twice. What causes that?

Multiple overlapping scrape configs or both annotation-based scraping and ServiceMonitor-based scraping targeting the same endpoint.

A security team blocks pod-to-pod traffic. What monitoring impact do you expect?

Prometheus may fail to scrape app pods depending on topology and network policies, so targets go down even if apps themselves are healthy.
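
If the restriction is implemented with Kubernetes NetworkPolicy, one way to restore scraping is an explicit allow rule like the sketch below. It assumes Prometheus runs in a namespace called monitoring, that the app pods carry the label `app: my-app`, and that metrics are served on port 8080; all of these are placeholders.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-prometheus-scrape
  namespace: prod
spec:
  podSelector:
    matchLabels:
      app: my-app                   # pods that expose metrics
  policyTypes: ["Ingress"]
  ingress:
    - from:
        # Allow traffic from every pod in the monitoring namespace.
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: monitoring
      ports:
        - protocol: TCP
          port: 8080                # assumed metrics port
```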

Summary

Scraping is where Prometheus earns or loses trust. Good scrape configs, correct discovery, and sane relabeling produce reliable monitoring. Bad target selection or path configuration produces silence, which is worse than noise.